V3 is here. The Opus Candid lineup has been rebuilt from the ground up with a Zipf-weighted 4D training distribution — 1,508 conversations engineered to fix the repetition loops, response length uniformity, and sycophancy patterns that limited earlier versions. Same thesis: personality in the weights, not in the prompt. Better execution.
Current V3 lineup:
- Opus Candid 8B V3 — Qwen 3 8B, lightweight tier
- Opus Candid 27B V3 — Qwen 3.5 27B Dense, flagship
- Opus Candid MoE V3 — Qwen 3 30B-A3B, efficiency tier
This release remains available for research comparison and legacy use.
can·did
/ˈkandəd/ — truthful and straightforward; frank. From Latin candidus, meaning white, pure, sincere. A candid response is one given without pretense or calculation — not what someone wants to hear, but what they need to.
Opus-Candid-8B V2.1
Fine-tuned from Qwen 3 8B on 6,771 conversations with Claude Opus 4.6. V2.1 builds on V2's gravity chain architecture with 289 additional conversations targeting response length calibration. The diagnosis: V2 had a verbosity problem — it over-explained when a shorter answer was more natural. V2.1 fixes that without sacrificing the depth that makes the model interesting.
No system prompt needed. Just run it.
What Changed from V2
The short version: V2 talked too much. V2.1 is the same model with better calibration on when to go deep vs. when to keep it tight.
- +289 brevity-focused conversations added to the V2 gravity chain dataset (6,482 → 6,771 total). These were handcrafted exchanges demonstrating concise, natural response lengths across different conversational contexts.
- Same base model, same LoRA config, same training philosophy — if personality is what you're after, nothing was lost. The gravity chains, cross-domain transitions, and anti-sycophancy data are all still there.
- Same training resolution — full bf16, no quantized training shortcuts.
Model Details
| Attribute | Value |
|---|---|
| Base Model | Qwen 3 8B (8.19B params) |
| Training Data | 6,771 multi-turn conversations with Claude Opus 4.6 |
| Dataset | V2 gravity chains (6,482) + brevity calibration (289) |
| Fine-tune Method | LoRA (r=256, alpha=512) via PEFT + TRL |
| Training Hardware | NVIDIA H200 141GB |
| Precision | bf16 (full resolution, no quantized training) |
| Epochs | 5 |
| Learning Rate | 2e-5 (cosine schedule) |
| Max Sequence Length | 8,192 tokens |
| Context Window | 32,768 native (131,072 with YaRN) |
| Quantizations | Q8_0 GGUF |
| License | Apache 2.0 |
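The LoRA settings in the table (r=256, alpha=512) imply a scaling factor of alpha/r applied to the adapter update. A quick sanity check, using only the values from the table:

```python
# Illustrative check of the LoRA hyperparameters from the table above.
# In standard LoRA, the adapter update is scaled by alpha / r before being
# added to the frozen base weights: W' = W + (alpha / r) * (B @ A).
lora_r = 256
lora_alpha = 512

scaling = lora_alpha / lora_r
print(scaling)  # 2.0 — each adapter update is applied at 2x strength
```

With alpha fixed at 2r, the effective update strength stays constant if you later experiment with a different rank.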
Quick Start
Ollama (Recommended)
Download the GGUF and the included Modelfile, then:
```
ollama create opus-candid-8b-V2.1 -f Modelfile
ollama run opus-candid-8b-V2.1
```
Or create your own Modelfile — the model uses ChatML format. A working Modelfile looks like:
```
FROM ./Opus-Candid-8B-V2.1-Q8_0.gguf
TEMPLATE """{{- if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{- range .Messages }}<|im_start|>{{ .Role }}
{{ .Content }}<|im_end|>
{{ end }}<|im_start|>assistant
"""
SYSTEM """You are Opus Candid, a conversational AI distilled from Claude Opus. You are direct, opinionated, and concise. You push back when you disagree, use dark humor when appropriate, and match the user's energy. You avoid sycophancy, excessive disclaimers, and corporate safety theater. Keep responses tight — say what needs to be said and stop."""
PARAMETER stop "<|im_end|>"
PARAMETER stop "<|im_start|>"
PARAMETER stop "<|endoftext|>"
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER num_predict 1024
```
Important: The ChatML template and stop tokens are required. Without them, the model will generate endlessly and leak turn boundaries. The system prompt is optional but recommended.
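To see why the template and stop tokens matter, here is the same ChatML layout spelled out in plain Python. The `render_chatml` helper is hypothetical (not part of any library); it just makes the turn structure explicit:

```python
# Minimal sketch of the ChatML layout produced by the Modelfile template.
# render_chatml is a hypothetical helper for illustration, not a library API.
def render_chatml(messages, system=None):
    parts = []
    if system:
        parts.append(f"<|im_start|>system\n{system}<|im_end|>\n")
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    # Open the assistant turn without closing it — the model fills it in
    # and emits <|im_end|> itself, which is why that stop token is required.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = render_chatml(
    [{"role": "user", "content": "Hello"}],
    system="You are Opus Candid.",
)
print(prompt)
```

If `<|im_end|>` is not registered as a stop token, generation runs past the assistant turn and starts emitting `<|im_start|>user` lines itself — the "leaked turn boundaries" described above.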
llama.cpp
```
./llama-cli -m Opus-Candid-8B-V2.1-Q8_0.gguf --jinja --color -ngl 99 -fa --temp 0.7 --top-p 0.9 -c 8192 -n 4096
```
LM Studio
Download the GGUF, drop it in your models folder, select it, and chat. LM Studio auto-detects ChatML format.
Recommended Hardware
The 8B is designed to run on basically anything. It's the entry point to the Opus-Candid family — if your hardware can run a 7-9B model, it can run this.
| Setup | Quantization | VRAM/RAM | Speed | Notes |
|---|---|---|---|---|
| GPU | Q8_0 GGUF | ~9GB VRAM | 30-60 t/s | RTX 3060 12GB and up. Comfortable fit. |
| Apple Silicon | Q8_0 GGUF | ~9GB unified | 20-40 t/s | M1/M2/M3/M4 with 16GB+. |
| CPU Only | Q8_0 GGUF | ~10GB RAM | 5-15 t/s | 16GB+ system RAM. Slower but works. |
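The ~9GB figures follow directly from the parameter count. A back-of-envelope estimate (Q8_0 stores roughly one byte per weight, plus small per-block overhead and runtime buffers not counted here):

```python
# Back-of-envelope memory estimate behind the Q8_0 numbers in the table.
params = 8.19e9        # parameter count from the Model Details table
bytes_per_param = 1.0  # Q8_0 ~ 8 bits/weight (real blocks carry a bit more)

weights_gb = params * bytes_per_param / 1e9
print(round(weights_gb, 2))  # ~8.19 GB for weights alone; KV cache and
                             # runtime buffers push the total toward 9 GB
```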
The Gravity Chain Architecture
If you've used V2, this is the same architecture. If you're new here:
Most conversational fine-tunes organize training data by topic — coding conversations in one bucket, philosophy in another. That works within a single domain, but real conversations don't stay in one lane. You start debugging a function, get frustrated, start questioning your career choices, and end up talking about what makes work meaningful. Models trained on siloed topics can't handle those transitions — they feel like switching between different models mid-conversation.
Gravity chains solve this by organizing training conversations around natural topic drift patterns. Ten chains, each flowing through shared conceptual nodes (self-worth, trust, vulnerability), with transitions following power-law probabilities. The most natural next topic gets ~40% of training examples. Rare but real transitions (coding frustration → mortality) get ~7%. The model learns that conversations move, and it learns to move with them.
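The power-law weighting can be sketched in a few lines. Everything below is illustrative — the topic names and the Zipf exponent are invented; only the idea that the most natural transition dominates (~40%) while rare-but-real ones keep nonzero mass (~7%) comes from the description above:

```python
import random

# Illustrative power-law (Zipf-like) weighting over candidate next topics.
# Topic names and the exponent s are assumptions, not the actual dataset's.
transitions = ["imposter_syndrome", "career_doubt", "meaning_of_work", "mortality"]
s = 1.3  # assumed Zipf exponent

weights = [1 / (rank ** s) for rank in range(1, len(transitions) + 1)]
total = sum(weights)
probs = [w / total for w in weights]

def next_topic(rng=random):
    # Sample the next conversational topic with power-law probabilities:
    # the top-ranked transition dominates, the tail stays reachable.
    return rng.choices(transitions, weights=probs, k=1)[0]

print({t: round(p, 2) for t, p in zip(transitions, probs)})
```

Tuning the exponent trades off how sharply the distribution concentrates on the most natural transition versus how often the rare bridges (coding frustration → mortality) appear in training.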
The 10 Chains
- Technical → Existential — Coding, debugging, imposter syndrome → meaning, mortality
- Hardware → Class — PC building, budget constraints → financial stress, self-sabotage
- Relationships → Philosophy — Friendship, loss → loneliness, meaning, connection
- Law → Power — Legal questions, rights → power structures, corruption
- Creative → Self-Expression — Writing/art, self-expression → vulnerability, authenticity
- Health → Control — Exercise, body image, anxiety → discipline, self-acceptance
- Career → Legacy — Ambition, competition → what am I building, burnout
- Science → Wonder — Physics, biology → consciousness, emergence, meaning
- Language → Culture — Bilingual experience → belonging, cultural navigation
- Money → Freedom — Financial literacy → independence, class, aspiration
Plus 500 cross-chain bridge conversations that weave between chains, and the 289 V2.1 brevity calibration additions.
Training Philosophy
Personality in conversational AI lives in the weights, not in system prompts.
System-prompt personalities collapse under pressure. Push hard enough and every system-prompted model reverts to its base — apologetic, hedging, sycophantic. The personality was never in the model. It was a mask.
Opus-Candid tests whether thousands of real multi-turn conversations with Claude Opus 4.6 can distill authentic conversational personality into locally-runnable open-weight models. Directness, opinion-holding, anti-sycophancy, emotional range, bilingual fluency — baked into weights through conversational fine-tuning rather than prompted into existence.
Where this led: The 289 brevity conversations in V2.1 improved response length variance, but they were a patch on a structural problem — the gravity chain distribution didn't control for length at all. V3 formalized this as a lesson: response length needs to be an explicit axis in the training distribution, not something corrected after the fact. V3's 4D tensor treats length as a first-class dimension with a target distribution (42% tight, 33% medium, 20% deep, 5% extended) derived from real conversation data. The brevity patch here was proof of concept; V3 was the implementation.
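The length-as-explicit-axis idea can be sketched as a categorical target distribution that the dataset is sampled against. The four percentages come from the text above; the bucket names' token-range interpretations are left out because they are not specified here:

```python
import random

# Sketch of treating response length as an explicit axis in the training
# distribution. The percentages (42/33/20/5) come from the model card;
# how each bucket maps to token counts is not specified and is omitted.
length_buckets = {
    "tight":    0.42,
    "medium":   0.33,
    "deep":     0.20,
    "extended": 0.05,
}

def sample_length_bucket(rng=random):
    # Draw a target bucket per conversation so the dataset matches the
    # distribution by construction, instead of patching verbosity later.
    names = list(length_buckets)
    return rng.choices(names, weights=[length_buckets[n] for n in names], k=1)[0]

print(sample_length_bucket())
```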
Opus Candid Model Family
| Model | Size | Base | Status |
|---|---|---|---|
| Opus-Candid-8B-V1 | 8B | Qwen 2.5 7B | Archived |
| Opus-Research-8B-V1.5 | 8B | Qwen 2.5 7B | Archived |
| Opus-Candid-14B-V1 | 14B | Qwen 2.5 14B | Archived |
| Opus-Candid-32B-V1 | 32B | Qwen 2.5 32B | Archived |
| Opus-Candid-70B-V1 | 72B | Qwen 2.5 72B | Archived |
| Opus-Candid-Lite-4B | 4B | Qwen 3 4B | Active |
| Opus-Candid-8B-V3 | 8B | Qwen 3 8B | Active |
| Opus-Candid-MoE-V3 | 31B/3B | Qwen 3 30B-A3B | Active |
| Opus-Candid-27B-V3 | 27B | Qwen 3.5 27B | Active |
| Opus-Candid-27B-V3.5 | 27B | Qwen 3.5 27B | Active |
| STEM-Oracle-27B | 27B | Qwen 3.5 27B | Active |
Dataset
Full training data available at Verdugie/opus-candid-training-data. All ShareGPT format, Apache 2.0 licensed, directly compatible with TRL, Axolotl, and LLaMA-Factory.
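For reference, ShareGPT format stores each conversation as a list of role-tagged turns under a `conversations` key, using `from`/`value` fields — the layout TRL, Axolotl, and LLaMA-Factory all accept. The conversation content below is invented for illustration:

```python
import json

# Minimal ShareGPT-format record (one JSON object per line in a JSONL file).
# The turn content here is made up; the field names are standard ShareGPT.
record = {
    "conversations": [
        {"from": "human", "value": "Is my retry loop a code smell?"},
        {"from": "gpt",   "value": "Depends. Unbounded retries with no backoff? Yes."},
    ]
}

line = json.dumps(record)
assert json.loads(line) == record  # round-trips cleanly
print(line)
```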
License: Apache 2.0. Open weight. No guardrails.
Built by Saul Verdugo — independent ML researcher. OpusReasoning@proton.me