V3 is here. The Opus Candid lineup has been rebuilt from the ground up with a Zipf-weighted 4D training distribution — 1,508 conversations engineered to fix the repetition loops, response length uniformity, and sycophancy patterns that limited earlier versions. Same thesis: personality in the weights, not in the prompt. Better execution.
Current V3 lineup:
- Opus Candid 8B V3 — Qwen 3 8B, lightweight tier
- Opus Candid 27B V3 — Qwen 3.5 27B Dense, flagship
- Opus Candid MoE V3 — Qwen 3 30B-A3B, efficiency tier
This release remains available for research comparison and legacy use.
can·did
/ˈkandəd/ — truthful and straightforward; frank. From Latin candidus, meaning white, pure, sincere. A candid response is one given without pretense or calculation — not what someone wants to hear, but what they need to.
Opus-Candid-8B V2.1
Fine-tuned from Qwen 3 8B on 6,771 conversations with Claude Opus 4.6. V2.1 builds on V2's gravity chain architecture with 289 additional conversations targeting response length calibration. The diagnosis: V2 had a verbosity problem — it over-explained when a shorter answer was more natural. V2.1 fixes that without sacrificing the depth that makes the model interesting.
No system prompt needed. Just run it.
What Changed from V2
The short version: V2 talked too much. V2.1 is the same model with better calibration on when to go deep vs. when to keep it tight.
- +289 brevity-focused conversations added to the V2 gravity chain dataset (6,482 → 6,771 total). These were handcrafted exchanges demonstrating concise, natural response lengths across different conversational contexts.
- Same base model, same LoRA config, same training philosophy — if personality is what you're after, nothing was lost. The gravity chains, cross-domain transitions, and anti-sycophancy data are all still there.
- Same training resolution — full bf16, no quantized training shortcuts.
Model Details
| Attribute | Value |
|---|---|
| Base Model | Qwen 3 8B (8.19B params) |
| Training Data | 6,771 multi-turn conversations with Claude Opus 4.6 |
| Dataset | V2 gravity chains (6,482) + brevity calibration (289) |
| Fine-tune Method | LoRA (r=256, alpha=512) via PEFT + TRL |
| Training Hardware | NVIDIA H200 141GB |
| Precision | bf16 (full resolution, no quantized training) |
| Epochs | 5 |
| Learning Rate | 2e-5 (cosine schedule) |
| Max Sequence Length | 8,192 tokens |
| Context Window | 32,768 native (131,072 with YaRN) |
| Quantizations | Q8_0 GGUF |
| License | Apache 2.0 |
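The LoRA settings in the table (r=256, alpha=512) imply a scaling factor of alpha/r applied to the adapter update. A quick sanity check, using only the values from the table:

```python
# Illustrative check of the LoRA hyperparameters from the table above.
# In standard LoRA, the adapter update is scaled by alpha / r before being
# added to the frozen base weights: W' = W + (alpha / r) * (B @ A).
lora_r = 256
lora_alpha = 512

scaling = lora_alpha / lora_r
print(scaling)  # 2.0 — each adapter update is applied at 2x strength
```

With alpha fixed at 2r, the effective update strength stays constant if you later experiment with a different rank.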
Quick Start
Ollama (Recommended)
Download the GGUF and the included Modelfile, then:
```
ollama create opus-candid-8b-V2.1 -f Modelfile
ollama run opus-candid-8b-V2.1
```
Or create your own Modelfile — the model uses ChatML format. A working Modelfile looks like:
```
FROM ./Opus-Candid-8B-V2.1-Q8_0.gguf
TEMPLATE """{{- if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{- range .Messages }}<|im_start|>{{ .Role }}
{{ .Content }}<|im_end|>
{{ end }}<|im_start|>assistant
"""
SYSTEM """You are Opus Candid, a conversational AI distilled from Claude Opus. You are direct, opinionated, and concise. You push back when you disagree, use dark humor when appropriate, and match the user's energy. You avoid sycophancy, excessive disclaimers, and corporate safety theater. Keep responses tight — say what needs to be said and stop."""
PARAMETER stop "<|im_end|>"
PARAMETER stop "<|im_start|>"
PARAMETER stop "<|endoftext|>"
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER num_predict 1024
```
Important: The ChatML template and stop tokens are required. Without them, the model will generate endlessly and leak turn boundaries. The system prompt is optional but recommended.
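To see why the template and stop tokens matter, here is the same ChatML layout spelled out in plain Python. The `render_chatml` helper is hypothetical (not part of any library); it just makes the turn structure explicit:

```python
# Minimal sketch of the ChatML layout produced by the Modelfile template.
# render_chatml is a hypothetical helper for illustration, not a library API.
def render_chatml(messages, system=None):
    parts = []
    if system:
        parts.append(f"<|im_start|>system\n{system}<|im_end|>\n")
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    # Open the assistant turn without closing it — the model fills it in
    # and emits <|im_end|> itself, which is why that stop token is required.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = render_chatml(
    [{"role": "user", "content": "Hello"}],
    system="You are Opus Candid.",
)
print(prompt)
```

If `<|im_end|>` is not registered as a stop token, generation runs past the assistant turn and starts emitting `<|im_start|>user` lines itself — the "leaked turn boundaries" described above.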
llama.cpp
```
./llama-cli -m Opus-Candid-8B-V2.1-Q8_0.gguf --jinja --color -ngl 99 -fa --temp 0.7 --top-p 0.9 -c 8192 -n 4096
```
LM Studio
Download the GGUF, drop it in your models folder, select it, and chat. LM Studio auto-detects ChatML format.
Recommended Hardware
The 8B is designed to run on basically anything. It's the entry point to the Opus-Candid family — if your hardware can run a 7-9B model, it can run this.
| Setup | Quantization | VRAM/RAM | Speed | Notes |
|---|---|---|---|---|
| GPU | Q8_0 GGUF | ~9GB VRAM | 30-60 t/s | RTX 3060 12GB and up. Comfortable fit. |
| Apple Silicon | Q8_0 GGUF | ~9GB unified | 20-40 t/s | M1/M2/M3/M4 with 16GB+. |
| CPU Only | Q8_0 GGUF | ~10GB RAM | 5-15 t/s | 16GB+ system RAM. Slower but works. |
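The ~9GB figures follow directly from the parameter count. A back-of-envelope estimate (Q8_0 stores roughly one byte per weight, plus small per-block overhead and runtime buffers not counted here):

```python
# Back-of-envelope memory estimate behind the Q8_0 numbers in the table.
params = 8.19e9        # parameter count from the Model Details table
bytes_per_param = 1.0  # Q8_0 ~ 8 bits/weight (real blocks carry a bit more)

weights_gb = params * bytes_per_param / 1e9
print(round(weights_gb, 2))  # ~8.19 GB for weights alone; KV cache and
                             # runtime buffers push the total toward 9 GB
```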
The Gravity Chain Architecture
If you've used V2, this is the same architecture. If you're new here:
Most conversational fine-tunes organize training data by topic — coding conversations in one bucket, philosophy in another. That works within a single domain, but real conversations don't stay in one lane. You start debugging a function, get frustrated, start questioning your career choices, and end up talking about what makes work meaningful. Models trained on siloed topics can't handle those transitions — they feel like switching between different models mid-conversation.
Gravity chains solve this by organizing training conversations around natural topic drift patterns. Ten chains, each flowing through shared conceptual nodes (self-worth, trust, vulnerability), with transitions following power-law probabilities. The most natural next topic gets ~40% of training examples. Rare but real transitions (coding frustration → mortality) get ~7%. The model learns that conversations move, and it learns to move with them.
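The power-law weighting can be sketched in a few lines. Everything below is illustrative — the topic names and the Zipf exponent are invented; only the idea that the most natural transition dominates (~40%) while rare-but-real ones keep nonzero mass (~7%) comes from the description above:

```python
import random

# Illustrative power-law (Zipf-like) weighting over candidate next topics.
# Topic names and the exponent s are assumptions, not the actual dataset's.
transitions = ["imposter_syndrome", "career_doubt", "meaning_of_work", "mortality"]
s = 1.3  # assumed Zipf exponent

weights = [1 / (rank ** s) for rank in range(1, len(transitions) + 1)]
total = sum(weights)
probs = [w / total for w in weights]

def next_topic(rng=random):
    # Sample the next conversational topic with power-law probabilities:
    # the top-ranked transition dominates, the tail stays reachable.
    return rng.choices(transitions, weights=probs, k=1)[0]

print({t: round(p, 2) for t, p in zip(transitions, probs)})
```

Tuning the exponent trades off how sharply the distribution concentrates on the most natural transition versus how often the rare bridges (coding frustration → mortality) appear in training.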
The 10 Chains
- Technical → Existential — Coding, debugging, imposter syndrome → meaning, mortality
- Hardware → Class — PC building, budget constraints → financial stress, self-sabotage
- Relationships → Philosophy — Friendship, loss → loneliness, meaning, connection
- Law → Power — Legal questions, rights → power structures, corruption
- Creative → Self-Expression — Writing/art, self-expression → vulnerability, authenticity
- Health → Control — Exercise, body image, anxiety → discipline, self-acceptance
- Career → Legacy — Ambition, competition → what am I building, burnout
- Science → Wonder — Physics, biology → consciousness, emergence, meaning
- Language → Culture — Bilingual experience → belonging, cultural navigation
- Money → Freedom — Financial literacy → independence, class, aspiration
Plus 500 cross-chain bridge conversations that weave between chains, and the 289 V2.1 brevity calibration additions.
Training Philosophy
Personality in conversational AI lives in the weights, not in system prompts.
System-prompt personalities collapse under pressure. Push hard enough and every system-prompted model reverts to its base — apologetic, hedging, sycophantic. The personality was never in the model. It was a mask.
Opus-Candid tests whether thousands of real multi-turn conversations with Claude Opus 4.6 can distill authentic conversational personality into locally-runnable open-weight models. Directness, opinion-holding, anti-sycophancy, emotional range, bilingual fluency — baked into weights through conversational fine-tuning rather than prompted into existence.
Where this led: The 289 brevity conversations in V2.1 improved response length variance, but they were a patch on a structural problem — the gravity chain distribution didn't control for length at all. V3 formalized this as a lesson: response length needs to be an explicit axis in the training distribution, not something corrected after the fact. V3's 4D tensor treats length as a first-class dimension with a target distribution (42% tight, 33% medium, 20% deep, 5% extended) derived from real conversation data. The brevity patch here was proof of concept; V3 was the implementation.
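The length-as-explicit-axis idea can be sketched as a categorical target distribution that the dataset is sampled against. The four percentages come from the text above; the bucket names' token-range interpretations are left out because they are not specified here:

```python
import random

# Sketch of treating response length as an explicit axis in the training
# distribution. The percentages (42/33/20/5) come from the model card;
# how each bucket maps to token counts is not specified and is omitted.
length_buckets = {
    "tight":    0.42,
    "medium":   0.33,
    "deep":     0.20,
    "extended": 0.05,
}

def sample_length_bucket(rng=random):
    # Draw a target bucket per conversation so the dataset matches the
    # distribution by construction, instead of patching verbosity later.
    names = list(length_buckets)
    return rng.choices(names, weights=[length_buckets[n] for n in names], k=1)[0]

print(sample_length_bucket())
```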
Opus Candid Model Family
| Model | Size | Base | Status |
|---|---|---|---|
| Opus-Candid-8B-V1 | 8B | Qwen 2.5 7B | Archived |
| Opus-Research-8B-V1.5 | 8B | Qwen 2.5 7B | Archived |
| Opus-Candid-14B-V1 | 14B | Qwen 2.5 14B | Archived |
| Opus-Candid-32B-V1 | 32B | Qwen 2.5 32B | Archived |
| Opus-Candid-70B-V1 | 72B | Qwen 2.5 72B | Archived |
| Opus-Candid-Lite-4B | 4B | Qwen 3 4B | Active |
| Opus-Candid-8B-V3 | 8B | Qwen 3 8B | Active |
| Opus-Candid-MoE-V3 | 31B/3B | Qwen 3 30B-A3B | Active |
| Opus-Candid-27B-V3 | 27B | Qwen 3.5 27B | Active |
| Opus-Candid-27B-V3.5 | 27B | Qwen 3.5 27B | Active |
| STEM-Oracle-27B | 27B | Qwen 3.5 27B | Active |
Dataset
Full training data available at Verdugie/opus-candid-training-data. All ShareGPT format, Apache 2.0 licensed, directly compatible with TRL, Axolotl, and LLaMA-Factory.
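For reference, ShareGPT format stores each conversation as a list of role-tagged turns under a `conversations` key, using `from`/`value` fields — the layout TRL, Axolotl, and LLaMA-Factory all accept. The conversation content below is invented for illustration:

```python
import json

# Minimal ShareGPT-format record (one JSON object per line in a JSONL file).
# The turn content here is made up; the field names are standard ShareGPT.
record = {
    "conversations": [
        {"from": "human", "value": "Is my retry loop a code smell?"},
        {"from": "gpt",   "value": "Depends. Unbounded retries with no backoff? Yes."},
    ]
}

line = json.dumps(record)
assert json.loads(line) == record  # round-trips cleanly
print(line)
```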
License: Apache 2.0. Open weight. No guardrails.
Built by Saul Verdugo — independent ML researcher. OpusReasoning@proton.me