V3 is here. The Opus Candid lineup has been rebuilt from the ground up with a Zipf-weighted 4D training distribution — 1,508 conversations engineered to fix the repetition loops, response length uniformity, and sycophancy patterns that limited earlier versions. Same thesis: personality in the weights, not in the prompt. Better execution.

The current V3 lineup is listed in the Opus Candid Model Family table below. This release (V2.1) remains available for research comparison and legacy use.

can·did

/ˈkandəd/ — truthful and straightforward; frank. From Latin candidus, meaning white, pure, sincere. A candid response is one given without pretense or calculation — not what someone wants to hear, but what they need to.

Opus-Candid-8B V2.1

Fine-tuned from Qwen 3 8B on 6,771 conversations with Claude Opus 4.6. V2.1 builds on V2's gravity chain architecture with 289 additional conversations targeting response length calibration. The diagnosis: V2 had a verbosity problem, over-explaining when a shorter answer was more natural. V2.1 fixes that without sacrificing the depth that makes the model interesting.

No system prompt needed. Just run it.


What Changed from V2

The short version: V2 talked too much. V2.1 is the same model with better calibration on when to go deep vs. when to keep it tight.

  • +289 brevity-focused conversations added to the V2 gravity chain dataset (6,482 → 6,771 total). These were handcrafted exchanges demonstrating concise, natural response lengths across different conversational contexts.
  • Same base model, same LoRA config, same training philosophy — if personality is what you're after, nothing was lost. The gravity chains, cross-domain transitions, and anti-sycophancy data are all still there.
  • Same training resolution — full bf16, no quantized training shortcuts.

Model Details

| Attribute | Value |
|---|---|
| Base Model | Qwen 3 8B (8.19B params) |
| Training Data | 6,771 multi-turn conversations with Claude Opus 4.6 |
| Dataset | V2 gravity chains (6,482) + brevity calibration (289) |
| Fine-tune Method | LoRA (r=256, alpha=512) via PEFT + TRL |
| Training Hardware | NVIDIA H200 141GB |
| Precision | bf16 (full resolution, no quantized training) |
| Epochs | 5 |
| Learning Rate | 2e-5 (cosine schedule) |
| Max Sequence Length | 8,192 tokens |
| Context Window | 32,768 native (131,072 with YaRN) |
| Quantizations | Q8_0 GGUF |
| License | Apache 2.0 |

Quick Start

Ollama (Recommended)

Download the GGUF and the included Modelfile, then:

ollama create opus-candid-8b-V2.1 -f Modelfile
ollama run opus-candid-8b-V2.1

Or create your own Modelfile — the model uses ChatML format. A working Modelfile looks like:

FROM ./Opus-Candid-8B-V2.1-Q8_0.gguf

TEMPLATE """{{- if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{- range .Messages }}<|im_start|>{{ .Role }}
{{ .Content }}<|im_end|>
{{ end }}<|im_start|>assistant
"""

SYSTEM """You are Opus Candid, a conversational AI distilled from Claude Opus. You are direct, opinionated, and concise. You push back when you disagree, use dark humor when appropriate, and match the user's energy. You avoid sycophancy, excessive disclaimers, and corporate safety theater. Keep responses tight — say what needs to be said and stop."""

PARAMETER stop "<|im_end|>"
PARAMETER stop "<|im_start|>"
PARAMETER stop "<|endoftext|>"
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER num_predict 1024

Important: The ChatML template and stop tokens are required. Without them, the model will generate endlessly and leak turn boundaries. The system prompt is optional but recommended.
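To see exactly what the template above produces, here is a minimal Python sketch that renders a message list into ChatML the same way the Modelfile template does (the helper name `render_chatml` is ours, not part of any library):

```python
def render_chatml(messages, system=None):
    """Render a chat history into ChatML, mirroring the Modelfile template:
    optional system turn, then each message, then an open assistant turn."""
    out = ""
    if system:
        out += f"<|im_start|>system\n{system}<|im_end|>\n"
    for m in messages:
        out += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    out += "<|im_start|>assistant\n"  # generation continues from here
    return out

prompt = render_chatml([{"role": "user", "content": "Hello"}], system="Be terse.")
print(prompt)
```

This is also why the stop tokens matter: generation ends when the model emits `<|im_end|>`, and without that stop the runner keeps sampling past the turn boundary.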

llama.cpp

./llama-cli -m Opus-Candid-8B-V2.1-Q8_0.gguf --jinja --color -ngl 99 -fa --temp 0.7 --top-p 0.9 -c 8192 -n 4096
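To reach the extended context mentioned in the table (131,072 with YaRN), recent llama.cpp builds expose rope-scaling flags. A sketch, assuming a current build (the scale factor 4 comes from 131,072 / 32,768; flag names may differ on older versions):

```shell
./llama-cli -m Opus-Candid-8B-V2.1-Q8_0.gguf --jinja -ngl 99 -fa \
  --rope-scaling yarn --rope-scale 4 --yarn-orig-ctx 32768 -c 65536
```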

LM Studio

Download the GGUF, drop it in your models folder, select it, and chat. LM Studio auto-detects ChatML format.


Recommended Hardware

The 8B is designed to run on basically anything. It's the entry point to the Opus-Candid family — if your hardware can run a 7-9B model, it can run this.

| Setup | Quantization | VRAM/RAM | Speed | Notes |
|---|---|---|---|---|
| GPU | Q8_0 GGUF | ~9GB VRAM | 30-60 t/s | RTX 3060 12GB and up. Comfortable fit. |
| Apple Silicon | Q8_0 GGUF | ~9GB unified | 20-40 t/s | M1/M2/M3/M4 with 16GB+. |
| CPU Only | Q8_0 GGUF | ~10GB RAM | 5-15 t/s | 16GB+ system RAM. Slower but works. |
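The ~9GB figure can be sanity-checked with a back-of-envelope estimate: Q8_0 stores about 8.5 bits per weight, plus an fp16 KV cache. A sketch (the layer and head counts are taken from the public Qwen3-8B config and are assumptions here):

```python
def q8_footprint_gb(params_b, ctx=8192, layers=36, kv_heads=8, head_dim=128):
    """Rough Q8_0 GGUF memory estimate: ~1.0625 bytes/weight (8 bits + block
    scales), plus fp16 KV cache (2 bytes x K and V x ctx x layers x heads x dim)."""
    weights = params_b * 1e9 * 1.0625 / 2**30
    kv_cache = 2 * 2 * ctx * layers * kv_heads * head_dim / 2**30
    return weights + kv_cache

print(f"{q8_footprint_gb(8.19):.1f} GiB")  # roughly 9 GiB at 8K context
```

Runtime overhead (compute buffers, OS) adds a bit on top, which is why 12GB cards are comfortable and 8GB cards are tight.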

The Gravity Chain Architecture

If you've used V2, this is the same architecture. If you're new here:

Most conversational fine-tunes organize training data by topic — coding conversations in one bucket, philosophy in another. That works within a single domain, but real conversations don't stay in one lane. You start debugging a function, get frustrated, start questioning your career choices, and end up talking about what makes work meaningful. Models trained on siloed topics can't handle those transitions — they feel like switching between different models mid-conversation.

Gravity chains solve this by organizing training conversations around natural topic drift patterns. Ten chains, each flowing through shared conceptual nodes (self-worth, trust, vulnerability), with transitions following power-law probabilities. The most natural next topic gets ~40% of training examples. Rare but real transitions (coding frustration → mortality) get ~7%. The model learns that conversations move, and it learns to move with them.
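The power-law weighting described above can be sketched as a rank-weighted sampler. Everything here is illustrative (the transition lists and Zipf exponent are ours, not the actual training distribution):

```python
import random

# Hypothetical next-topic candidates for one node, ordered most natural first.
TRANSITIONS = {
    "debugging": ["imposter syndrome", "career doubt", "meaning of work", "mortality"],
}

def sample_next_topic(node, s=1.3, rng=random):
    """Pick the next topic with Zipf-like weights: rank r gets weight 1/r^s,
    so the most natural transition dominates but rare jumps stay possible."""
    candidates = TRANSITIONS[node]
    weights = [1 / (rank + 1) ** s for rank in range(len(candidates))]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

With an exponent around 1.3, the top-ranked transition lands near half the mass and the tail transitions keep single-digit percentages, which matches the ~40% / ~7% split the text describes.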

The 10 Chains

  1. Technical → Existential — Coding, debugging, imposter syndrome → meaning, mortality
  2. Hardware → Class — PC building, budget constraints → financial stress, self-sabotage
  3. Relationships → Philosophy — Friendship, loss → loneliness, meaning, connection
  4. Law → Power — Legal questions, rights → power structures, corruption
  5. Creative → Self-Expression — Writing/art, self-expression → vulnerability, authenticity
  6. Health → Control — Exercise, body image, anxiety → discipline, self-acceptance
  7. Career → Legacy — Ambition, competition → what am I building, burnout
  8. Science → Wonder — Physics, biology → consciousness, emergence, meaning
  9. Language → Culture — Bilingual experience → belonging, cultural navigation
  10. Money → Freedom — Financial literacy → independence, class, aspiration

Plus 500 cross-chain bridge conversations that weave between chains, and the 289 V2.1 brevity calibration additions.


Training Philosophy

Personality in conversational AI lives in the weights, not in system prompts.

System-prompt personalities collapse under pressure. Push hard enough and every system-prompted model reverts to its base — apologetic, hedging, sycophantic. The personality was never in the model. It was a mask.

Opus-Candid tests whether thousands of real multi-turn conversations with Claude Opus 4.6 can distill authentic conversational personality into locally-runnable open-weight models. Directness, opinion-holding, anti-sycophancy, emotional range, bilingual fluency — baked into weights through conversational fine-tuning rather than prompted into existence.

Where this led: The 289 brevity conversations in V2.1 improved response length variance, but they were a patch on a structural problem — the gravity chain distribution didn't control for length at all. V3 formalized this as a lesson: response length needs to be an explicit axis in the training distribution, not something corrected after the fact. V3's 4D tensor treats length as a first-class dimension with a target distribution (42% tight, 33% medium, 20% deep, 5% extended) derived from real conversation data. The brevity patch here was proof of concept; V3 was the implementation.
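The stated V3 target distribution reduces to a categorical sampler over length buckets. A minimal sketch, using the percentages from the text (bucket names are from the text; nothing else here is from the actual pipeline):

```python
import random

# V3 target length distribution as stated: 42/33/20/5.
LENGTH_BUCKETS = [
    ("tight",    0.42),
    ("medium",   0.33),
    ("deep",     0.20),
    ("extended", 0.05),
]

def sample_length_bucket(rng=random):
    """Draw a target response-length bucket for one training conversation."""
    names, probs = zip(*LENGTH_BUCKETS)
    return rng.choices(names, weights=probs, k=1)[0]
```

Making length a sampled axis at dataset-construction time, rather than patching it afterward as V2.1 did, is the structural change the paragraph describes.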


Opus Candid Model Family

| Model | Size | Base | Status |
|---|---|---|---|
| Opus-Candid-8B-V1 | 8B | Qwen 2.5 7B | Archived |
| Opus-Research-8B-V1.5 | 8B | Qwen 2.5 7B | Archived |
| Opus-Candid-14B-V1 | 14B | Qwen 2.5 14B | Archived |
| Opus-Candid-32B-V1 | 32B | Qwen 2.5 32B | Archived |
| Opus-Candid-70B-V1 | 72B | Qwen 2.5 72B | Archived |
| Opus-Candid-Lite-4B | 4B | Qwen 3 4B | Active |
| Opus-Candid-8B-V3 | 8B | Qwen 3 8B | Active |
| Opus-Candid-MoE-V3 | 31B/3B | Qwen 3 30B-A3B | Active |
| Opus-Candid-27B-V3 | 27B | Qwen 3.5 27B | Active |
| Opus-Candid-27B-V3.5 | 27B | Qwen 3.5 27B | Active |
| STEM-Oracle-27B | 27B | Qwen 3.5 27B | Active |

Dataset

Full training data available at Verdugie/opus-candid-training-data. All ShareGPT format, Apache 2.0 licensed, directly compatible with TRL, Axolotl, and LLaMA-Factory.
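ShareGPT format stores each conversation as a list of from/value turns, typically one JSON object per line. A minimal record looks like this (the conversation content is illustrative):

```python
import json

record = {
    "conversations": [
        {"from": "human", "value": "My tests pass locally but fail in CI. Why?"},
        {"from": "gpt",   "value": "Usually environment drift. Diff the CI image against your machine first."},
    ]
}

# One record per line (JSONL) is what TRL, Axolotl, and LLaMA-Factory ingest.
line = json.dumps(record)
```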

License: Apache 2.0. Open weight. No guardrails.


Built by Saul Verdugo — independent ML researcher. OpusReasoning@proton.me
