Ornstein-122-A10B-GGUF

GGUF quantizations of DJLougen/Ornstein-122-A10B — a reasoning-focused fine-tune of Qwen 3.5 122B-A10B (MoE, ~10B active per token) trained on high-quality reasoning traces curated through a custom Drift Diffusion Modeling (DDM) pipeline.


Support This Work

I'm a PhD student in visual neuroscience at the University of Toronto who also happens to spend way too much time fine-tuning, merging, and quantizing open-weight models on rented H100s and a local DGX Spark. All training compute is self-funded — balancing GPU costs against a student budget. If my uploads have been useful to you, consider buying a PhD student a coffee. It goes a long way toward keeping these experiments running.

Support on Ko-fi



What Makes Ornstein Different

Unlike typical reasoning fine-tunes that train on large volumes of synthetic data, Ornstein takes a quality-over-quantity approach:

  • Detects degenerate reasoning: identifies "fake" reasoning that mimics thought without substance (hedging, restating, circling)
  • Premium vs. degenerate split: the DDM pipeline cleanly separates premium from degenerate reasoning traces
  • High-fidelity curation: near-perfect AUC in separating premium from degenerate traces, with >99% sensitivity
  • MoE efficiency: 122B total parameters with only ~10B active per token — big model reasoning at a fraction of the compute

The model uses <think>...</think> blocks for extended multi-phase reasoning with self-correction and verification before providing final answers.
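If the output is consumed programmatically, the reasoning block can be stripped before presenting the final answer. A minimal sketch using bash parameter expansion; the sample response string is illustrative, not actual model output:

```shell
# Strip the <think>...</think> reasoning block from a response,
# keeping only the final answer. The sample string is illustrative.
response='<think>Compute 2+2. Verify: 4.</think>The answer is 4.'
answer="${response#*</think>}"   # drop everything through the closing tag
echo "$answer"
```

This relies on `${var#pattern}` removing the shortest matching prefix, so only the first `</think>` and everything before it is dropped.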


Available Quantizations

Note: Uploads are in progress — more quantizations may be added.

Quantization  Size             Use Case
F16           ~261 GB (split)  Full precision, no quality loss
Q4_K_M        ~74 GB (split)   Best quality/size trade-off, recommended
Q8_0          ~83 GB (split)   Higher precision, minimal quality loss
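To avoid downloading every quantization, huggingface-cli supports glob filters on the file list. A sketch fetching only the Q4_K_M files, assuming their names follow the same pattern as the F16 shards shown below:

```shell
# Download only the Q4_K_M shards (~74 GB) instead of the full repo.
# The "*Q4_K_M*" glob assumes the filenames follow the naming used in this card.
huggingface-cli download DJLougen/Ornstein-122-A10B-gguf \
  --include "*Q4_K_M*" \
  --local-dir .
```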

Quick Start

llama.cpp

# Download a quantization (example: F16 split files)
huggingface-cli download DJLougen/Ornstein-122-A10B-gguf --local-dir .

# Run with llama.cpp
./llama-cli -m Ornstein-122-A10B-F16-00001-of-00006.gguf \
  -p "You are a helpful reasoning assistant." \
  --temp 0.6 -n 8192
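llama.cpp loads all shards automatically when pointed at the first split file, but if a single file is preferred the shards can be combined with the llama-gguf-split tool (assuming a recent llama.cpp build that ships it):

```shell
# Merge the split F16 shards into a single GGUF (needs ~261 GB free disk).
# llama-gguf-split ships with recent llama.cpp builds.
./llama-gguf-split --merge \
  Ornstein-122-A10B-F16-00001-of-00006.gguf \
  Ornstein-122-A10B-F16.gguf
```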

Ollama

# Create a Modelfile
cat <<EOF > Modelfile
FROM ./Ornstein-122-A10B-F16-00001-of-00006.gguf
PARAMETER temperature 0.6
PARAMETER num_predict 8192
SYSTEM "You are a helpful reasoning assistant."
EOF

ollama create ornstein-122 -f Modelfile
ollama run ornstein-122

LM Studio

  1. Download the desired quantization from the Files tab
  2. Load it in LM Studio
  3. Set context length to 8192 for full reasoning depth

Recommended Settings

Parameter       Suggested Value
Temperature     0.6
Top-P           0.95
Max Tokens      8192
Repeat Penalty  1.1
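These settings map directly onto llama.cpp server flags. A sketch serving the model with the recommended values; the Q4_K_M shard filename is assumed for illustration:

```shell
# Serve with the recommended sampling settings
# (OpenAI-compatible API on port 8080). Filename is assumed.
./llama-server -m Ornstein-122-A10B-Q4_K_M-00001-of-00002.gguf \
  --temp 0.6 \
  --top-p 0.95 \
  --repeat-penalty 1.1 \
  -n 8192 \
  -c 8192 \
  --port 8080
```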

Training Details

Parameter              Value
Base Model             unsloth/Qwen3.5-122B-A10B
Architecture           Mixture-of-Experts (122B total, ~10B active)
Method                 LoRA (rank 32, alpha 32)
Dropout                0.0
Epochs                 1
Learning Rate          1e-4 (cosine schedule, 10% warmup)
Max Sequence Length    8192
Micro Batch Size       1
Gradient Accumulation  4 steps
Weight Decay           0.01
LoRA Targets           q_proj, k_proj, v_proj, o_proj
Framework              Unsloth

Intended Use

Designed for tasks requiring structured, multi-step reasoning:

  • Mathematics
  • Logic problems
  • Code analysis
  • Scientific problems
  • Complex question answering

The MoE architecture makes it practical to run 122B-class reasoning on hardware that couldn't handle a dense model of the same size.


Limitations

  • Single-epoch training means the model retains most base Qwen 3.5 122B-A10B behavior; the fine-tune primarily shapes reasoning style rather than injecting new knowledge
  • Language scope: the DDM pipeline is optimized for English; other languages reflect base model performance
  • Edge cases: Extended thinking can occasionally loop on adversarial or highly ambiguous prompts
  • Size: Even quantized, the 122B MoE model requires substantial storage and memory

Citation

@misc{ornstein122a10b,
  author = {DJLougen},
  title = {Ornstein-122-A10B: DDM-Curated Reasoning Fine-Tune of Qwen 3.5 122B-A10B},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/DJLougen/Ornstein-122-A10B}
}
