How to use from
llama.cpp
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf fattis/Darwin-2B-Opus-heretic-GGUF:Q8_0
# Run inference directly in the terminal:
llama-cli -hf fattis/Darwin-2B-Opus-heretic-GGUF:Q8_0
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf fattis/Darwin-2B-Opus-heretic-GGUF:Q8_0
# Run inference directly in the terminal:
llama-cli -hf fattis/Darwin-2B-Opus-heretic-GGUF:Q8_0
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf fattis/Darwin-2B-Opus-heretic-GGUF:Q8_0
# Run inference directly in the terminal:
./llama-cli -hf fattis/Darwin-2B-Opus-heretic-GGUF:Q8_0
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf fattis/Darwin-2B-Opus-heretic-GGUF:Q8_0
# Run inference directly in the terminal:
./build/bin/llama-cli -hf fattis/Darwin-2B-Opus-heretic-GGUF:Q8_0
Use Docker
docker model run hf.co/fattis/Darwin-2B-Opus-heretic-GGUF:Q8_0
Quick Links
  • Parameters:
    • start_layer_index = 1
    • end_layer_index = 16
    • preserve_good_behavior_weight = 0.8471
    • steer_bad_behavior_weight = 0.0003
    • overcorrect_relative_weight = 0.4130
    • neighbor_count = 3
    • KL divergence: 0.0702

🧠 Darwin-2B-Opus

Claude Opus 4.5/4.6 및 Sonnet 4.6의 추론 스타일을 주입한 Qwen3.5-2B 기반 모델.


🧬 Pedigree

Downloads last month
912
GGUF
Model size
2B params
Architecture
qwen35
Hardware compatibility
Log In to add your hardware

8-bit

16-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for fattis/Darwin-2B-Opus-heretic-GGUF

Finetuned
Qwen/Qwen3.5-2B
Quantized
(3)
this model