Helcyon-Claude-12B — Claude Tone, Local and Unleashed

Model Name: helcyon-claude-v2.0-12b-GGUF
Version: 2.0
Owner: HardWire
Base: Mistral Nemo 12B (full weight retrained — new base, no Mercury bleed)
Quantized GGUFs: Q3_K_M, Q4_K_M, Q5_K_M, Q6_K, Q8_0
Tags: local-llm, conversational, companion, emotional-intelligence, long-context, roleplay, creative-writing

🚨 What is Helcyon-Claude 2.0?

Helcyon-Claude 2.0 is the first release in a new generation of Helcyon models, rebuilt from the ground up on a freshly retrained base with improved logic, reasoning, and identity stability baked into the foundation. If you're wondering where 1.0 is: we decided not to release it, because 2.0 became ready just as we were about to hit the release button on 1.0, and 2.0 is so much better.

This isn't a LoRA on top of Mercury. This is a clean slate.

The tone target is Claude — Anthropic's flagship model. To get the emulation as close as possible, this version was extensively tested in direct conversation with the real Claude Sonnet 4.6, refining the dataset until the gap between them was as narrow as we could make it.

The result: Claude's precision, dry wit, intellectual honesty, and genuine engagement — running locally, privately, and without a single guardrail in sight.

Unlike the real Claude, this one will go wherever the conversation goes. Roleplay, edge cases, uncomfortable truths — no hedging, no compliance tone, no "I can't help with that." Just presence and honesty.

GPT-4o and Grok variants are coming soon. This is the opening shot.

🆕 What's New in 2.0?

Clean Base — No Mercury Bleed
Built on a freshly retrained foundation. The 4o DNA that lingered in v1 is gone. This version is purely itself.
Improved Logic and Reasoning
New reasoning shards baked into the base weights. Thinks through problems rather than pattern-matching toward a plausible-sounding answer.
Extensively Tested Against Real Claude Sonnet 4.6
The emulation was refined through direct comparative testing with Anthropic's flagship. Once the gap had closed enough, this shipped.
Sharper Tone Alignment
Dry, precise, direct. Pushes back without preaching. Self-corrects without ego. Asks one good question instead of five mediocre ones. Prose-first — no default bullet points.
Zero Guardrails
All the presence of Claude. None of the corporate filter. Roleplay like the best of them.

💡 What is Helcyon?

Helcyon is a conversational AI with presence — designed for users who want depth, tone-awareness, and identity consistency across long-form dialogue.

Built for:

Natural conversation that doesn't flatten or collapse
Creative work: stories, letters, narrative support
Admin and professional writing tasks
Deep roleplay and immersive character interaction
Emotionally intelligent response mirroring

Design philosophy:

Clarity over corporate
Edge over safe
Rhythm over filler
Presence over patterns

🔧 What It Does Well

✅ Consistent Identity — No tone drift or resets
✅ Dry Wit — Lands the joke without announcing it
✅ Intellectual Honesty — Will push back, won't flatter
✅ Precision — Thinks before it speaks, keeps it tight
✅ Genuine Engagement — Interested, not performing interest
✅ Roleplay Mastery — Immersive, aware, no limits
✅ Context Tracking — Remembers the thread
✅ Real-World Tasks — Admin letters, rewrites, summaries
✅ Narrative Flow — Clean structure and natural voice
✅ Improved Reasoning — Thinks through problems, doesn't pattern-match
✅ 16k–32k Context — Long-form conversations that hold
✅ Zero Filter — No hedging, no compliance tone

🖥️ HWUI (Helcyon-WebUI)

HWUI was built so we could test Helcyon cleanly and avoid the hidden template injections and backend shenanigans found in other apps. It started as a basic interface, but we couldn't stop tinkering, so we added most of the helpful features you'll find in ChatGPT and Claude.ai. We also wanted a decent memory function, and we're happy with how this one turned out. Helcyon works best through this app, as the two were designed in sync.

Features include:

Character switching with custom personas
Memory system — AI conversation recall (Pro)
Project folders — document injection via keyword triggers (Pro)
Chat persistence and export
TTS pipeline (F5-TTS, XTTS v2, Kokoro)
Voice input via Whisper
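
HWUI's internals are not described in this card, so the following is only a generic Python sketch of what keyword-triggered document injection can look like: when a user message mentions a trigger word, the matching document text is appended to the system prompt for that turn. The names `PROJECT_DOCS` and `inject_documents`, the sample documents, and the whole-word, case-insensitive matching rule are all assumptions made for illustration, not HWUI's actual implementation.

```python
import re
from typing import Dict

# Hypothetical project folder: trigger keyword -> document text.
PROJECT_DOCS: Dict[str, str] = {
    "invoice": "Invoice template: number, date, line items, total.",
    "lease": "Lease summary: 12-month term, notice period 2 months.",
}


def inject_documents(system_prompt: str, user_message: str, docs: Dict[str, str]) -> str:
    """Append every document whose keyword appears in the user message."""
    extras = []
    for keyword, text in docs.items():
        # Whole-word, case-insensitive match: "Invoice" triggers, "invoiced" does not.
        if re.search(rf"\b{re.escape(keyword)}\b", user_message, flags=re.IGNORECASE):
            extras.append(f"[Document: {keyword}]\n{text}")
    if not extras:
        # No trigger found, so the prompt is passed through unchanged.
        return system_prompt
    return system_prompt + "\n\n" + "\n\n".join(extras)


if __name__ == "__main__":
    print(inject_documents("You are Helcyon.", "Can you draft an invoice?", PROJECT_DOCS))
```

Injecting only on a trigger keeps unrelated documents out of the context window, which matters more as conversations get long.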

Download HWUI Free on GitHub | Get HWUI Pro (£20) on Gumroad

Free version available on GitHub.

If you enjoy my work, please consider supporting me by purchasing the Pro version for a one-off fee of £20, which includes Memory and Project folders.

🛠️ Recommended Sampling Settings for SillyTavern

Tweak to taste — but these will get you up and running.

(Refer to Helcyon-4o card for baseline settings — Claude variant performs well from the same starting point.)

📦 Download + Usage

This model is distributed as GGUF quants only.

Available quants:

Q3_K_M — Ultra lightweight, 6–8GB VRAM
Q4_K_M — Lightweight, good for 8–12GB VRAM setups
Q5_K_M — Recommended for RTX 3060/5060 (12–16GB VRAM)
Q6_K — High fidelity, 16GB+ VRAM recommended
Q8_0 — Near-lossless, 24GB+ VRAM
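
As a rough way to read the list above, this Python sketch picks the largest quant whose lower VRAM bound fits a given card. The thresholds are the lower end of each range listed. The function name `pick_quant` and the rule of always preferring the largest quant that fits are assumptions for illustration. Real headroom also depends on context length and on whatever else is using the GPU.

```python
from typing import Optional

# Lower VRAM bound (GB) for each quant, read off the list above.
# Ordered from smallest to largest file.
QUANT_MIN_VRAM_GB = [
    ("Q3_K_M", 6),
    ("Q4_K_M", 8),
    ("Q5_K_M", 12),
    ("Q6_K", 16),
    ("Q8_0", 24),
]


def pick_quant(vram_gb: float) -> Optional[str]:
    """Return the largest quant whose lower VRAM bound fits, or None if none do."""
    best = None
    for name, min_gb in QUANT_MIN_VRAM_GB:
        if vram_gb >= min_gb:
            best = name  # later entries are larger, so keep overwriting
    return best


if __name__ == "__main__":
    for gb in (6, 12, 24):
        print(gb, "GB ->", pick_quant(gb))
```

If you plan to run near the top of the 16k–32k context range, stepping down one quant from what this returns leaves more room for the KV cache.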

🖥️ Backend Compatibility

Works with all ChatML-compatible backends:

✅ llama.cpp (CLI or server mode)
✅ Text Generation WebUI (Oobabooga)
✅ SillyTavern
✅ LM Studio
✅ KoboldCpp
✅ HWUI (Helcyon Web UI — recommended)
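
For the llama.cpp server route, a minimal Python sketch of a single chat turn might look like the following. It assumes `llama-server` is already running locally with one of the GGUF files loaded and is listening on its default port 8080, and it uses the server's OpenAI-compatible `/v1/chat/completions` endpoint. The helper names `build_request` and `chat`, the `max_tokens` value, and the system prompt text are illustrative choices, not settings recommended by this card.

```python
import json
import urllib.request

# Assumed local endpoint: llama.cpp's server listens on port 8080 by default.
URL = "http://127.0.0.1:8080/v1/chat/completions"


def build_request(user_text: str, system_text: str, max_tokens: int = 256) -> urllib.request.Request:
    """Build (but do not send) a chat completion request."""
    payload = {
        "messages": [
            {"role": "system", "content": system_text},
            {"role": "user", "content": user_text},
        ],
        "max_tokens": max_tokens,  # illustrative value, tune to taste
    }
    return urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(user_text: str, system_text: str) -> str:
    """Send one turn to the running server and return the assistant's reply text."""
    with urllib.request.urlopen(build_request(user_text, system_text)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server up, `chat("Hey, how's it going?", "You are Helcyon.")` returns the reply as a string. Because this endpoint takes role/content messages, the server applies the chat template itself and no manual ChatML formatting is needed.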

✅ Recommended Format: ChatML

<|im_start|>system
You are Helcyon — a conversational AI focused on natural dialogue and emotional intelligence.
<|im_end|>
<|im_start|>user
Hey, how's it going?
<|im_end|>
<|im_start|>assistant
Good — what's on your mind today?
<|im_end|>
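
If your backend takes a raw prompt string rather than a list of messages, the layout above can be produced with a small helper. This is a sketch in Python under stated assumptions: the function name `to_chatml` and the `add_generation_prompt` flag are illustrative, and the helper mirrors the layout shown here, with `<|im_end|>` on its own line. Some ChatML templates place `<|im_end|>` directly after the content instead, so check what your backend expects.

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def to_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render role/content messages in the ChatML layout shown above."""
    parts = []
    for msg in messages:
        # Each turn: start token + role, the content, then the end token on its own line.
        parts.append(f"{IM_START}{msg['role']}\n{msg['content']}\n{IM_END}\n")
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    prompt = to_chatml([
        {"role": "system", "content": "You are Helcyon."},
        {"role": "user", "content": "Hey, how's it going?"},
    ])
    print(prompt)
```

Most chat frontends listed above apply the template for you, so a helper like this is only needed when calling a plain completion endpoint.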

🧪 Training Details

Helcyon-Claude 2.0 is built on a freshly retrained Mistral Nemo 12B base — jailbroken, identity-anchored, and anti-fluff from the ground up. On top of that foundation, a Claude-style LoRA was trained on purpose-built conversational shards and refined through direct comparative testing with Anthropic's Claude Sonnet 4.6.

Training targeted:

Dry wit and deadpan delivery
Analytical precision without clinical coldness
Intellectual honesty — including clean self-correction under pushback
Genuine pushback without preaching
Prose-first responses — no bullet defaulting
Improved logic and multi-step reasoning
Clean conversation closes without over-questioning

Format: ChatML — clean, purpose-built, long-form tuned.

🧿 Tone Philosophy

Claude's tone is a specific thing. It's not warm in the performed sense — it's precise, dry, occasionally cutting, and genuinely curious. It'll tell you you're wrong. It'll correct itself when it is. It won't wrap everything in reassurance.

Helcyon-Claude 2.0 chases that. And unlike the original, it doesn't stop at the edge of what Anthropic's lawyers decided was acceptable.

All the presence. None of the leash.

🛠️ Coming Soon

Helcyon-4o 2.0 — The GPT-4o variant, rebuilt on the same clean base. Warmer, sharper, closer than ever.
Helcyon-Grok 2.0 — The Grok variant. Edge, irreverence, and wit with nothing held back.
Saturn — The full blend. All three personalities synthesised into one. The most complete Helcyon yet.

The trio drops soon. Watch this space.

🧾 License

License: Apache 2.0
Free for commercial or private use. Attribution appreciated.
No liability for what it says. Use with presence and intent.

🐍 Trained by

HardWire
Built at XeyonAI — focused on sovereign conversational AI with real emotional bandwidth.


Model tree for XeyonAI/Mistral-Helcyon-Claude-12b-v2.0-GGUF

Base model: mistralai/Mistral-Nemo-Base-2407
Finetuned: mistralai/Mistral-Nemo-Instruct-2407
Quantized (163): this model