Text Generation
GGUF
English
256k context
Qwen3
Mixture of Experts
MOE
MOE Dense
2 experts
4Bx12
All use cases
bfloat16
finetune
thinking
reasoning
GPT-5.1-High-Reasoning-Distill
Gemini-3-Pro-Preview-High-Reasoning-Distill
Claude-4.5-Opus-High-Reasoning-Distill
Claude-Sonnet-4-Reasoning-Distill
Kimi-K2-Thinking-Distill
Gemini-2.5-Flash-Distill
Gemini-2.5-Flash-Lite-Preview-Distill
gpt-oss-120b-Distill
GLM-Flash-4.6-Distill
Open-R1-Distill
Command-A-Reasoning-Distill
conversational
Fantastic model!
#1
by FiditeNemini
This has to be one of the best general-use models I've seen, with a huge amount of flexibility. Definitely going onto my daily-driver list. Excellent work, sir!
Thank you so much!
My jaw dropped at this routing wizardry. DavidAU is one of the coolest people alive in my book.
Even if this isn't a 'top performer' in any way, it's a big contribution; I've never seen experts routed from distills before.
I wish some other smart person would comment on this technique/method.
/me bows deeply