Doctor Shotgun
Doctor-Shotgun
AI & ML interests
Local ML enthusiast, LLM and diffusion finetuner, hobbyist developer
Recent Activity
- updated a model about 20 hours ago: CPU-Hybrid-MoE/MiniMax-M2.7-CPU-NUMA4-AMXINT8
- updated a model about 20 hours ago: CPU-Hybrid-MoE/GLM-5.1-CPU-NUMA4-AMXINT8
- updated a model about 20 hours ago: CPU-Hybrid-MoE/GLM-5-CPU-NUMA4-AMXINT8
Doc's Diffusion
Models/loras for image diffusion.
LLM Speculative Decoding Experiments
Tiny language models meant to serve as draft models for speculative decoding.
- Doctor-Shotgun/TinyLlama-1.1B-32k
  Text Generation • 1B • Updated • 609 • 30
- Doctor-Shotgun/TinyLlama-1.1B-32k-Instruct
  Text Generation • 1B • Updated • 641 • 13
- Doctor-Shotgun/smol_llama-220M-GQA-32k-theta
  Text Generation • Updated • 4 • 1
- Doctor-Shotgun/smol_llama-220M-GQA-32k-theta-sft
  Text Generation • Updated • 4 • 2
Magnum Diamond (24B/70B/123B)
Focusing on applying enough heat and pressure to dry, assistant-tuned models until they turn into creative writing gems!
- Doctor-Shotgun/ML2-123B-Magnum-Diamond
  Text Generation • 123B • Updated • 10 • 11
- Doctor-Shotgun/L3.3-70B-Magnum-Diamond
  Text Generation • 71B • Updated • 18 • 5
- Doctor-Shotgun/MS3.2-24B-Magnum-Diamond
  Text Generation • 24B • Updated • 96 • 56
- Doctor-Shotgun/ML2-123B-Magnum-Diamond-GGUF
  Text Generation • 123B • Updated • 314 • 6
Qwen 3 ScatterMoE
Drop-in implementation of https://github.com/shawntan/scattermoe for efficient training of Qwen 3 MoE.
- chargoddard/Qwen3-30B-A3B-Base-ScatterMoE
  31B • Updated • 3
- Doctor-Shotgun/Qwen3-30B-A3B-Instruct-2507-ScatterMoE
  Text Generation • 31B • Updated • 6 • 1
- Doctor-Shotgun/Qwen3-30B-A3B-Thinking-2507-ScatterMoE
  Text Generation • 31B • Updated • 16
- Doctor-Shotgun/Qwen3-Coder-30B-A3B-Instruct-ScatterMoE
  Text Generation • 31B • Updated • 18 • 1
Doc's Choice
Models that I personally recommend, periodically updated.