view article Article Reproducing and Validating Distributed Muon 🐢✨: A Practical Verification of Communication Efficiency Claims Dec 12, 2025 • 2
Nemotron-Pre-Training-Datasets Collection Large scale pre-training datasets used in the Nemotron family of models. • 12 items • Updated 4 days ago • 139
PaperBanana: Automating Academic Illustration for AI Scientists Paper • 2601.23265 • Published Jan 30 • 223
Leaderboards for Arabic Collection A collection for all leaderboards related to the Arabic Language. • 5 items • Updated Dec 9, 2025 • 8
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The model preserves similar quality as half precision while using 3x less memory • 15 items • Updated Mar 12 • 218
ZeroGPU Spaces Collection ZeroGPU Spaces made by the community • 17 items • Updated Jun 6, 2024 • 247
Qwen3 Collection Qwen's new Qwen3 models. In Unsloth Dynamic 2.0, GGUF, 4-bit and 16-bit Safetensor formats. Includes 128K Context Length variants. • 70 items • Updated 2 days ago • 269
EAGLE3 Collection The collection of eagle3 series models for Qwen3 and Hunyuan. • 15 items • Updated Jan 13 • 4
Unsloth Dynamic 2.0 Quants Collection New 2.0 version of our Dynamic GGUF + Quants. Dynamic 2.0 achieves superior accuracy & SOTA quantization performance. • 86 items • Updated 2 days ago • 532
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 38 items • Updated Mar 2 • 361
MedGemma Release Collection Collection of Gemma 3 variants for performance on medical text and image comprehension to accelerate building healthcare-based AI applications. • 9 items • Updated Mar 12 • 479