Jackrong/Qwen3.5-4B-Claude-4.6-Opus-Reasoning-Distilled Image-Text-to-Text • 5B • Updated 8 days ago • 15.7k • 28
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled Image-Text-to-Text • 28B • Updated 8 days ago • 585k • 2.62k
Article • A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons • Feb 4, 2025 • 33
LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters! Paper • 2502.07374 • Published Feb 11, 2025 • 40