Embarrassingly Simple Self-Distillation Improves Code Generation Paper • 2604.01193 • Published 11 days ago • 34
Qwopus3.5-v3 Collection 🌟 Qwopus3.5-v3 is the latest model in the Claude series. • 12 items • Updated 3 days ago • 70
The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain Paper • 2509.26507 • Published Sep 30, 2025 • 550
Cerebras REAP Collection Sparse MoE models compressed using the REAP (Router-weighted Expert Activation Pruning) method • 30 items • Updated Feb 25 • 135
ICONN 1 GenAI Collection Video and Image generation ICONN 1 models • 2 items • Updated Jun 18, 2025 • 2