Featured: Distilling 100B+ Models 40x Faster with TRL
LGAI-EXAONE/EXAONE-4.5-33B Image-Text-to-Text • 34B • Updated about 5 hours ago • 6.63k • 139
Jiunsong/supergemma4-26b-uncensored-gguf-v2 Text Generation • 25B • Updated 3 days ago • 26.7k • 259
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published 7 days ago • 309
Vanast: Virtual Try-On with Human Image Animation via Synthetic Triplet Supervision Paper • 2604.04934 • Published 9 days ago • 42
Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents Paper • 2604.06132 • Published 8 days ago • 114
UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience Paper • 2603.24533 • Published 21 days ago • 47
Look Where It Matters: High-Resolution Crops Retrieval for Efficient VLMs Paper • 2603.16932 • Published Mar 14 • 87
Post: Surya-1.1T: Scaling Beyond Human-Level Reasoning via 146 Trillion Token Pre-training
Author: SKT AI LABS
Affiliation: SKT AI Labs / Project Surya
Model Architecture: Optimized Dense Transformer
Parameters: 1.1 Trillion
Training Tokens: 146 Trillion
Want to collaborate? We have collected 146 trillion tokens and completed pre-training, but we need to make the model more powerful.
Whitepaper: https://github.com/SHRIJANAGAIN/PROFF