Running • Featured • Distilling 100B+ Models 40x Faster with TRL • 33 • TRL distillation for 100B+ teachers, 40x faster
Article • Multimodal Embedding & Reranker Models with Sentence Transformers • 5 days ago • 38
Article • AssetOpsBench: Bridging the Gap Between AI Agent Benchmarks and Industrial Reality • Jan 21 • 32
Article • Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries • Mar 10 • 124
Running on CPU Upgrade • The Synthetic Data Playbook: Generating Trillions of the Finest Tokens • 219 • Explore synthetic data experiments on a virtual bookshelf
Running on CPU Upgrade • Featured • The Smol Training Playbook • 3.1k • The secrets to building world-class LLMs
Running • Featured • QED-Nano: Teaching a Tiny Model to Prove Hard Theorems • 70 • Who needs 1T parameters? Olympiad proofs with a 4B model
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model • Paper • 2510.14528 • Published Oct 16, 2025 • 124
Article • Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA • May 24, 2023 • 176