Running Featured 71 QED-Nano: Teaching a Tiny Model to Prove Hard Theorems - Who needs 1T parameters? Olympiad proofs with a 4B model
Running 79 Maintain the unmaintainable - Explore the complex relationships between 400+ machine learning models
Running 221 FineVision: Open Data is All You Need - A new open-source dataset for training VLMs
Running 91 Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks - Evaluate multilingual models using FineTasks
Running on CPU Upgrade 220 The Synthetic Data Playbook: Generating Trillions of the Finest Tokens - Explore synthetic data experiments on a virtual bookshelf
Running on CPU Upgrade 13.9k Open LLM Leaderboard - Track, rank and evaluate open LLMs and chatbots
Running on CPU Upgrade Featured 3.11k The Smol Training Playbook - The secrets to building world-class LLMs
Running 3.79k The Ultra-Scale Playbook - The ultimate guide to training LLMs on large GPU clusters
Runtime error Agents Featured 2.77k XTTS - Generate speech from text using a reference voice
Running 596 Scaling test-time compute - Run advanced search strategies to boost LLM problem solving
Running Featured 1.33k FineWeb: decanting the web for the finest text data at scale - Read a detailed overview of the FineWeb web-scale text dataset