view article Article From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels Aug 18, 2025 • 97
view article Article The Engineering Handbook for GRPO + LoRA with Verl: Training Qwen2.5 on Multi-GPU Jan 2 • 19
Running on CPU Upgrade Featured 3.11k The Smol Training Playbook 📚 3.11k The secrets to building world-class LLMs
view article Article nanoVLM: The simplest repository to train your VLM in pure PyTorch +5 May 21, 2025 • 255
Running 3.79k The Ultra-Scale Playbook 🌌 3.79k The ultimate guide to training LLM on large GPU Clusters
Model Merging Collection Model Merging is a very popular technique nowadays in LLM. Here is a chronological list of papers on the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 253
Can this Model Also Recognize Dogs? Zero-Shot Model Search from Weights Paper • 2502.09619 • Published Feb 13, 2025 • 36