🚀 TRL v0.29.0 introduces trl-training: an agent-native training skill.
This makes the TRL CLI a structured, agent-readable capability, allowing AI agents to reliably execute training workflows such as:
- Supervised Fine-Tuning (SFT)
- Direct Preference Optimization (DPO)
- Group Relative Policy Optimization (GRPO)
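The CLI wraps TRL's trainers, so the SFT workflow it drives boils down to something like this; a minimal sketch using TRL's Python API (the model id and dataset are illustrative placeholders):

```python
# A minimal sketch of an SFT run with TRL's Python API; the base model
# and dataset below are placeholders, not a recommended recipe.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Any conversational or plain-text dataset from the Hub works here.
dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",                # placeholder base model
    args=SFTConfig(output_dir="sft-output"),  # where checkpoints land
    train_dataset=dataset,
)
trainer.train()
```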
We're excited to see what the community builds on top of this.
If you're working on AI agents, alignment research, or scalable RL training infrastructure: give TRL v0.29.0 a try! 🤗
Interesting article: use Claude Code to help open models write CUDA kernels (for example) by turning Claude Code traces into Skills. They made a library out of it 🚀
We prepared the 2025 version of the HF AI Timeline Grid, highlighting open vs API-based model releases, and allowing you to browse and filter by access, modality, and release type!
1️⃣ Q1: Learning to Reason
DeepSeek not only releases a top-notch reasoning model, but also shows how to train one and compete with closed frontier models. OpenAI debuts Deep Research.
Significant milestones: DeepSeek R1 & R1-Zero, Qwen 2.5 VL, OpenAI Deep Research, Gemini 2.5 Pro (experimental)
2️⃣ Q2: Multimodality and Coding
More LLMs embrace multimodality by default, and there's a surge in coding agents. Strong vision, audio, and generative models emerge.
Significant milestones: Llama 4, Qwen 3, Imagen 4, OpenAI Codex, Google Jules, Claude 4
3️⃣ Q3: "Gold" rush, OpenAI opens up, the community goes bananas
Flagship models earn gold at math olympiads and on hard benchmarks. OpenAI releases strong open-source models, and Google ships the much-anticipated nano-banana for image generation and editing. Agentic workflows become commonplace.
Significant milestones: Gemini and OpenAI IMO Gold, gpt-oss, Gemini 2.5 Flash Image, Grok 4, Claude Sonnet 4.5
4️⃣ Q4: Mistral returns, leaderboard hill-climbing
Mistral is back with updated model families. All labs release impressive models to wrap up the year!
Significant milestones: Claude Opus 4.5, DeepSeek Math V2, FLUX 2, GPT 5.1, Kimi K2 Thinking, Nano Banana Pro, GLM 4.7, Gemini 3, Mistral 3, MiniMax M2.1 🤯
Nvidia is on a roll lately. Nemotron 3 Nano is my new fav local model, but here's the real flex: they published the entire evaluation setup. Configs, prompts, logs, all of it. This is how you do open models 🔥
We're thrilled to announce that the Qwen3-VL family of vision-language models is now available on Azure AI Foundry, thanks to our collaboration with Microsoft.
We bring open-source innovation to enterprise-grade AI infrastructure, making it easier than ever for enterprises to deploy and scale the latest and greatest models from Hugging Face securely within Azure.
🌟 Highlights:
- Deploy Qwen3-VL instantly via managed endpoints
- Built-in governance, telemetry, and lifecycle management
- True multimodal reasoning: vision, language, and code understanding
- State-of-the-art performance, outperforming closed-source models like Gemini 2.5 Pro and GPT-5
- Available in both *Instruct* and *Thinking* modes, across 24 model sizes
👉 Get started today: search for Qwen3-VL in the Hugging Face Collection on Azure AI Foundry.
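Once an endpoint is deployed, invoking it could look roughly like this; a minimal sketch assuming the deployment exposes an OpenAI-compatible chat completions API (the endpoint URL, key, and deployment name are placeholders, not real values):

```python
# A minimal sketch, assuming the managed endpoint exposes an
# OpenAI-compatible chat completions API; URL, key, and deployment
# name below are placeholders for your own deployment values.
from openai import OpenAI

client = OpenAI(
    base_url="https://<your-endpoint>.inference.ai.azure.com/v1",  # hypothetical URL
    api_key="<your-azure-api-key>",
)

# Qwen3-VL is multimodal, so messages can mix text and image inputs.
response = client.chat.completions.create(
    model="qwen3-vl",  # placeholder deployment name
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is shown in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```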
📝 New blog: Maintain the unmaintainable - 1M+ Python LOC, 400+ models
How do you stop a million-line library built by thousands of contributors from collapsing under its own weight? At 🤗 Transformers, we do it with explicit software-engineering tenets: principles that make the codebase hackable at scale.
🔍 Inside the post:
✅ One Model, One File: readability first. You can still open a modeling file and see the full logic, top to bottom.
✅ Modular Transformers: visible inheritance that cuts maintenance cost by ~15× while keeping models readable.
✅ Config-Driven Performance: FlashAttention, tensor parallelism, and attention scheduling are config-level features, not rewrites (see the sketch after this list).
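To make the config-driven point concrete, here's a minimal sketch; the model id is just an example, and flash_attention_2 assumes the flash-attn package is installed on a supported GPU:

```python
# A minimal sketch of config-driven performance in 🤗 Transformers: the
# attention backend is a constructor argument, not a model rewrite.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",                # example model id
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # swap to "sdpa" or "eager" freely
    device_map="auto",                        # requires the accelerate package
)
# Tensor parallelism is exposed the same way in recent versions,
# e.g. tp_plan="auto" when launching under torchrun.
```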
Written with @lysandre, @pcuenq, and @yonigozlan, this is a deep dive into how Transformers stays fast, open, and maintainable.