Article Welcome Gemma 4: Frontier multimodal intelligence on device • 14 days ago • 850
MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training Paper • 2311.17049 • Published Nov 28, 2023 • 6
WebWorld: A Large-Scale World Model for Web Agent Training Paper • 2602.14721 • Published Feb 16 • 11
From Blind Spots to Gains: Diagnostic-Driven Iterative Training for Large Multimodal Models Paper • 2602.22859 • Published Feb 26 • 151
Article From Golden Gate Bridge to Broken JSON: Why Anthropic's SAE Steering Fails for Structured Output • Feb 7 • 22
Article Introducing AnyLanguageModel: One API for Local and Remote LLMs on Apple Platforms • Nov 20, 2025 • 42
Article Scaling OpenEnv: From Free Usage to Thousands of Concurrent Environments • Jan 20 • 12
Baichuan-M3 Collection Modeling Clinical Inquiry for Reliable Medical Decision-Making • 6 items • Updated Mar 2 • 17
Vision Language Models: 2025 Update Collection This collection includes all the models, datasets and Spaces mentioned in the blog Vision Language Models: 2025 Update • 64 items • Updated Mar 2 • 6
📝 Research & Long-Form Blog Posts Collection In-depth technical articles and research pieces published by Hugging Face • 12 items • Updated 2 days ago • 21