PaliGemma 2: A Family of Versatile VLMs for Transfer Paper • 2412.03555 • Published Dec 4, 2024 • 135
Gemma 4 Collection Gemma 4 is Google's new model family including including E2B, E4B, 26B-A4B, and 31B. • 28 items • Updated 2 days ago • 127
Qwen3.5 Collection Qwen3.5 is Qwen's new model family including Qwen3.5 Small: 0.8B, 2B, 4B, 9B and Qwen3.5 Medium: 35B-A3B, 27B, 122B-A10B and 397B-A17B. • 25 items • Updated 10 days ago • 144
ColBERT-Zero 🐶 Collection First large-scale fully pre-trained ColBERT model using only public data, outperforming GTE-ModernColBERT and GTE-ModernBERT • 10 items • Updated 7 days ago • 20
Tiny-A2D Collection Small diffusion language models adapted from AR models • 4 items • Updated Dec 6, 2025 • 18
DFlash Collection Block Diffusion for Flash Speculative Decoding • 13 items • Updated 9 days ago • 55
view article Article Aligning to What? Rethinking Agent Generalization in MiniMax M2 Oct 30, 2025 • 43
Ministral 3 Collection A collection of edge models, with Base, Instruct and Reasoning variants, in 3 different sizes: 3B, 8B and 14B. All with vision capabilities. • 9 items • Updated Dec 2, 2025 • 164
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence Paper • 2511.07384 • Published Nov 10, 2025 • 19