Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published Nov 27, 2025 • 245
Towards Scalable Pre-training of Visual Tokenizers for Generation Paper • 2512.13687 • Published Dec 15, 2025 • 106
LLaDA2.0: Scaling Up Diffusion Language Models to 100B Paper • 2512.15745 • Published Dec 10, 2025 • 88
DEER: Draft with Diffusion, Verify with Autoregressive Models Paper • 2512.15176 • Published Dec 17, 2025 • 45
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2 Paper • 2408.05147 • Published Aug 9, 2024 • 41
A Survey of Context Engineering for Large Language Models Paper • 2507.13334 • Published Jul 17, 2025 • 263
Aryabhata: An exam-focused language model for JEE Math Paper • 2508.08665 • Published Aug 12, 2025 • 17
Don't Overthink It: A Survey of Efficient R1-style Large Reasoning Models Paper • 2508.02120 • Published Aug 4, 2025 • 20
Train Long, Think Short: Curriculum Learning for Efficient Reasoning Paper • 2508.08940 • Published Aug 12, 2025 • 27
Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts Paper • 2508.07785 • Published Aug 11, 2025 • 29
Llama-3.1-FoundationAI-SecurityLLM-8B-Instruct Technical Report Paper • 2508.01059 • Published Aug 1, 2025 • 34
Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference Paper • 2508.02193 • Published Aug 4, 2025 • 138
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification Paper • 2508.05629 • Published Aug 7, 2025 • 191