- Falcon-H1R: Pushing the Reasoning Frontiers with a Hybrid Model for Efficient Test-Time Scaling
  Paper • 2601.02346 • Published • 27
- unsloth/alpaca-cleaned
  Viewer • Updated • 51.8k • 9.83k • 8
- Hierarchical Reasoning Model
  Paper • 2506.21734 • Published • 50
- Dynamic Chunking for End-to-End Hierarchical Sequence Modeling
  Paper • 2507.07955 • Published • 27

Collections including paper arxiv:2507.07955

- Energy-Based Transformers are Scalable Learners and Thinkers
  Paper • 2507.02092 • Published • 69
- MOSPA: Human Motion Generation Driven by Spatial Audio
  Paper • 2507.11949 • Published • 25
- Sound and Complete Neuro-symbolic Reasoning with LLM-Grounded Interpretations
  Paper • 2507.09751 • Published • 2
- Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling
  Paper • 2507.07982 • Published • 34

- Byte Latent Transformer: Patches Scale Better Than Tokens
  Paper • 2412.09871 • Published • 108
- Causal Diffusion Transformers for Generative Modeling
  Paper • 2412.12095 • Published • 23
- Tensor Product Attention Is All You Need
  Paper • 2501.06425 • Published • 90
- TransMLA: Multi-head Latent Attention Is All You Need
  Paper • 2502.07864 • Published • 69

- Hierarchical Reasoning Model
  Paper • 2506.21734 • Published • 50
- Dynamic Chunking for End-to-End Hierarchical Sequence Modeling
  Paper • 2507.07955 • Published • 27
- Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
  Paper • 2505.02567 • Published • 82
- Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference
  Paper • 2508.02193 • Published • 138

- MARS: A Multi-Agent Framework Incorporating Socratic Guidance for Automated Prompt Optimization
  Paper • 2503.16874 • Published • 45
- System Prompt Optimization with Meta-Learning
  Paper • 2505.09666 • Published • 71
- UniRL: Self-Improving Unified Multimodal Models via Supervised and Reinforcement Learning
  Paper • 2505.23380 • Published • 22
- DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning
  Paper • 2505.23754 • Published • 15