Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2603.19835

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published 30 days ago • 338

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published 30 days ago • 338

The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Paper • 2509.26507 • Published Sep 30, 2025 • 550
mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published Dec 31, 2025 • 322
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

Paper • 2601.00393 • Published Jan 1 • 133
LTX-2: Efficient Joint Audio-Visual Foundation Model

Paper • 2601.03233 • Published Jan 6 • 176

Reasoning Language Models

Tina: Tiny Reasoning Models via LoRA

Paper • 2504.15777 • Published Apr 22, 2025 • 57
Resa: Transparent Reasoning Models via SAEs

Paper • 2506.09967 • Published Jun 11, 2025 • 22
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published 30 days ago • 338

build- something

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published 30 days ago • 338
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence

Paper • 2603.28032 • Published 20 days ago • 340
VGGRPO: Towards World-Consistent Video Generation with 4D Latent Reward

Paper • 2603.26599 • Published 23 days ago • 63

Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning

Paper • 2603.04597 • Published Mar 4 • 210
SII-Enigma/Llama3.2-8B-Ins-AMPO

Text Generation • 8B • Updated 29 days ago • 48
Understanding R1-Zero-Like Training: A Critical Perspective

Paper • 2503.20783 • Published Mar 26, 2025 • 59
Planner-R1: Reward Shaping Enables Efficient Agentic RL with Smaller LLMs

Paper • 2509.25779 • Published Sep 30, 2025 • 19

Towards Scalable Pre-training of Visual Tokenizers for Generation

Paper • 2512.13687 • Published Dec 15, 2025 • 106
MMGR: Multi-Modal Generative Reasoning

Paper • 2512.14691 • Published Dec 16, 2025 • 121
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss

Paper • 2512.23447 • Published Dec 29, 2025 • 99
LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation

Paper • 2512.23576 • Published Dec 29, 2025 • 66

R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization

Paper • 2503.10615 • Published Mar 13, 2025 • 17
UniGoal: Towards Universal Zero-shot Goal-oriented Navigation

Paper • 2503.10630 • Published Mar 13, 2025 • 6
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12, 2025 • 39
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL

Paper • 2503.07536 • Published Mar 10, 2025 • 88

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published 30 days ago • 338

build- something

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published 30 days ago • 338
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence

Paper • 2603.28032 • Published 20 days ago • 340
VGGRPO: Towards World-Consistent Video Generation with 4D Latent Reward

Paper • 2603.26599 • Published 23 days ago • 63

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published 30 days ago • 338

Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning

Paper • 2603.04597 • Published Mar 4 • 210
SII-Enigma/Llama3.2-8B-Ins-AMPO

Text Generation • 8B • Updated 29 days ago • 48
Understanding R1-Zero-Like Training: A Critical Perspective

Paper • 2503.20783 • Published Mar 26, 2025 • 59
Planner-R1: Reward Shaping Enables Efficient Agentic RL with Smaller LLMs

Paper • 2509.25779 • Published Sep 30, 2025 • 19

The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Paper • 2509.26507 • Published Sep 30, 2025 • 550
mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published Dec 31, 2025 • 322
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

Paper • 2601.00393 • Published Jan 1 • 133
LTX-2: Efficient Joint Audio-Visual Foundation Model

Paper • 2601.03233 • Published Jan 6 • 176

Towards Scalable Pre-training of Visual Tokenizers for Generation

Paper • 2512.13687 • Published Dec 15, 2025 • 106
MMGR: Multi-Modal Generative Reasoning

Paper • 2512.14691 • Published Dec 16, 2025 • 121
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss

Paper • 2512.23447 • Published Dec 29, 2025 • 99
LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation

Paper • 2512.23576 • Published Dec 29, 2025 • 66

Reasoning Language Models

Tina: Tiny Reasoning Models via LoRA

Paper • 2504.15777 • Published Apr 22, 2025 • 57
Resa: Transparent Reasoning Models via SAEs

Paper • 2506.09967 • Published Jun 11, 2025 • 22
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published 30 days ago • 338

R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization

Paper • 2503.10615 • Published Mar 13, 2025 • 17
UniGoal: Towards Universal Zero-shot Goal-oriented Navigation

Paper • 2503.10630 • Published Mar 13, 2025 • 6
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12, 2025 • 39
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL

Paper • 2503.07536 • Published Mar 10, 2025 • 88

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs