Collections

Discover the best community collections!

Collections including paper arxiv:2503.19786

Papers reimplemented

List of research papers, architectures, and techniques reimplemented in LLM-quest or Hugging Face's TRL. Missing: Qwen3.5, Qwen3-Next, GPT-2

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

Paper • 2602.10693 • Published Feb 11 • 220
Reinforced Attention Learning

Paper • 2602.04884 • Published Feb 4 • 29
Learning to Reason in 13 Parameters

Paper • 2602.04118 • Published Feb 4 • 6
LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters

Paper • 2405.17604 • Published May 27, 2024 • 3

Papers I Have Read

Gemma 3 Technical Report

Paper • 2503.19786 • Published Mar 25, 2025 • 55
WithAnyone: Towards Controllable and ID Consistent Image Generation

Paper • 2510.14975 • Published Oct 16, 2025 • 86

Gemma 3 Technical Report

Paper • 2503.19786 • Published Mar 25, 2025 • 55

Gemma 3 Technical Report

Paper • 2503.19786 • Published Mar 25, 2025 • 55
Skin-R1: Toward Trustworthy Clinical Reasoning for Dermatological Diagnosis

Paper • 2511.14900 • Published Nov 18, 2025

Gemma 3 Technical Report

Paper • 2503.19786 • Published Mar 25, 2025 • 55
Kimi-VL Technical Report

Paper • 2504.07491 • Published Apr 10, 2025 • 139
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published Apr 14, 2025 • 308
FUSION: Fully Integration of Vision-Language Representations for Deep Cross-Modal Understanding

Paper • 2504.09925 • Published Apr 14, 2025 • 39

Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 20
Evaluating Large Language Models Trained on Code

Paper • 2107.03374 • Published Jul 7, 2021 • 10
Training language models to follow instructions with human feedback

Paper • 2203.02155 • Published Mar 4, 2022 • 24
GPT-4 Technical Report

Paper • 2303.08774 • Published Mar 15, 2023 • 7

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Paper • 2508.06471 • Published Aug 8, 2025 • 211
NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model

Paper • 2508.14444 • Published Aug 20, 2025 • 47
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

Paper • 2507.06261 • Published Jul 7, 2025 • 67
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

Paper • 2506.13585 • Published Jun 16, 2025 • 274

Gemma 3 Technical Report

Paper • 2503.19786 • Published Mar 25, 2025 • 55
MiniMax M1 💬

Space • Running • Agents • Featured • 353
Generate web page code from your description

Gemma 3 Technical Report

Paper • 2503.19786 • Published Mar 25, 2025 • 55

Reinforcement Learning: An Overview

Paper • 2412.05265 • Published Dec 6, 2024 • 8
Fish-Speech: Leveraging Large Language Models for Advanced Multilingual Text-to-Speech Synthesis

Paper • 2411.01156 • Published Nov 2, 2024 • 13
VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness

Paper • 2503.21755 • Published Mar 27, 2025 • 33
Qwen2.5-Omni Technical Report

Paper • 2503.20215 • Published Mar 26, 2025 • 172
