Collections

Discover the best community collections!

Collections including paper arxiv:2408.00118

Papers reimplemented

List of research papers, architectures, and techniques reimplemented in LLM-quest or Hugging Face's TRL. Missing: Qwen3.5, Qwen3-Next, GPT-2

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

Paper • 2602.10693 • Published Feb 11 • 220
Reinforced Attention Learning

Paper • 2602.04884 • Published Feb 4 • 29
Learning to Reason in 13 Parameters

Paper • 2602.04118 • Published Feb 4 • 6
LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters

Paper • 2405.17604 • Published May 27, 2024 • 3

Toolkit - AI Papers

Neural Machine Translation by Jointly Learning to Align and Translate

Paper • 1409.0473 • Published Sep 1, 2014 • 7
Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 120
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 26
Hierarchical Reasoning Model

Paper • 2506.21734 • Published Jun 26, 2025 • 50

Gemma 2: Improving Open Language Models at a Practical Size

Paper • 2408.00118 • Published Jul 31, 2024 • 78

Gemma 2: Improving Open Language Models at a Practical Size

Paper • 2408.00118 • Published Jul 31, 2024 • 78

Writing in the Margins: Better Inference Pattern for Long Context Retrieval

Paper • 2408.14906 • Published Aug 27, 2024 • 144
Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 140
Towards a Unified View of Preference Learning for Large Language Models: A Survey

Paper • 2409.02795 • Published Sep 4, 2024 • 72
Attention Heads of Large Language Models: A Survey

Paper • 2409.03752 • Published Sep 5, 2024 • 92

Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 20
Evaluating Large Language Models Trained on Code

Paper • 2107.03374 • Published Jul 7, 2021 • 10
Training language models to follow instructions with human feedback

Paper • 2203.02155 • Published Mar 4, 2022 • 24
GPT-4 Technical Report

Paper • 2303.08774 • Published Mar 15, 2023 • 7

Gemma 2: Improving Open Language Models at a Practical Size

Paper • 2408.00118 • Published Jul 31, 2024 • 78

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9, 2025 • 265
A Survey on Latent Reasoning

Paper • 2507.06203 • Published Jul 8, 2025 • 94
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 20
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Paper • 1910.10683 • Published Oct 23, 2019 • 18

SLM and MoE structure PhD thesis: SOTA and parameter evaluation

Collection of papers useful for drafting thesis chapters 1, 2, and 3; a change of direction and PhD management are still to be evaluated

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published Jan 1, 2025 • 110
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Paper • 2501.01257 • Published Jan 2, 2025 • 51
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Paper • 2501.01423 • Published Jan 2, 2025 • 44
REDUCIO! Generating 1024×1024 Video within 16 Seconds using Extremely Compressed Motion Latents

Paper • 2411.13552 • Published Nov 20, 2024

SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding

Paper • 2408.15545 • Published Aug 28, 2024 • 38
Controllable Text Generation for Large Language Models: A Survey

Paper • 2408.12599 • Published Aug 22, 2024 • 65
To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20, 2024 • 45
Automated Design of Agentic Systems

Paper • 2408.08435 • Published Aug 15, 2024 • 40
