Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2010.11929

OpenClaw-RL: Train Any Agent Simply by Talking

Paper • 2603.10165 • Published Mar 10 • 151
Neural Thickets: Diverse Task Experts Are Dense Around Pretrained Weights

Paper • 2603.12228 • Published Mar 12 • 12
Efficient Memory Management for Large Language Model Serving with PagedAttention

Paper • 2309.06180 • Published Sep 12, 2023 • 53
1-bit AI Infra: Part 1.1, Fast and Lossless BitNet b1.58 Inference on CPUs

Paper • 2410.16144 • Published Oct 21, 2024 • 5

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Paper • 2010.11929 • Published Oct 22, 2020 • 15

Papers Referred by VLA Survey Referred Papers

Transporter Networks: Rearranging the Visual World for Robotic Manipulation

Paper • 2010.14406 • Published Oct 27, 2020
Learning Transferable Visual Models From Natural Language Supervision

Paper • 2103.00020 • Published Feb 26, 2021 • 21
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Paper • 2010.11929 • Published Oct 22, 2020 • 15

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Paper • 2010.11929 • Published Oct 22, 2020 • 15

Transformer-based Models for Computer Vision

MIO: A Foundation Model on Multimodal Tokens

Paper • 2409.17692 • Published Sep 26, 2024 • 53
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Paper • 2010.11929 • Published Oct 22, 2020 • 15
Going deeper with Image Transformers

Paper • 2103.17239 • Published Mar 31, 2021
Training data-efficient image transformers & distillation through attention

Paper • 2012.12877 • Published Dec 23, 2020 • 2

Papers reimplemented

List of research papers, architectures, and techniques reimplemented in LLM-quest or Hugging Face's TRL. Missing: Qwen3.5, Qwen3-Next, GPT-2

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

Paper • 2602.10693 • Published Feb 11 • 220
Reinforced Attention Learning

Paper • 2602.04884 • Published Feb 4 • 29
Learning to Reason in 13 Parameters

Paper • 2602.04118 • Published Feb 4 • 6
LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters

Paper • 2405.17604 • Published May 27, 2024 • 3

Toolkit - AI Papers

Neural Machine Translation by Jointly Learning to Align and Translate

Paper • 1409.0473 • Published Sep 1, 2014 • 7
Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 120
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 26
Hierarchical Reasoning Model

Paper • 2506.21734 • Published Jun 26, 2025 • 50

MemOS: A Memory OS for AI System

Paper • 2507.03724 • Published Jul 4, 2025 • 166
How to Train Your LLM Web Agent: A Statistical Diagnosis

Paper • 2507.04103 • Published Jul 5, 2025 • 52
NeuralOS: Towards Simulating Operating Systems via Neural Generative Models

Paper • 2507.08800 • Published Jul 11, 2025 • 81
Running

3.21k

AnyCoder

🏆

3.21k

Generate code snippets with AI

Image processing

Multispectral Vineyard Segmentation: A Deep Learning approach

Paper • 2108.01200 • Published Aug 2, 2021
PLLaMa: An Open-source Large Language Model for Plant Science

Paper • 2401.01600 • Published Jan 3, 2024 • 3
yusuf802/Leaf-Disease-Predictor

Image Classification • Updated Nov 7, 2023 • 1
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Paper • 2010.11929 • Published Oct 22, 2020 • 15

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Paper • 2010.11929 • Published Oct 22, 2020 • 15

OpenClaw-RL: Train Any Agent Simply by Talking

Paper • 2603.10165 • Published Mar 10 • 151
Neural Thickets: Diverse Task Experts Are Dense Around Pretrained Weights

Paper • 2603.12228 • Published Mar 12 • 12
Efficient Memory Management for Large Language Model Serving with PagedAttention

Paper • 2309.06180 • Published Sep 12, 2023 • 53
1-bit AI Infra: Part 1.1, Fast and Lossless BitNet b1.58 Inference on CPUs

Paper • 2410.16144 • Published Oct 21, 2024 • 5

Papers reimplemented

List of research papers, architectures, and techniques reimplemented in LLM-quest or Hugging Face's TRL. Missing: Qwen3.5, Qwen3-Next, GPT-2

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

Paper • 2602.10693 • Published Feb 11 • 220
Reinforced Attention Learning

Paper • 2602.04884 • Published Feb 4 • 29
Learning to Reason in 13 Parameters

Paper • 2602.04118 • Published Feb 4 • 6
LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters

Paper • 2405.17604 • Published May 27, 2024 • 3

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Paper • 2010.11929 • Published Oct 22, 2020 • 15

Toolkit - AI Papers

Neural Machine Translation by Jointly Learning to Align and Translate

Paper • 1409.0473 • Published Sep 1, 2014 • 7
Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 120
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 26
Hierarchical Reasoning Model

Paper • 2506.21734 • Published Jun 26, 2025 • 50

Papers Referred by VLA Survey Referred Papers

Transporter Networks: Rearranging the Visual World for Robotic Manipulation

Paper • 2010.14406 • Published Oct 27, 2020
Learning Transferable Visual Models From Natural Language Supervision

Paper • 2103.00020 • Published Feb 26, 2021 • 21
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Paper • 2010.11929 • Published Oct 22, 2020 • 15

MemOS: A Memory OS for AI System

Paper • 2507.03724 • Published Jul 4, 2025 • 166
How to Train Your LLM Web Agent: A Statistical Diagnosis

Paper • 2507.04103 • Published Jul 5, 2025 • 52
NeuralOS: Towards Simulating Operating Systems via Neural Generative Models

Paper • 2507.08800 • Published Jul 11, 2025 • 81
Running

3.21k

AnyCoder

🏆

3.21k

Generate code snippets with AI

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Paper • 2010.11929 • Published Oct 22, 2020 • 15

Image processing

Multispectral Vineyard Segmentation: A Deep Learning approach

Paper • 2108.01200 • Published Aug 2, 2021
PLLaMa: An Open-source Large Language Model for Plant Science

Paper • 2401.01600 • Published Jan 3, 2024 • 3
yusuf802/Leaf-Disease-Predictor

Image Classification • Updated Nov 7, 2023 • 1
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Paper • 2010.11929 • Published Oct 22, 2020 • 15

Transformer-based Models for Computer Vision

MIO: A Foundation Model on Multimodal Tokens

Paper • 2409.17692 • Published Sep 26, 2024 • 53
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Paper • 2010.11929 • Published Oct 22, 2020 • 15
Going deeper with Image Transformers

Paper • 2103.17239 • Published Mar 31, 2021
Training data-efficient image transformers & distillation through attention

Paper • 2012.12877 • Published Dec 23, 2020 • 2

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Paper • 2010.11929 • Published Oct 22, 2020 • 15

Previous
1
2
Next

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs