Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:1910.01108

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Paper • 1910.01108 • Published Oct 2, 2019 • 22

article collection

EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning

Paper • 2509.22576 • Published Sep 26, 2025 • 137
AgentBench: Evaluating LLMs as Agents

Paper • 2308.03688 • Published Aug 7, 2023 • 26
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Paper • 1910.01108 • Published Oct 2, 2019 • 22
Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 64

DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Paper • 2504.07128 • Published Apr 2, 2025 • 87
Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 108
BitNet b1.58 2B4T Technical Report

Paper • 2504.12285 • Published Apr 16, 2025 • 85
FAST: Efficient Action Tokenization for Vision-Language-Action Models

Paper • 2501.09747 • Published Jan 16, 2025 • 29

on device modules low resorces

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Paper • 1910.01108 • Published Oct 2, 2019 • 22

Running

3.21k

AnyCoder

🏆

3.21k

Generate code snippets with AI
Running

Agents

Featured

272

Qwen2.5 Coder Artifacts

🐢

272

Generate and preview web app code from a text description
Running

Agents

Featured

921

QwQ-32B-Preview

🔍

921

QwQ-32B-Preview
Running on CPU Upgrade

14k

Open LLM Leaderboard

🏆

14k

Track, rank and evaluate open LLMs and chatbots

Toolkit - AI Papers

Neural Machine Translation by Jointly Learning to Align and Translate

Paper • 1409.0473 • Published Sep 1, 2014 • 7
Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 120
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 26
Hierarchical Reasoning Model

Paper • 2506.21734 • Published Jun 26, 2025 • 50

Sentiment/Emotion Analysis

MilaNLProc/feel-it-italian-sentiment

Text Classification • Updated Aug 15, 2022 • 9.24k • • 21
cardiffnlp/twitter-roberta-base-sentiment-latest

Text Classification • Updated Aug 4, 2025 • 3.45M • • 786
bhadresh-savani/distilbert-base-uncased-emotion

Text Classification • 67M • Updated Aug 14, 2024 • 732k • • 160
FacebookAI/xlm-roberta-large

Fill-Mask • 0.6B • Updated Feb 19, 2024 • 6.41M • • 506

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Paper • 1910.01108 • Published Oct 2, 2019 • 22

A collection of arXiv papers from Chip Huyen's AI Engineering organized by chapter and ordered by when each appears in the book.

Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning

Paper • 2211.04325 • Published Oct 26, 2022 • 1
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 26
On the Opportunities and Risks of Foundation Models

Paper • 2108.07258 • Published Aug 16, 2021 • 2
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks

Paper • 2204.07705 • Published Apr 16, 2022 • 2

chat-models-candidates

nikravan/glm-4vq

Document Question Answering • 14B • Updated Jun 16, 2024 • 35 • 36
deepseek-ai/deepseek-coder-33b-instruct

Text Generation • 33B • Updated Mar 7, 2024 • 4.2k • 566
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

Paper • 2401.14196 • Published Jan 25, 2024 • 72
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Paper • 1910.01108 • Published Oct 2, 2019 • 22

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Paper • 1910.01108 • Published Oct 2, 2019 • 22

Toolkit - AI Papers

Neural Machine Translation by Jointly Learning to Align and Translate

Paper • 1409.0473 • Published Sep 1, 2014 • 7
Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 120
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 26
Hierarchical Reasoning Model

Paper • 2506.21734 • Published Jun 26, 2025 • 50

article collection

EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning

Paper • 2509.22576 • Published Sep 26, 2025 • 137
AgentBench: Evaluating LLMs as Agents

Paper • 2308.03688 • Published Aug 7, 2023 • 26
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Paper • 1910.01108 • Published Oct 2, 2019 • 22
Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 64

Sentiment/Emotion Analysis

MilaNLProc/feel-it-italian-sentiment

Text Classification • Updated Aug 15, 2022 • 9.24k • • 21
cardiffnlp/twitter-roberta-base-sentiment-latest

Text Classification • Updated Aug 4, 2025 • 3.45M • • 786
bhadresh-savani/distilbert-base-uncased-emotion

Text Classification • 67M • Updated Aug 14, 2024 • 732k • • 160
FacebookAI/xlm-roberta-large

Fill-Mask • 0.6B • Updated Feb 19, 2024 • 6.41M • • 506

DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Paper • 2504.07128 • Published Apr 2, 2025 • 87
Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 108
BitNet b1.58 2B4T Technical Report

Paper • 2504.12285 • Published Apr 16, 2025 • 85
FAST: Efficient Action Tokenization for Vision-Language-Action Models

Paper • 2501.09747 • Published Jan 16, 2025 • 29

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Paper • 1910.01108 • Published Oct 2, 2019 • 22

on device modules low resorces

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Paper • 1910.01108 • Published Oct 2, 2019 • 22

A collection of arXiv papers from Chip Huyen's AI Engineering organized by chapter and ordered by when each appears in the book.

Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning

Paper • 2211.04325 • Published Oct 26, 2022 • 1
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 26
On the Opportunities and Risks of Foundation Models

Paper • 2108.07258 • Published Aug 16, 2021 • 2
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks

Paper • 2204.07705 • Published Apr 16, 2022 • 2

Running

3.21k

AnyCoder

🏆

3.21k

Generate code snippets with AI
Running

Agents

Featured

272

Qwen2.5 Coder Artifacts

🐢

272

Generate and preview web app code from a text description
Running

Agents

Featured

921

QwQ-32B-Preview

🔍

921

QwQ-32B-Preview
Running on CPU Upgrade

14k

Open LLM Leaderboard

🏆

14k

Track, rank and evaluate open LLMs and chatbots

chat-models-candidates

nikravan/glm-4vq

Document Question Answering • 14B • Updated Jun 16, 2024 • 35 • 36
deepseek-ai/deepseek-coder-33b-instruct

Text Generation • 33B • Updated Mar 7, 2024 • 4.2k • 566
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

Paper • 2401.14196 • Published Jan 25, 2024 • 72
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Paper • 1910.01108 • Published Oct 2, 2019 • 22

Previous
1
2
Next

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs