Collections
Discover the best community collections!

Collections including paper arxiv:1810.04805

Semlan112/Hug
Updated
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 26
- MiniMaxAI/MiniMax-M2.5
  Text Generation • 229B • Updated • 919k • 1.46k
- Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled
  Image-Text-to-Text • 28B • Updated • 573k • 2.73k

- Language Models are Few-Shot Learners
  Paper • 2005.14165 • Published • 20
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 26
- Attention Is All You Need
  Paper • 1706.03762 • Published • 120
- Lookahead Anchoring: Preserving Character Identity in Audio-Driven Human Animation
  Paper • 2510.23581 • Published • 42

- Neural Machine Translation by Jointly Learning to Align and Translate
  Paper • 1409.0473 • Published • 7
- Attention Is All You Need
  Paper • 1706.03762 • Published • 120
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 26
- Hierarchical Reasoning Model
  Paper • 2506.21734 • Published • 50

- DeBERTa: Decoding-enhanced BERT with Disentangled Attention
  Paper • 2006.03654 • Published • 3
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 26
- RoBERTa: A Robustly Optimized BERT Pretraining Approach
  Paper • 1907.11692 • Published • 10
- Language Models are Few-Shot Learners
  Paper • 2005.14165 • Published • 20

- High-Resolution Image Synthesis with Latent Diffusion Models
  Paper • 2112.10752 • Published • 17
- Adding Conditional Control to Text-to-Image Diffusion Models
  Paper • 2302.05543 • Published • 58
- Proximal Policy Optimization Algorithms
  Paper • 1707.06347 • Published • 11
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 64