Collections including paper arxiv:2501.08313

Foundational & Modern AI Research (Curated)

A curated selection of foundational and modern AI research papers that meaningfully influence how real-world AI systems are designed, evaluated, and g

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 120
Scaling Laws for Neural Language Models

Paper • 2001.08361 • Published Jan 23, 2020 • 10
Training Compute-Optimal Large Language Models

Paper • 2203.15556 • Published Mar 29, 2022 • 11
Analogy Generation by Prompting Large Language Models: A Case Study of InstructGPT

Paper • 2210.04186 • Published Oct 9, 2022

Open-Source Foundations for Modern AI Systems

Open-source libraries that form the infrastructure layer of modern AI systems, spanning model development, retrieval, orchestration, evaluation, and MLOps.

A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale

Paper • 2309.06497 • Published Sep 12, 2023 • 7
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 628
Llama 2: Open Foundation and Fine-Tuned Chat Models

Paper • 2307.09288 • Published Jul 18, 2023 • 251
Meta Llama 3

Collection

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 950

This collection is a list of papers I find to be very interesting.

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 628
MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14, 2025 • 302
Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24, 2025 • 320
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth

Paper • 2509.03867 • Published Sep 4, 2025 • 213

MiniMax (Large Language Model) - Original and Transformers Compatible Weights

MiniMaxAI/MiniMax-Text-01-hf

Text Generation • Updated Jul 9, 2025 • 19.6k • 11
MiniMaxAI/MiniMax-M1-80k-hf

Text Generation • Updated Jul 9, 2025 • 125 • 8
MiniMaxAI/MiniMax-M1-40k-hf

Text Generation • Updated Jul 11, 2025 • 195 • 12
MiniMaxAI/MiniMax-Text-01

Text Generation • Updated Jul 3, 2025 • 12.5k • 652

MiniMaxAI/MiniMax-Text-01

Text Generation • Updated Jul 3, 2025 • 12.5k • 652
MiniMaxAI/MiniMax-VL-01

Image-Text-to-Text • Updated Jul 3, 2025 • 84.8k • 282
MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14, 2025 • 302
MiniMaxText01

Running • Agents • 121 • Chat with an AI model using text and images

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1, 2024 • 24
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1, 2024 • 85
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 153
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30, 2024 • 25

How to inject knowledge efficiently? Knowledge Infusion Scaling Law for Pre-training Large Language Models

Paper • 2509.19371 • Published Sep 19, 2025
Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free

Paper • 2505.06708 • Published May 10, 2025 • 11
Selective Attention: Enhancing Transformer through Principled Context Control

Paper • 2411.12892 • Published Nov 19, 2024
A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10, 2025 • 193

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14, 2025 • 302
Agent-Ark/Toucan-1.5M

Viewer • Updated Oct 4, 2025 • 1.65M • 3.25k • 207
facebook/natural_reasoning

Viewer • Updated Feb 21, 2025 • 1.15M • 1.46k • 561

Rewnozom/agent-zero-v1-a-01

Text Generation • 4B • Updated Jan 18, 2025 • 15 • 2
TheBloke/MythoMax-L2-13B-GGUF

13B • Updated Sep 27, 2023 • 60.1k • 229
DavidAU/Llama-3.2-8X3B-MOE-Dark-Champion-Instruct-uncensored-abliterated-18.4B-GGUF

Text Generation • 18B • Updated Dec 1, 2025 • 56.5k • 528
QuantFactory/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored-GGUF

Text Generation • 8B • Updated Jul 29, 2024 • 9.08k • 141

To Read collection

interesting papers to read

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published Mar 31, 2025 • 62
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders

Paper • 2503.18878 • Published Mar 24, 2025 • 120
START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published Mar 6, 2025 • 113
DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18, 2025 • 146
