Collections including paper arxiv:2402.17764

- Attention Is All You Need
  Paper • 1706.03762 • Published • 120
- Scaling Laws for Neural Language Models
  Paper • 2001.08361 • Published • 10
- Training Compute-Optimal Large Language Models
  Paper • 2203.15556 • Published • 11
- Analogy Generation by Prompting Large Language Models: A Case Study of InstructGPT
  Paper • 2210.04186 • Published

- peteromallet/dataclaw-peteromallet
  Viewer • Updated • 549 • 646 • 298
- Qwen/Qwen3.5-35B-A3B
  Image-Text-to-Text • 36B • Updated • 3.89M • 1.39k
- Nanbeige/Nanbeige4.1-3B
  Text Generation • 4B • Updated • 272k • 1.09k
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 628

- A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale
  Paper • 2309.06497 • Published • 7
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 628
- Llama 2: Open Foundation and Fine-Tuned Chat Models
  Paper • 2307.09288 • Published • 251

- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 628
- Hierarchical Reasoning Model
  Paper • 2506.21734 • Published • 50
- Less is More: Recursive Reasoning with Tiny Networks
  Paper • 2510.04871 • Published • 513
- Training language models to follow instructions with human feedback
  Paper • 2203.02155 • Published • 24

- deepseek-ai/DeepSeek-V3.2
  Text Generation • 685B • Updated • 10.3M • 1.41k
- Anthropic/AnthropicInterviewer
  Viewer • Updated • 1.25k • 1.49k • 367
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 628
- Qwen/Qwen-Image-Layered
  Image-Text-to-Image • Updated • 22k • 1.05k

- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 628
- MiniMax-01: Scaling Foundation Models with Lightning Attention
  Paper • 2501.08313 • Published • 302
- Group Sequence Policy Optimization
  Paper • 2507.18071 • Published • 320
- Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
  Paper • 2509.03867 • Published • 213