Jadon

jadodev

phase

AI & ML interests

Machine Learning, Programming Language Theory, Category Theory, Quantum Computing

Organizations

None yet

Recent Activity

liked a model 12 days ago

nvidia/Gemma-4-31B-IT-NVFP4

Text Generation • 21B • Updated 1 day ago • 828k • 381

liked a model 14 days ago

tencent/Sequential-Hidden-Decoding-8B-n8-Instruct

Text Generation • 13B • Updated 14 days ago • 41 • 7

upvoted a paper 16 days ago

Virtual Width Networks

Paper • 2511.11238 • Published Nov 14, 2025 • 39

liked a model 16 days ago

ByteDance/Ouro-1.4B

Text Generation • Updated Jan 18 • 25.6k • 83

liked a Space 16 days ago

The Smol Training Playbook

📚

3.1k

The secrets to building world-class LLMs

liked a model 17 days ago

HuggingFaceTB/FineMath-Llama-3B

3B • Updated Nov 27, 2025 • 85 • 22

liked a dataset 17 days ago

HuggingFaceTB/finemath

Viewer • Updated Feb 6, 2025 • 48.3M • 15.1k • 358

liked 2 datasets 18 days ago

tiiuae/falcon-refinedweb

Viewer • Updated Jun 20, 2023 • 968M • 41.9k • 904

allenai/c4

Viewer • Updated Jan 9, 2024 • 10.4B • 627k • 546

upvoted a paper 5 months ago

Nemotron-CC-Math: A 133 Billion-Token-Scale High Quality Math Pretraining Dataset

Paper • 2508.15096 • Published Aug 20, 2025 • 8

liked a model about 1 year ago

deepseek-ai/DeepSeek-V3-0324

Text Generation • 685B • Updated Mar 27, 2025 • 556k • 3.1k

updated a collection about 2 years ago

transformer

Collection

2 items • Updated Apr 7, 2024

upvoted a paper about 2 years ago

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Paper • 2404.02258 • Published Apr 2, 2024 • 107

liked 2 models about 2 years ago

mlabonne/phixtral-4x2_8

Text Generation • Updated Jan 15, 2024 • 72 • 209

NousResearch/Nous-Hermes-2-Mistral-7B-DPO

Text Generation • 7B • Updated Apr 30, 2024 • 2.19k • 217

upvoted a paper about 2 years ago

Teaching Large Language Models to Reason with Reinforcement Learning

Paper • 2403.04642 • Published Mar 7, 2024 • 48

upvoted a paper about 2 years ago

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Paper • 2403.03507 • Published Mar 6, 2024 • 190

liked a model about 2 years ago

HuggingFaceH4/zephyr-7b-alpha

Text Generation • 7B • Updated Oct 16, 2024 • 5.46k • 1.12k
