Open to Work

6 8

D B PRO

d-s-b

AI & ML interests

Exploring

Recent Activity

upvoted an article 9 days ago

Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective

liked a model 15 days ago

Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled

liked a Space about 1 month ago

HuggingFaceFW/finephrase

Organizations

upvoted an article 9 days ago

Article

Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective

Jan 27

liked a model 15 days ago

Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled

Image-Text-to-Text • 28B • Updated 9 days ago • 589k • 2.64k

liked a Space about 1 month ago

The Synthetic Data Playbook: Generating Trillions of the Finest Tokens

📝

219

Explore synthetic data experiments on a virtual bookshelf

upvoted an article 2 months ago

Article

Optimization story: Bloom inference

Oct 12, 2022

liked a model 2 months ago

mistralai/Voxtral-Mini-4B-Realtime-2602

Automatic Speech Recognition • 4B • Updated Mar 11 • 864k • 817

upvoted 4 articles 5 months ago

Article

KV Cache from scratch in nanoVLM

Jun 4, 2025 • 115

Article

Mastering Tensor Dimensions in Transformers

Jan 12, 2025 • 160

Article

KV Caching Explained: Optimizing Transformer Inference Efficiency

Jan 30, 2025 • 293

Article

Continuous batching from first principles

Nov 25, 2025 • 357

updated a model 5 months ago

d-s-b/Qwen-3-0.6-medical

Updated Nov 25, 2025

published a model 5 months ago

d-s-b/Qwen-3-0.6-medical

Updated Nov 25, 2025

liked 3 Spaces 5 months ago

FineWeb: decanting the web for the finest text data at scale

🍷

1.33k

Read a detailed overview of the FineWeb web‑scale text dataset

The Ultra-Scale Playbook

🌌

3.78k

The ultimate guide to training LLM on large GPU Clusters

The Smol Training Playbook

📚

3.1k

The secrets to building world-class LLMs

updated a model 6 months ago

d-s-b/gemma-270m-gsm8k

Text Generation • 0.3B • Updated Oct 30, 2025 • 2

published a model 6 months ago

d-s-b/gemma-270m-gsm8k

Text Generation • 0.3B • Updated Oct 30, 2025 • 2

updated a model 8 months ago

d-s-b/meme

Updated Aug 30, 2025

liked a model 8 months ago

Qwen/Qwen-Image-Edit

Image-to-Image • Updated Aug 25, 2025 • 78.8k • 2.37k

published a model 8 months ago

d-s-b/meme

Updated Aug 30, 2025

updated a dataset 8 months ago

d-s-b/MemeDataset

Viewer • Updated Aug 30, 2025 • 300 • 5
