Hiring 💼

3 78 172

Cahlen Humphreys PRO

cahlen

https://bigcompute.science

AI & ML interests

☠️💻

Recent Activity

liked a Space about 18 hours ago

victor/sunset-racing-glm-5.1

liked a model 3 days ago

LilaRest/gemma-4-31B-it-NVFP4-turbo

reacted to anakin87's post with ❤️ 4 days ago

📣 I just published a free course on Reinforcement Learning Environments for Language Models!

📌 COURSE: https://github.com/anakin87/llm-rl-environments-lil-course

Over the past year, we've seen a shift in LLM Post-Training. Previously, Supervised Fine-Tuning was the most important part: making models imitate curated Question-Answer pairs. Now we also have Reinforcement Learning with Verifiable Rewards. With techniques like GRPO, models can learn through trial and error in dynamic environments. They can climb to new heights without relying on expensively prepared data.

But what actually are these environments in practice❓ And how do you build them effectively❓

Fascinated by these concepts, I spent time exploring this space through experiments, post-training Small Language Models. I've packaged everything I learned into this short course.

What you'll learn

🔹 Agents, Environments, and LLMs: how to map Reinforcement Learning concepts to the LLM domain
🔹 How to use Verifiers (open-source library by Prime Intellect) to build RL environments as software artifacts
🔹 Common patterns: How to build single-turn, multi-turn, and tool-use environments
🔹 Hands-on: turn a small language model (LFM2-2.6B by LiquidAI) into a Tic Tac Toe master
  🔸 Build the game Environment
  🔸 Use it to generate synthetic data for SFT warm-up
  🔸 Group-based Reinforcement Learning

If you're interested in building "little worlds" where LLMs can learn, this course is for you.

---

🤗🕹️ Play against the trained model: https://huggingface.co/spaces/anakin87/LFM2-2.6B-mr-tictactoe

📚 HF collection (datasets + models): https://huggingface.co/collections/anakin87/lfm2-26b-mr-tic-tac-toe



liked a Space about 18 hours ago

Car Game

🏃

Play a fast 3D car racing game

liked a model 3 days ago

LilaRest/gemma-4-31B-it-NVFP4-turbo

Text Generation • 33B • Updated 4 days ago • 41.9k • 202

liked a model 4 days ago

dealignai/Gemma-4-31B-JANG_4M-CRACK

Image-Text-to-Text • 6B • Updated 4 days ago • 117k • 1.06k

liked a dataset 4 days ago

Crownelius/Opus-4.6-Reasoning-2100x-formatted

Viewer • Updated 21 days ago • 2.16k • 991 • 53

liked a model 5 days ago

zai-org/GLM-5.1

Text Generation • 754B • Updated 2 days ago • 84.8k • 1.18k

liked a dataset 6 days ago

SWE-bench/SWE-bench_Verified

Benchmark • Updated Feb 27 • 500 • 138k • 30

liked 4 models 8 days ago

liked a dataset 8 days ago

ianncity/KIMI-K2.5-1000000x

Viewer • Updated 7 days ago • 733k • 2.98k • 199

liked 3 models 8 days ago

google/gemma-4-26B-A4B-it

Image-Text-to-Text • 27B • Updated 4 days ago • 2.06M • 649

prism-ml/Bonsai-8B-gguf

Text Generation • 8B • Updated 7 days ago • 78.8k • 584

0xSero/gemma-4-21b-a4b-it-REAP

Text Generation • 21B • Updated 6 days ago • 4.48k • 84

liked a model 9 days ago

netflix/void-model

Video-to-Video • Updated 8 days ago • 802

liked a model 10 days ago

unsloth/gemma-4-26B-A4B-it-GGUF

Image-Text-to-Text • 25B • Updated 3 days ago • 1.92M • 464

liked 2 models 11 days ago

google/gemma-4-31B-it

Image-Text-to-Text • 33B • Updated 4 days ago • 2.64M • 1.87k

google/gemma-4-E4B-it

Any-to-Any • 8B • Updated 4 days ago • 1.5M • 643

liked 2 models 12 days ago

Skywork/Skywork-OR1-Math-7B

8B • Updated May 29, 2025 • 38 • 15

Qwen/Qwen2.5-Math-7B

Text Generation • 8B • Updated Sep 23, 2024 • 138k • 109
