223 448

dfuhoiysOHSVFh82934gfjklb

huba-buba

AI & ML interests

None yet

Recent Activity

upvoted a paper 14 days ago

The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

liked a model 29 days ago

unsloth/Qwen3-Coder-Next-GGUF

liked a model about 1 month ago

Qwen/Qwen3.5-4B

View all activity

Organizations

None yet

upvoted a paper 14 days ago

The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

Paper • 2604.02029 • Published 16 days ago • 140

upvoted 2 papers about 1 month ago

OpenClaw-RL: Train Any Agent Simply by Talking

Paper • 2603.10165 • Published Mar 10 • 151

Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem

Paper • 2512.24873 • Published Dec 31, 2025 • 108

upvoted a paper about 2 months ago

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

Paper • 2602.10693 • Published Feb 11 • 220

upvoted an article about 2 months ago

Article

Transformers v5: Simple model definitions powering the AI ecosystem

Dec 1, 2025

•

309

upvoted 3 papers about 2 months ago

upvoted 2 papers 2 months ago

GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning

Paper • 2602.12099 • Published Feb 12 • 61

Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs

Paper • 2602.10388 • Published Feb 11 • 244

upvoted an article 2 months ago

Article

Forge: Scalable Agent RL Framework and Algorithm

Feb 13

•

147

upvoted 8 papers 2 months ago

Weak-Driven Learning: How Weak Agents make Strong Agents Stronger

Paper • 2602.08222 • Published Feb 9 • 290

AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents

Paper • 2602.06855 • Published Feb 6 • 83

QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining

Paper • 2602.07085 • Published Feb 6 • 190

F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare

Paper • 2602.06717 • Published Feb 6 • 74

WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning

Paper • 2602.04634 • Published Feb 4 • 99

Dr. Kernel: Reinforcement Learning Done Right for Triton Kernel Generations

Paper • 2602.05885 • Published Feb 5 • 28

Reinforcement World Model Learning for LLM-based Agents

Paper • 2602.05842 • Published Feb 5 • 27

No One-Size-Fits-All: Building Systems For Translation to Bashkir, Kazakh, Kyrgyz, Tatar and Chuvash Using Synthetic And Original Data

Paper • 2602.04442 • Published Feb 4 • 3

upvoted an article 2 months ago

Article

🐯 Liger GRPO meets TRL

May 25, 2025

•

dfuhoiysOHSVFh82934gfjklb

AI & ML interests

Recent Activity

Organizations

huba-buba's activity

Transformers v5: Simple model definitions powering the AI ecosystem

Forge: Scalable Agent RL Framework and Algorithm

🐯 Liger GRPO meets TRL