Qihan Ren

jasonrqh

https://nebularaid2000.github.io/

AI & ML interests

explainable AI, LLM

Recent Activity

upvoted a paper 2 days ago

Seedance 2.0: Advancing Video Generation for World Complexity

new activity 3 days ago

Jackrong/Gemopus-4-31B-it:Awesome work

liked a model 3 days ago

Jackrong/Gemopus-4-31B-it-GGUF

upvoted a paper 2 days ago

Seedance 2.0: Advancing Video Generation for World Complexity

Paper • 2604.14148 • Published 3 days ago • 135

upvoted 2 papers 3 days ago

Turing Test on Screen: A Benchmark for Mobile GUI Agent Humanization

Paper • 2604.09574 • Published Feb 24 • 30

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published 4 days ago • 77

upvoted an article 7 days ago

Red Teaming with RL: Exploiting Tinker API for Harmful RL on 235B Model

Article • Jan 1

upvoted 3 papers 7 days ago

Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability

Paper • 2604.06628 • Published 10 days ago • 316

Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?

Paper • 2603.24472 • Published 24 days ago • 53

On the Role of Reasoning Patterns in the Generalization Discrepancy of Long Chain-of-Thought Supervised Fine-Tuning

Paper • 2604.01702 • Published 14 days ago • 3

upvoted a paper 8 days ago

SIM1: Physics-Aligned Simulator as Zero-Shot Data Scaler in Deformable Worlds

Paper • 2604.08544 • Published 9 days ago • 16

upvoted a paper 9 days ago

ATBench: A Diverse and Realistic Trajectory Benchmark for Long-Horizon Agent Safety

Paper • 2604.02022 • Published 16 days ago • 15

upvoted a paper 10 days ago

Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents

Paper • 2604.06132 • Published 11 days ago • 114

upvoted a collection 11 days ago

Rethink_SFT_generalization

Collection

Repo for paper Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability. • 40 items • Updated 7 days ago • 16

upvoted a paper 17 days ago

Project Imaging-X: A Survey of 1000+ Open-Access Medical Imaging Datasets for Foundation Model Development

Paper • 2603.27460 • Published 20 days ago • 68

upvoted 3 papers about 1 month ago

Attention Residuals

Paper • 2603.15031 • Published Mar 16 • 180

OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data

Paper • 2603.15594 • Published Mar 16 • 149

Code2Math: Can Your Code Agent Effectively Evolve Math Problems Through Exploration?

Paper • 2603.03202 • Published Mar 3 • 17

upvoted 2 papers about 2 months ago

Data Repetition Beats Data Scaling in Long-CoT Supervised Fine-Tuning

Paper • 2602.11149 • Published Feb 11 • 17

Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report v1.5

Paper • 2602.14457 • Published Feb 16 • 29

upvoted an article 2 months ago

Forge: Scalable Agent RL Framework and Algorithm

Article • Feb 13 • 147

upvoted a paper 2 months ago

Dr. Kernel: Reinforcement Learning Done Right for Triton Kernel Generations

Paper • 2602.05885 • Published Feb 5 • 28

upvoted a paper 3 months ago

ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation

Paper • 2601.21420 • Published Jan 29 • 42