xy.r

ShawnRu

R10836

AI & ML interests

LLMs, Agents, any AI

Recent Activity

upvoted 2 papers 6 days ago

SkillX: Automatically Constructing Skill Knowledge Bases for Agents

Paper • 2604.04804 • Published 8 days ago • 31

LightThinker++: From Reasoning Compression to Memory Management

Paper • 2604.03679 • Published 10 days ago • 33

liked a dataset about 1 month ago

futurehouse/lab-bench

Viewer • Updated Sep 27, 2025 • 1.97k • 9.21k • 46

upvoted a paper about 1 month ago

SkillNet: Create, Evaluate, and Connect AI Skills

Paper • 2603.04448 • Published Feb 26 • 93

liked a dataset about 1 month ago

tencent/CL-bench

Viewer • Updated Feb 6 • 1.9k • 1.28k • 141

upvoted a paper about 1 month ago

How Controllable Are Large Language Models? A Unified Evaluation across Behavioral Granularities

Paper • 2603.02578 • Published Mar 3 • 25

upvoted 3 papers 3 months ago

Aligning Agentic World Models via Knowledgeable Experience Learning

Paper • 2601.13247 • Published Jan 19 • 15

Illusions of Confidence? Diagnosing LLM Truthfulness via Neighborhood Consistency

Paper • 2601.05905 • Published Jan 9 • 20

Can We Predict Before Executing Machine Learning Agents?

Paper • 2601.05930 • Published Jan 9 • 28

liked a model 3 months ago

deepseek-ai/DeepSeek-V3.2

Text Generation • 685B • Updated Dec 1, 2025 • 6.41M • 1.39k

liked a dataset 4 months ago

cais/hle

Benchmark • Updated Jan 20 • 2.5k • 46.3k • 768

upvoted a paper 4 months ago

InnoGym: Benchmarking the Innovation Potential of AI Agents

Paper • 2512.01822 • Published Dec 1, 2025 • 36

liked 2 datasets 5 months ago

math-ai/aime25

Viewer • Updated Jan 19 • 30 • 49.9k • 33

HuggingFaceH4/MATH-500

Viewer • Updated Dec 15, 2025 • 500 • 127k • 293

liked a model 6 months ago

Skywork/Skywork-Reward-Llama-3.1-8B-v0.2

Text Classification • 8B • Updated Oct 25, 2024 • 86.7k • 42

upvoted 2 papers 6 months ago

LightMem: Lightweight and Efficient Memory-Augmented Generation

Paper • 2510.18866 • Published Oct 21, 2025 • 115

Executable Knowledge Graphs for Replicating AI Research

Paper • 2510.17795 • Published Oct 20, 2025 • 15

updated a dataset 6 months ago

zjunlp/OceanGym

Updated Oct 13, 2025 • 624 • 2

upvoted an article 6 months ago

🛠 ML-Agents Tips & Lessons Learned (AutoMind + MLE-Bench)

Article • Oct 9, 2025

liked a dataset 6 months ago

google/IFEval

Viewer • Updated Aug 14, 2024 • 541 • 87.3k • 145
