Zhihui Xie

https://zhxie.site/

zhxieml

AI & ML interests

None yet

Recent Activity

upvoted a paper 6 days ago

Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents

Paper • 2604.06132 • Published 7 days ago • 113

liked a dataset 13 days ago

claw-eval/Claw-Eval

Viewer • Updated 1 day ago • 300 • 1.28k • 11

upvoted a paper 25 days ago

HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning

Paper • 2603.17024 • Published 28 days ago • 109

liked a model about 2 months ago

Qwen/Qwen3.5-35B-A3B-Base

Image-Text-to-Text • 36B • Updated Mar 2 • 64.9k • 125

liked a dataset about 2 months ago

InternScience/SGI-Reasoning

Viewer • Updated Dec 30, 2025 • 291 • 461 • 6

upvoted a collection about 2 months ago

SGI-Bench

Collection

Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows • 12 items • Updated 20 days ago • 33

liked a dataset 3 months ago

ellisbrown/SIMS-VSI

Viewer • Updated Nov 7, 2025 • 242k • 175 • 7

liked a model 4 months ago

EssentialAI/rnj-1-instruct

Text Generation • 8B • Updated Dec 24, 2025 • 7.17k • 316

liked a Space 5 months ago

CUA - Computer Use Agent 2.0

🤖 • 139 • Generate captions for images

liked a dataset 5 months ago

rl-research/dr-tulu-rl-data

Viewer • Updated Nov 25, 2025 • 4.88k • 1.28k • 12

liked a model 5 months ago

meituan-longcat/LongCat-Flash-Chat

Text Generation • 562B • Updated Sep 24, 2025 • 44k • 530

upvoted a paper 5 months ago

OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows

Paper • 2510.24411 • Published Oct 28, 2025 • 72

liked a dataset 5 months ago

zjunlp/DataMind-Data

Preview • Updated Oct 11, 2025 • 184 • 2

upvoted a paper 5 months ago

The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution

Paper • 2510.25726 • Published Oct 29, 2025 • 46

liked a dataset 6 months ago

neulab/agent-data-collection

Preview • Updated Mar 9 • 3.84k • 111

liked a model 6 months ago

MiniMaxAI/MiniMax-M2

Text Generation • 229B • Updated Dec 23, 2025 • 58.4k • 1.49k

upvoted 3 papers 6 months ago

The Alignment Waltz: Jointly Training Agents to Collaborate for Safety

Paper • 2510.08240 • Published Oct 9, 2025 • 41

Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense

Paper • 2510.07242 • Published Oct 8, 2025 • 30

RESTRAIN: From Spurious Votes to Signals -- Self-Driven RL with Self-Penalization

Paper • 2510.02172 • Published Oct 2, 2025 • 7

liked a dataset 7 months ago

Dream-org/Dream-Coder-RL-17k

Viewer • Updated Aug 6, 2025 • 17k • 31 • 5
