Xiangyi Li PRO

xdotli

https://www.xiangyi.li

Recent Activity

upvoted a paper 4 days ago

Graph of Skills: Dependency-Aware Structural Retrieval for Massive Agent Skills

Paper • 2604.05333 • Published 7 days ago • 19

upvoted a paper 5 days ago

ClawsBench: Evaluating Capability and Safety of LLM Productivity Agents in Simulated Workspaces

Paper • 2604.05172 • Published 8 days ago • 22

authored a paper 5 days ago

ClawsBench: Evaluating Capability and Safety of LLM Productivity Agents in Simulated Workspaces

Paper • 2604.05172 • Published 8 days ago • 22

submitted a paper to Daily Papers 5 days ago

ClawsBench: Evaluating Capability and Safety of LLM Productivity Agents in Simulated Workspaces

Paper • 2604.05172 • Published 8 days ago • 22

updated a dataset 6 days ago

benchflow/ClawsBench

Viewer • Updated 6 days ago • 7.83k • 85 • 1

published a dataset 6 days ago

benchflow/ClawsBench

Viewer • Updated 6 days ago • 7.83k • 85 • 1

liked a dataset 10 days ago

microsoft/ms_marco

Viewer • Updated Jan 4, 2024 • 1.11M • 22.4k • 239

New activity in harborframework/parity-experiments 11 days ago

Add SkillsBench parity experiment (gemini-cli, 3x70 tasks with skills)

#194 opened 21 days ago by

updated a dataset 11 days ago

xdotli/skillsbench-parity

Viewer • Updated 11 days ago • 380 • 5.67k

published a dataset 11 days ago

xdotli/skillsbench-parity

Viewer • Updated 11 days ago • 380 • 5.67k

upvoted a paper 12 days ago

RubricBench: Aligning Model-Generated Rubrics with Human Standards

Paper • 2603.01562 • Published Mar 2 • 63

New activity in harborframework/parity-experiments 21 days ago

Add SkillsBench parity experiment (full data at xdotli/skillsbench-parity)

#163 opened 21 days ago by

SkillsBench parity data (4/174)

#167 opened 21 days ago by

SkillsBench parity data (5/174)

#168 opened 21 days ago by

SkillsBench parity data (6/174)

#169 opened 21 days ago by

SkillsBench parity data (7/174)

#170 opened 21 days ago by

SkillsBench parity data (8/174)

#171 opened 21 days ago by

SkillsBench parity data (9/174)

#172 opened 21 days ago by

SkillsBench parity data (9/174)

#173 opened 21 days ago by

SkillsBench parity data (10/174)

#174 opened 21 days ago by