
Yuansheng Ni

yuanshengni

https://yuanshengni.github.io/

AI & ML interests

NLP

Recent Activity

upvoted a paper 7 days ago

ClawBench: Can AI Agents Complete Everyday Online Tasks?

Paper • 2604.08523 • Published 9 days ago • 255

upvoted a paper 9 days ago

Watch Before You Answer: Learning from Visually Grounded Post-Training

Paper • 2604.05117 • Published 12 days ago • 35

upvoted a paper 10 days ago

SWE-Next: Scalable Real-World Software Engineering Tasks for Agents

Paper • 2603.20691 • Published 28 days ago • 10

upvoted a paper 22 days ago

CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents

Paper • 2603.24440 • Published 23 days ago • 96

upvoted a collection 22 days ago

OpenResearcher

Collection

OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis • 8 items • Updated 24 days ago • 17

upvoted a paper 22 days ago

OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis

Paper • 2603.20278 • Published about 1 month ago • 94

upvoted 2 papers about 2 months ago

VisPhyWorld: Probing Physical Reasoning via Code-Driven Video Reconstruction

Paper • 2602.13294 • Published Feb 9 • 13

InnoEval: On Research Idea Evaluation as a Knowledge-Grounded, Multi-Perspective Reasoning Problem

Paper • 2602.14367 • Published Feb 16 • 17

updated a dataset 2 months ago

MMMU/MMMU

Viewer • Updated Feb 12 • 11.6k • 72.4k • 325

upvoted a paper 2 months ago

Context Forcing: Consistent Autoregressive Video Generation with Long Context

Paper • 2602.06028 • Published Feb 5 • 36

New activity in MMMU/MMMU 3 months ago

wrong_use，need deleted

#6 opened 3 months ago by Aros199

upvoted a paper 3 months ago

Illusions of Confidence? Diagnosing LLM Truthfulness via Neighborhood Consistency

Paper • 2601.05905 • Published Jan 9 • 20

upvoted 2 papers 4 months ago

TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models

Paper • 2512.02014 • Published Dec 1, 2025 • 74

InnoGym: Benchmarking the Innovation Potential of AI Agents

Paper • 2512.01822 • Published Dec 1, 2025 • 36

updated 3 models 5 months ago

updated 2 datasets 5 months ago

TIGER-Lab/VisPlotBench

Viewer • Updated Nov 3, 2025 • 888 • 266 • 2

TIGER-Lab/VisCode-Multi-679K

Viewer • Updated Nov 3, 2025 • 679k • 110 • 7

updated a model 5 months ago

TIGER-Lab/VisCoder2-7B

Image-Text-to-Text • 8B • Updated Nov 3, 2025 • 56 • 5
