William Li

williamium

https://williamium3000.github.io/

AI & ML interests

Learning AI efficiently and robustly in interactive, multi-modal, and embodied environments.

Recent Activity

updated a model about 23 hours ago

williamium/deepeyes

Updated about 23 hours ago

upvoted a paper about 24 hours ago

Can Vision Language Models Infer Human Gaze Direction? A Controlled Study

Paper • 2506.05412 • Published Jun 4, 2025 • 5

published a model 3 days ago

williamium/deepeyes

Updated about 23 hours ago

upvoted a paper 5 days ago

Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning

Paper • 2604.04746 • Published 6 days ago • 66

updated a model 7 days ago

williamium/indirect_caption

Updated 7 days ago

upvoted a paper 20 days ago

ThinkJEPA: Empowering Latent World Models with Large Vision-Language Reasoning Model

Paper • 2603.22281 • Published 21 days ago • 17

upvoted a paper 21 days ago

VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding

Paper • 2603.22285 • Published 21 days ago • 49

updated a model about 1 month ago

williamium/AdaptRL

Updated Mar 11

updated a dataset about 1 month ago

williamium/embodied-video-rl

Viewer • Updated Mar 7 • 1.36k • 31

published a dataset about 1 month ago

williamium/embodied-video-rl

Viewer • Updated Mar 7 • 1.36k • 31

published 3 models about 1 month ago

upvoted an article about 2 months ago

Visual Aesthetic Benchmark: Can Frontier Models Judge Beauty?

Article • Feb 25

upvoted a paper about 2 months ago

A Very Big Video Reasoning Suite

Paper • 2602.20159 • Published Feb 23 • 519

published a dataset about 2 months ago

phy-gen/counterfactual-physics

Viewer • Updated 2 days ago • 27.8k • 7.7k

upvoted 2 papers about 2 months ago

SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks

Paper • 2602.12670 • Published Feb 13 • 59

Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs

Paper • 2602.10388 • Published Feb 11 • 244

updated a dataset about 2 months ago

mm-eval/MMBench-en-V11

Viewer • Updated 19 days ago • 7.3k • 32

authored a paper 2 months ago

CoMAS: Co-Evolving Multi-Agent Systems via Interaction Rewards

Paper • 2510.08529 • Published Oct 9, 2025 • 19
