JiayuCHEN
KN33SOXXX
AI & ML interests
None yet
Recent Activity
updated a collection about 18 hours ago: WorldModel
updated a collection 5 days ago: agent
updated a collection 5 days ago: mutilmodal reasoning
Organizations
None yet
skill
- Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills
  Paper • 2603.25158 • Published • 50
- SkillClaw: Let Skills Evolve Collectively with Agentic Evolver
  Paper • 2604.08377 • Published • 273
- How Well Do Agentic Skills Work in the Wild: Benchmarking LLM Skill Usage in Realistic Settings
  Paper • 2604.04323 • Published • 40
- SkillX: Automatically Constructing Skill Knowledge Bases for Agents
  Paper • 2604.04804 • Published • 31
WorldModel
- WildWorld: A Large-Scale Dataset for Dynamic World Modeling with Actions and Explicit State toward Generative ARPG
  Paper • 2603.23497 • Published • 91
- INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling
  Paper • 2604.07209 • Published • 35
- Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory
  Paper • 2604.08995 • Published • 42
image_gen
mutilmodal reasoning
3dRES
- F4Splat: Feed-Forward Predictive Densification for Feed-Forward 3D Gaussian Splatting
  Paper • 2603.21304 • Published • 32
- Repurposing Geometric Foundation Models for Multi-view Diffusion
  Paper • 2603.22275 • Published • 47
- 2Xplat: Two Experts Are Better Than One Generalist
  Paper • 2603.21064 • Published • 25
- One View Is Enough! Monocular Training for In-the-Wild Novel View Generation
  Paper • 2603.23488 • Published • 4
GUIAgent
agent
- Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models
  Paper • 2601.22060 • Published • 155
- Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models
  Paper • 2602.02185 • Published • 118
- SpecEyes: Accelerating Agentic Multimodal LLMs via Speculative Perception and Planning
  Paper • 2603.23483 • Published • 62
- WorldAgents: Can Foundation Image Models be Agents for 3D World Models?
  Paper • 2603.19708 • Published • 13
RL