Pu Fanyi

pufanyi

https://pufanyi.github.io

Recent Activity

liked a model 6 days ago

google/umt5-xxl

upvoted a paper 10 days ago

FileGram: Grounding Agent Personalization in File-System Behavioral Traces

Paper • 2604.04901 • Published 12 days ago • 40

upvoted a paper 12 days ago

HippoCamp: Benchmarking Contextual Agents on Personal Computers

Paper • 2604.01221 • Published 16 days ago • 29

upvoted a paper 29 days ago

MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification

Paper • 2603.15726 • Published Mar 16 • 185

upvoted 2 papers about 1 month ago

Demystifing Video Reasoning

Paper • 2603.16870 • Published about 1 month ago • 369

HSImul3R: Physics-in-the-Loop Reconstruction of Simulation-Ready Human-Scene Interactions

Paper • 2603.15612 • Published Mar 16 • 152

upvoted an article about 1 month ago

NEO-unify: Building Native Multimodal Unified Models End to End

Article • Mar 5 • 125

upvoted 3 papers about 1 month ago

ArtHOI: Articulated Human-Object Interaction Synthesis by 4D Reconstruction from Video Priors

Paper • 2603.04338 • Published Mar 4 • 24

A Very Big Video Reasoning Suite

Paper • 2602.20159 • Published Feb 23 • 519

UniG2U-Bench: Do Unified Models Advance Multimodal Understanding?

Paper • 2603.03241 • Published Mar 3 • 87

upvoted a paper about 2 months ago

UniT: Unified Multimodal Chain-of-Thought Test-time Scaling

Paper • 2602.12279 • Published Feb 12 • 20

upvoted a paper 2 months ago

OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence

Paper • 2602.08683 • Published Feb 9 • 52

upvoted a collection 3 months ago

NEO1_0

Collection • From Pixels to Words -- Towards Native Vision-Language Primitives at Scale • 7 items • Updated Jan 27 • 9

upvoted a paper 3 months ago

Fewer Truncations Improve Language Modeling

Paper • 2404.10830 • Published Apr 16, 2024 • 5

upvoted 4 papers 4 months ago

The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding

Paper • 2512.19693 • Published Dec 22, 2025 • 67

upvoted 2 papers 5 months ago

LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling

Paper • 2511.20785 • Published Nov 25, 2025 • 189

CLaRa: Bridging Retrieval and Generation with Continuous Latent Reasoning

Paper • 2511.18659 • Published Nov 24, 2025 • 25

upvoted a collection 5 months ago

MDGA