Xiao-Ming Wu

DravenALG

https://dravenalg.github.io/

AI & ML interests

Deep Learning, Computer Vision, Embodied AI

Recent Activity

updated a model 10 days ago

DravenALG/real-world-checkpoint

upvoted a paper 13 days ago

HippoCamp: Benchmarking Contextual Agents on Personal Computers

updated a model 18 days ago

DravenALG/VLANeXt

Organizations

None yet

upvoted a paper 13 days ago

HippoCamp: Benchmarking Contextual Agents on Personal Computers

Paper • 2604.01221 • Published 15 days ago • 29

upvoted an article about 2 months ago

SmolVLM - small yet mighty Vision Language Model

Article • Published Nov 26, 2024 • 417

upvoted 3 papers about 2 months ago

Estimator Meets Equilibrium Perspective: A Rectified Straight Through Estimator for Binary Neural Networks Training

Paper • 2308.06689 • Published Aug 13, 2023 • 1

A Very Big Video Reasoning Suite

Paper • 2602.20159 • Published Feb 23 • 519

VLANeXt: Recipes for Building Strong VLA Models

Paper • 2602.18532 • Published Feb 20 • 52

upvoted a paper 3 months ago

DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation

Paper • 2601.22153 • Published Jan 29 • 74

upvoted 2 papers 4 months ago

ProEdit: Inversion-based Editing From Prompts Done Right

Paper • 2512.22118 • Published Dec 26, 2025 • 18

The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding

Paper • 2512.19693 • Published Dec 22, 2025 • 67

upvoted a paper 5 months ago

Architecture Decoupling Is Not All You Need For Unified Multimodal Model

Paper • 2511.22663 • Published Nov 27, 2025 • 29

upvoted 2 papers 6 months ago

From Pixels to Words -- Towards Native Vision-Language Primitives at Scale

Paper • 2510.14979 • Published Oct 16, 2025 • 69

Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation

Paper • 2510.08673 • Published Oct 9, 2025 • 127

upvoted a paper 8 months ago

Next Visual Granularity Generation

Paper • 2508.12811 • Published Aug 18, 2025 • 49
