Westlake University

university

Papers

Self-Adversarial One Step Generation via Condition Shifting

MMaDA-VLA: Large Diffusion Vision-Language-Action Model with Unified Multi-Modal Instruction and Generation

submitted a paper to Daily Papers 2 months ago

Think Longer to Explore Deeper: Learn to Explore In-Context via Length-Incentivized Reinforcement Learning

Paper • 2602.11748 • Published Feb 12 • 37

authored 2 papers 2 months ago

HiPhO: How Far Are (M)LLMs from Humans in the Latest High School Physics Olympiad Benchmark?

Paper • 2509.07894 • Published Sep 9, 2025 • 32

P1: Mastering Physics Olympiads with Reinforcement Learning

Paper • 2511.13612 • Published Nov 17, 2025 • 134

Huan-WhoRegisteredMyName authored a paper 6 months ago

OBS-Diff: Accurate Pruning For Diffusion Models in One-Shot

Paper • 2510.06751 • Published Oct 8, 2025 • 22

Huan-WhoRegisteredMyName authored a paper 7 months ago

RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning

Paper • 2510.02240 • Published Oct 2, 2025 • 18

Huan-WhoRegisteredMyName authored a paper 8 months ago

When Tokens Talk Too Much: A Survey of Multimodal Long-Context Token Compression across Images, Videos, and Audios

Paper • 2507.20198 • Published Jul 27, 2025 • 28

Huan-WhoRegisteredMyName authored 4 papers 11 months ago

Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models

Paper • 2503.16257 • Published Mar 20, 2025 • 28

Is Oracle Pruning the True Oracle?

Paper • 2412.00143 • Published Nov 28, 2024 • 3

HoliTom: Holistic Token Merging for Fast Video Large Language Models

Paper • 2505.21334 • Published May 27, 2025 • 21

Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps

Paper • 2505.18675 • Published May 24, 2025 • 27

Huan-WhoRegisteredMyName authored 9 papers about 1 year ago

Niagara: Normal-Integrated Geometric Affine Field for Scene Reconstruction from a Single View

Paper • 2503.12553 • Published Mar 16, 2025 • 8

Image as Set of Points

Paper • 2303.01494 • Published Mar 2, 2023

Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution

Paper • 2303.09650 • Published Mar 16, 2023

Frame Flexible Network

Paper • 2303.14817 • Published Mar 26, 2023

What Makes a "Good" Data Augmentation in Knowledge Distillation -- A Statistical Perspective

Paper • 2012.02909 • Published Dec 5, 2020 • 1

R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis

Paper • 2203.17261 • Published Mar 31, 2022 • 1

Real-Time Neural Light Field on Mobile Devices

Paper • 2212.08057 • Published Dec 15, 2022

DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models

Paper • 2411.15024 • Published Nov 22, 2024 • 2

Autoregressive Image Generation with Randomized Parallel Decoding

Paper • 2503.10568 • Published Mar 13, 2025 • 9

Huan-WhoRegisteredMyName authored a paper almost 3 years ago

SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds

Paper • 2306.00980 • Published Jun 1, 2023 • 16