
Luca Zhang

ZHHJemotion

AI & ML interests

VLM and MLLM

Organizations

None yet

upvoted a paper 3 months ago

The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models

Paper • 2601.15165 • Published Jan 21 • 73

upvoted 2 papers 6 months ago

Diffusion Transformers with Representation Autoencoders

Paper • 2510.11690 • Published Oct 13, 2025 • 170

IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance

Paper • 2509.26231 • Published Sep 30, 2025 • 18

upvoted a paper 8 months ago

DINOv3

Paper • 2508.10104 • Published Aug 13, 2025 • 305

upvoted 2 papers 9 months ago

HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels

Paper • 2507.21809 • Published Jul 29, 2025 • 142

MMaDA: Multimodal Large Diffusion Language Models

Paper • 2505.15809 • Published May 21, 2025 • 98

upvoted an article 9 months ago

SigLIP 2: A better multilingual vision language encoder

Article • Feb 21, 2025 • 210

upvoted 3 papers 10 months ago

TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22, 2025 • 122

LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion

Paper • 2507.02813 • Published Jul 3, 2025 • 60

Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models

Paper • 2506.06395 • Published Jun 5, 2025 • 135

upvoted 2 articles 10 months ago

Illustrating Reinforcement Learning from Human Feedback (RLHF)

Article • Dec 9, 2022 • 410

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Article • Feb 7, 2025 • 287

upvoted 2 papers 11 months ago

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2, 2025 • 190

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6, 2025 • 191

upvoted 3 papers 12 months ago

Describe Anything: Detailed Localized Image and Video Captioning

Paper • 2504.16072 • Published Apr 22, 2025 • 64

CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning

Paper • 2504.13820 • Published Apr 18, 2025 • 16

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published Apr 18, 2025 • 141

upvoted a paper about 1 year ago

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

Paper • 2503.12605 • Published Mar 16, 2025 • 35

upvoted a paper over 1 year ago

BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices

Paper • 2411.10640 • Published Nov 16, 2024 • 46
