- Visual Representation Alignment for Multimodal Large Language Models
  Paper • 2509.07979 • Published • 84
- Language Models Can Learn from Verbal Feedback Without Scalar Rewards
  Paper • 2509.22638 • Published • 70
- Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models
  Paper • 2510.05034 • Published • 51
- Think-Then-Generate: Reasoning-Aware Text-to-Image Diffusion with LLM Encoders
  Paper • 2601.10332 • Published • 31
Jeffrey Van de zande
Sexhuis
AI & ML interests
None yet
Organizations
None yet
X1
datasets 0
None public yet