Progressive Prompt Detailing for Improved Alignment in Text-to-Image Generative Models Paper • 2503.17794 • Published Mar 22, 2025
Some Modalities are More Equal Than Others: Decoding and Architecting Multimodal Integration in MLLMs Paper • 2511.22826 • Published Nov 28, 2025 • 8
Generative Action Tell-Tales: Assessing Human Motion in Synthesized Videos Paper • 2512.01803 • Published Dec 1, 2025 • 5
Semantic Richness or Geometric Reasoning? The Fragility of VLM's Visual Invariance Paper • 2604.01848 • Published 11 days ago
What's in a Latent? Leveraging Diffusion Latent Space for Domain Generalization Paper • 2503.06698 • Published Mar 9, 2025 • 4 • 2
$\textit{Revelio}$: Interpreting and leveraging semantic information in diffusion models Paper • 2411.16725 • Published Nov 23, 2024 • 1