VisionFoundry: Teaching VLMs Visual Perception with Synthetic Images Paper • 2604.09531 • Published 8 days ago • 8
RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details Paper • 2604.06870 • Published 10 days ago • 40