mlpc-ucsd

university

https://pages.ucsd.edu/~ztu/

AI & ML interests

None defined yet.

Recent Activity

gordonhu authored a paper 6 days ago

BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions

gordonhu authored a paper 6 days ago

Matryoshka Query Transformer for Large Vision-Language Models

gordonhu authored a paper 6 days ago

MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models

Papers

PixARMesh: Autoregressive Mesh-Native Single-View Scene Reconstruction

Pose Recognition with Cascade Transformers

gordonhu

authored 9 papers 6 days ago

BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions

Paper • 2308.09936 • Published Aug 19, 2023 • 1

TemMed-Bench: Evaluating Temporal Medical Image Reasoning in Vision-Language Models

Paper • 2509.25143 • Published Sep 29, 2025

ARES: Multimodal Adaptive Reasoning via Difficulty-Aware Token-Level Entropy Shaping

Paper • 2510.08457 • Published Oct 9, 2025 • 13

MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence

Paper • 2512.10863 • Published Dec 11, 2025 • 22

OpenVLThinkerV2: A Generalist Multimodal Reasoning Model for Multi-domain Visual Tasks

Paper • 2604.08539 • Published 7 days ago • 48

zx1239856

submitted a paper to Daily Papers about 1 month ago

PixARMesh: Autoregressive Mesh-Native Single-View Scene Reconstruction

Paper • 2603.05888 • Published Mar 6 • 2

zwcolin

authored 7 papers 3 months ago

Language Models Meet World Models: Embodied Experiences Enhance Language Models

Paper • 2305.10626 • Published May 18, 2023 • 1

Language Models as Science Tutors

Paper • 2402.11111 • Published Feb 16, 2024

On the Feasibility of Cross-Task Transfer with Model-Based Reinforcement Learning

Paper • 2210.10763 • Published Oct 19, 2022 • 1

OmniControlNet: Dual-stage Integration for Conditional Image Generation

Paper • 2406.05871 • Published Jun 9, 2024

YOLO-Count: Differentiable Object Counting for Text-to-Image Generation

Paper • 2508.00728 • Published Aug 1, 2025

FrontierCS: Evolving Challenges for Evolving Intelligence

Paper • 2512.15699 • Published Dec 17, 2025 • 5

VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents

Paper • 2601.16973 • Published Jan 23 • 40

zwcolin

submitted a paper to Daily Papers 3 months ago

VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents

Paper • 2601.16973 • Published Jan 23 • 40

gordonhu

authored a paper 5 months ago

G$^2$VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning

Paper • 2511.21688 • Published Nov 26, 2025 • 8

JamesSand

authored a paper over 1 year ago

On Computational Limits and Provably Efficient Criteria of Visual Autoregressive Models: A Fine-Grained Complexity Analysis

Paper • 2501.04377 • Published Jan 8, 2025 • 14

Team members 7

mlpc-lab's activity