You Only Judge Once: Multi-response Reward Modeling in a Single Forward Pass Paper • 2604.10966 • Published 5 days ago • 9
WildDet3D: Scaling Promptable 3D Detection in the Wild Paper • 2604.08626 • Published 9 days ago • 237
MolmoWeb: Open Visual Web Agent and Open Data for the Open Web Paper • 2604.08516 • Published 9 days ago • 41
WildDet3D Collection A collection of WildDet3D artifacts, including demos, model checkpoints, and data. https://github.com/allenai/WildDet3D • 8 items • Updated 5 days ago • 17
VFIG: Vectorizing Complex Figures in SVG with Vision-Language Models Paper • 2603.24575 • Published 23 days ago • 18
Synthetic Visual Genome 2: Extracting Large-scale Spatio-Temporal Scene Graphs from Videos Paper • 2602.23543 • Published Feb 26 • 9
Theory of Space: Can Foundation Models Construct Spatial Beliefs through Active Exploration? Paper • 2602.07055 • Published Feb 4 • 23
PixMo Collection A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog • 9 items • Updated Mar 2 • 88
Video-Thinker: Sparking "Thinking with Videos" via Reinforcement Learning Paper • 2510.23473 • Published Oct 27, 2025 • 86
MolmoAct Data Mixture Collection All datasets for the MolmoAct (Multimodal Open Language Model for Action) release. • 4 items • Updated Dec 23, 2025 • 18
MolmoAct Collection All models for the MolmoAct (Multimodal Open Language Model for Action) release. • 10 items • Updated Dec 23, 2025 • 35
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14, 2025 • 308
On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective Paper • 2502.14296 • Published Feb 20, 2025 • 45
TACO Models Collection This collection contains the best-performing TACO models based on LLaMA-3/Qwen2 and SigLIP/CLIP. • 3 items • Updated Oct 31, 2025 • 8
CoTA Datasets Collection This collection contains all versions of the CoTA (Chain-of-Thought-and-Action) datasets. • 4 items • Updated Mar 2 • 7
ProVision: Programmatically Scaling Vision-centric Instruction Data for Multimodal Language Models Paper • 2412.07012 • Published Dec 9, 2024 • 1
TaskMeAnything Collection A collection of TaskMeAnything resources. https://github.com/JieyuZ2/TaskMeAnything • 7 items • Updated Mar 2 • 3