Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding Paper • 2604.05015 • Published 7 days ago • 227
LongCat-Next: Lexicalizing Modalities as Discrete Tokens Paper • 2603.27538 • Published 15 days ago • 137
Proact-VL: A Proactive VideoLLM for Real-Time AI Companions Paper • 2603.03447 • Published Mar 3 • 37
Visual Para-Thinker: Divide-and-Conquer Reasoning for Visual Comprehension Paper • 2602.13310 • Published Feb 10 • 8
OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence Paper • 2602.08683 • Published Feb 9 • 52
DanQing: An Up-to-Date Large-Scale Chinese Vision-Language Pre-training Dataset Paper • 2601.10305 • Published Jan 15 • 36
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published Jan 8 • 230
HyperVL: An Efficient and Dynamic Multimodal Large Language Model for Edge Devices Paper • 2512.14052 • Published Dec 16, 2025 • 42
Towards Scalable Pre-training of Visual Tokenizers for Generation Paper • 2512.13687 • Published Dec 15, 2025 • 106
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 265
InternVideo-Next: Towards General Video Foundation Models without Video-Text Supervision Paper • 2512.01342 • Published Dec 1, 2025 • 19
Article Transformers v5: Simple model definitions powering the AI ecosystem • Published Dec 1, 2025 • 307
Uni-MoE-2.0-Omni: Scaling Language-Centric Omnimodal Large Model with Advanced MoE, Training and Data Paper • 2511.12609 • Published Nov 16, 2025 • 106