Collections
Collections including paper arxiv:2603.04791
-
Beyond Language Modeling: An Exploration of Multimodal Pretraining
Paper • 2603.03276 • Published • 103 -
Qwen3-Coder-Next Technical Report
Paper • 2603.00729 • Published • 64 -
Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use
Paper • 2603.03205 • Published • 13 -
AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios
Paper • 2602.23166 • Published • 45
-
LTX-2: Efficient Joint Audio-Visual Foundation Model
Paper • 2601.03233 • Published • 176 -
MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head
Paper • 2601.07832 • Published • 52 -
Motion Attribution for Video Generation
Paper • 2601.08828 • Published • 72 -
Post-LayerNorm Is Back: Stable, ExpressivE, and Deep
Paper • 2601.19895 • Published • 27
-
TSGym: Design Choices for Deep Multivariate Time-Series Forecasting
Paper • 2509.17063 • Published -
Timer-S1: A Billion-Scale Time Series Foundation Model with Serial Scaling
Paper • 2603.04791 • Published • 20 -
Small but Mighty: Enhancing Time Series Forecasting with Lightweight LLMs
Paper • 2503.03594 • Published -
TSRBench: A Comprehensive Multi-task Multi-modal Time Series Reasoning Benchmark for Generalist Models
Paper • 2601.18744 • Published • 10
-
OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation
Paper • 2601.15369 • Published • 21 -
Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model
Paper • 2601.15892 • Published • 53 -
Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders
Paper • 2601.16208 • Published • 55 -
NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems
Paper • 2601.11004 • Published • 30
-
MMGR: Multi-Modal Generative Reasoning
Paper • 2512.14691 • Published • 121 -
KlingAvatar 2.0 Technical Report
Paper • 2512.13313 • Published • 44 -
SemanticGen: Video Generation in Semantic Space
Paper • 2512.20619 • Published • 95 -
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI
Paper • 2512.16676 • Published • 222