Collections
Collections including paper arxiv:2410.02712
- MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities
  Paper • 2408.00765 • Published • 13
- Towards Achieving Human Parity on End-to-end Simultaneous Speech Translation via LLM Agent
  Paper • 2407.21646 • Published • 18
- LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection
  Paper • 2408.04284 • Published • 25
- Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability
  Paper • 2408.07852 • Published • 16

- iVideoGPT: Interactive VideoGPTs are Scalable World Models
  Paper • 2405.15223 • Published • 17
- Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models
  Paper • 2405.15574 • Published • 55
- An Introduction to Vision-Language Modeling
  Paper • 2405.17247 • Published • 90
- Matryoshka Multimodal Models
  Paper • 2405.17430 • Published • 34

- Large Language Models as Markov Chains
  Paper • 2410.02724 • Published • 33
- Loong: Generating Minute-level Long Videos with Autoregressive Language Models
  Paper • 2410.02757 • Published • 36
- LLaVA-Critic: Learning to Evaluate Multimodal Models
  Paper • 2410.02712 • Published • 37
- SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
  Paper • 2410.02367 • Published • 50

- RLHF Workflow: From Reward Modeling to Online RLHF
  Paper • 2405.07863 • Published • 71
- Chameleon: Mixed-Modal Early-Fusion Foundation Models
  Paper • 2405.09818 • Published • 134
- Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models
  Paper • 2405.15574 • Published • 55
- An Introduction to Vision-Language Modeling
  Paper • 2405.17247 • Published • 90

- MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
  Paper • 2403.09611 • Published • 129
- Evolutionary Optimization of Model Merging Recipes
  Paper • 2403.13187 • Published • 58
- MobileVLM V2: Faster and Stronger Baseline for Vision Language Model
  Paper • 2402.03766 • Published • 15
- LLM Agent Operating System
  Paper • 2403.16971 • Published • 73