-
The ATOM Report: Measuring the Open Language Model Ecosystem
Paper • 2604.07190 • Published • 5 -
Economies of Open Intelligence: Tracing Power & Participation in the Model Ecosystem
Paper • 2512.03073 • Published • 7 -
Anatomy of a Machine Learning Ecosystem: 2 Million Models on Hugging Face
Paper • 2508.06811 • Published • 6 -
Bridging the Data Provenance Gap Across Text, Speech and Video
Paper • 2412.17847 • Published • 12
Collections
Collections including paper arxiv:2412.17847
-
MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval
Paper • 2412.14475 • Published • 58 -
How to Synthesize Text Data without Model Collapse?
Paper • 2412.14689 • Published • 53 -
Token-Budget-Aware LLM Reasoning
Paper • 2412.18547 • Published • 46 -
WavePulse: Real-time Content Analytics of Radio Livestreams
Paper • 2412.17998 • Published • 11
-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 30 -
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 15 -
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 44 -
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 23
-
ProcessBench: Identifying Process Errors in Mathematical Reasoning
Paper • 2412.06559 • Published • 86 -
Maya: An Instruction Finetuned Multilingual Multimodal Model
Paper • 2412.07112 • Published • 28 -
OpenAI o1 System Card
Paper • 2412.16720 • Published • 37 -
Diving into Self-Evolving Training for Multimodal Reasoning
Paper • 2412.17451 • Published • 42
-
Bridging the Data Provenance Gap Across Text, Speech and Video
Paper • 2412.17847 • Published • 12 -
Consent in Crisis: The Rapid Decline of the AI Data Commons
Paper • 2407.14933 • Published • 15 -
Data Authenticity, Consent, & Provenance for AI are all broken: what will it take to fix them?
Paper • 2404.12691 • Published • 2 -
The Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AI
Paper • 2310.16787 • Published • 5