Collections
Discover the best community collections!
Collections including paper arxiv:2506.20920
-
Reasoning Introduces New Poisoning Attacks Yet Makes Them More Complicated
Paper • 2509.05739 • Published • 2 -
Loong: Synthesize Long Chain-of-Thoughts at Scale through Verifiers
Paper • 2509.03059 • Published • 25 -
Universal Deep Research: Bring Your Own Model and Strategy
Paper • 2509.00244 • Published • 14 -
<think> So let's replace this phrase with insult... </think> Lessons learned from generation of toxic texts with LLMs
Paper • 2509.08358 • Published • 13
-
MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encodings
Paper • 2405.19504 • Published • 3 -
HiWave: Training-Free High-Resolution Image Generation via Wavelet-Based Diffusion Sampling
Paper • 2506.20452 • Published • 18 -
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
Paper • 2506.20920 • Published • 78 -
The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm
Paper • 2507.18553 • Published • 41
-
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
Paper • 2503.14734 • Published • 7 -
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation
Paper • 2401.02117 • Published • 33 -
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics
Paper • 2506.01844 • Published • 158 -
Vision-Guided Chunking Is All You Need: Enhancing RAG with Multimodal Document Understanding
Paper • 2506.16035 • Published • 89
-
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
Paper • 2506.20920 • Published • 78 -
HuggingFaceFW/finewiki
Viewer • Updated • 61.6M • 6.73k • 292 -
nhagar/fineweb_urls
Viewer • Updated • 24.5B • 1.18k • 2 -
PleIAs/common_corpus
Viewer • Updated • 69.9k • 185k • 390
-
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
Paper • 2506.20920 • Published • 78 -
SmolVLM: Redefining small and efficient multimodal models
Paper • 2504.05299 • Published • 207 -
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
Paper • 2303.03915 • Published • 7 -
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
Paper • 2502.02737 • Published • 258
-
ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation
Paper • 2506.18095 • Published • 66 -
FreedomIntelligence/ShareGPT-4o-Image
Viewer • Updated • 92.3k • 652 • 97 -
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
Paper • 2506.20920 • Published • 78
-
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
Paper • 2506.20920 • Published • 78 -
SmolVLM: Redefining small and efficient multimodal models
Paper • 2504.05299 • Published • 207 -
YourBench: Easy Custom Evaluation Sets for Everyone
Paper • 2504.01833 • Published • 23 -
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
Paper • 2502.02737 • Published • 258
-
The Curse of Depth in Large Language Models
Paper • 2502.05795 • Published • 40 -
Transformers without Normalization
Paper • 2503.10622 • Published • 172 -
Parallel Scaling Law for Language Models
Paper • 2505.10475 • Published • 83 -
Learning to Skip the Middle Layers of Transformers
Paper • 2506.21103 • Published • 18
-
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
Paper • 2506.20920 • Published • 78 -
HuggingFaceFW/finewiki
Viewer • Updated • 61.6M • 6.73k • 292 -
nhagar/fineweb_urls
Viewer • Updated • 24.5B • 1.18k • 2 -
PleIAs/common_corpus
Viewer • Updated • 69.9k • 185k • 390
-
Reasoning Introduces New Poisoning Attacks Yet Makes Them More Complicated
Paper • 2509.05739 • Published • 2 -
Loong: Synthesize Long Chain-of-Thoughts at Scale through Verifiers
Paper • 2509.03059 • Published • 25 -
Universal Deep Research: Bring Your Own Model and Strategy
Paper • 2509.00244 • Published • 14 -
<think> So let's replace this phrase with insult... </think> Lessons learned from generation of toxic texts with LLMs
Paper • 2509.08358 • Published • 13
-
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
Paper • 2506.20920 • Published • 78 -
SmolVLM: Redefining small and efficient multimodal models
Paper • 2504.05299 • Published • 207 -
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
Paper • 2303.03915 • Published • 7 -
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
Paper • 2502.02737 • Published • 258
-
ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation
Paper • 2506.18095 • Published • 66 -
FreedomIntelligence/ShareGPT-4o-Image
Viewer • Updated • 92.3k • 652 • 97 -
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
Paper • 2506.20920 • Published • 78
-
MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encodings
Paper • 2405.19504 • Published • 3 -
HiWave: Training-Free High-Resolution Image Generation via Wavelet-Based Diffusion Sampling
Paper • 2506.20452 • Published • 18 -
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
Paper • 2506.20920 • Published • 78 -
The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm
Paper • 2507.18553 • Published • 41
-
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
Paper • 2506.20920 • Published • 78 -
SmolVLM: Redefining small and efficient multimodal models
Paper • 2504.05299 • Published • 207 -
YourBench: Easy Custom Evaluation Sets for Everyone
Paper • 2504.01833 • Published • 23 -
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
Paper • 2502.02737 • Published • 258
-
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
Paper • 2503.14734 • Published • 7 -
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation
Paper • 2401.02117 • Published • 33 -
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics
Paper • 2506.01844 • Published • 158 -
Vision-Guided Chunking Is All You Need: Enhancing RAG with Multimodal Document Understanding
Paper • 2506.16035 • Published • 89
-
The Curse of Depth in Large Language Models
Paper • 2502.05795 • Published • 40 -
Transformers without Normalization
Paper • 2503.10622 • Published • 172 -
Parallel Scaling Law for Language Models
Paper • 2505.10475 • Published • 83 -
Learning to Skip the Middle Layers of Transformers
Paper • 2506.21103 • Published • 18