- BitNet Distillation
  Paper • 2510.13998 • Published • 59
- BitNet: Scaling 1-bit Transformers for Large Language Models
  Paper • 2310.11453 • Published • 107
- BitNet b1.58 2B4T Technical Report
  Paper • 2504.12285 • Published • 85
- Bitnet.cpp: Efficient Edge Inference for Ternary LLMs
  Paper • 2502.11880 • Published • 17
Collections
Collections including paper arxiv:2510.13998
- Your Group-Relative Advantage Is Biased
  Paper • 2601.08521 • Published • 158
- Agent Lightning: Train ANY AI Agents with Reinforcement Learning
  Paper • 2508.03680 • Published • 140
- mHC: Manifold-Constrained Hyper-Connections
  Paper • 2512.24880 • Published • 322
- BitNet Distillation
  Paper • 2510.13998 • Published • 59

- The Art of Scaling Reinforcement Learning Compute for LLMs
  Paper • 2510.13786 • Published • 33
- Attention Is All You Need for KV Cache in Diffusion LLMs
  Paper • 2510.14973 • Published • 42
- BitNet Distillation
  Paper • 2510.13998 • Published • 59
- GigaBrain-0: A World Model-Powered Vision-Language-Action Model
  Paper • 2510.19430 • Published • 53

- TradingAgents: Multi-Agents LLM Financial Trading Framework
  Paper • 2412.20138 • Published • 47
- MinerU: An Open-Source Solution for Precise Document Content Extraction
  Paper • 2409.18839 • Published • 41
- MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing
  Paper • 2509.22186 • Published • 160
- Agent Lightning: Train ANY AI Agents with Reinforcement Learning
  Paper • 2508.03680 • Published • 140

- Paper2Web: Let's Make Your Paper Alive!
  Paper • 2510.15842 • Published • 27
- Paper2Video: Automatic Video Generation from Scientific Papers
  Paper • 2510.05096 • Published • 120
- BitNet Distillation
  Paper • 2510.13998 • Published • 59
- Visual Backdoor Attacks on MLLM Embodied Decision Making via Contrastive Trigger Learning
  Paper • 2510.27623 • Published • 13

- SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights
  Paper • 2509.22944 • Published • 81
- Robot Learning: A Tutorial
  Paper • 2510.12403 • Published • 130
- UniMoE-Audio: Unified Speech and Music Generation with Dynamic-Capacity MoE
  Paper • 2510.13344 • Published • 64
- Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding
  Paper • 2510.06308 • Published • 55