- Attention Is All You Need
  Paper • 1706.03762 • Published • 121
- Scaling Laws for Neural Language Models
  Paper • 2001.08361 • Published • 10
- Training Compute-Optimal Large Language Models
  Paper • 2203.15556 • Published • 11
- Analogy Generation by Prompting Large Language Models: A Case Study of InstructGPT
  Paper • 2210.04186 • Published
Collections including paper arxiv:2510.11696
- QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs
  Paper • 2510.11696 • Published • 182
- INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats
  Paper • 2510.25602 • Published • 80
- 6Bit-Diffusion: Inference-Time Mixed-Precision Quantization for Video Diffusion Models
  Paper • 2603.18742 • Published • 10
- QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs
  Paper • 2510.11696 • Published • 182
- Sample By Step, Optimize By Chunk: Chunk-Level GRPO For Text-to-Image Generation
  Paper • 2510.21583 • Published • 31
- Sparser Block-Sparse Attention via Token Permutation
  Paper • 2510.21270 • Published • 25
- QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs
  Paper • 2510.11696 • Published • 182
- Does Your Reasoning Model Implicitly Know When to Stop Thinking?
  Paper • 2602.08354 • Published • 263
- Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use
  Paper • 2603.03205 • Published • 13
- π-StepNFT: Wider Space Needs Finer Steps in Online RL for Flow-based VLAs
  Paper • 2603.02083 • Published • 9
- Demystifying Reinforcement Learning in Agentic Reasoning
  Paper • 2510.11701 • Published • 33
- Self-Improving LLM Agents at Test-Time
  Paper • 2510.07841 • Published • 10
- Making Mathematical Reasoning Adaptive
  Paper • 2510.04617 • Published • 23
- DocReward: A Document Reward Model for Structuring and Stylizing
  Paper • 2510.11391 • Published • 27