Collections
Collections including paper arxiv:2603.12572
-
Agentic Reasoning for Large Language Models
Paper • 2601.12538 • Published • 204 -
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence
Paper • 2511.18538 • Published • 304 -
Agent Learning via Early Experience
Paper • 2510.08558 • Published • 277 -
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger
Paper • 2602.08222 • Published • 290
-
Xolver: Multi-Agent Reasoning with Holistic Experience Learning Just Like an Olympiad Team
Paper • 2506.14234 • Published • 41 -
MoTE: Mixture of Ternary Experts for Memory-efficient Large Multimodal Models
Paper • 2506.14435 • Published • 7 -
Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory
Paper • 2504.19413 • Published • 52 -
MemOS: A Memory OS for AI System
Paper • 2507.03724 • Published • 166
-
Writing in the Margins: Better Inference Pattern for Long Context Retrieval
Paper • 2408.14906 • Published • 144 -
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
Paper • 2410.10819 • Published • 7 -
LLMtimesMapReduce: Simplified Long-Sequence Processing using Large Language Models
Paper • 2410.09342 • Published • 39 -
PDFTriage: Question Answering over Long, Structured Documents
Paper • 2309.08872 • Published • 55
-
BitNet: Scaling 1-bit Transformers for Large Language Models
Paper • 2310.11453 • Published • 107 -
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Paper • 2310.11511 • Published • 80 -
In-Context Learning Creates Task Vectors
Paper • 2310.15916 • Published • 43 -
Matryoshka Diffusion Models
Paper • 2310.15111 • Published • 45
-
SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise
Paper • 2602.12783 • Published • 216 -
MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios
Paper • 2602.22638 • Published • 107 -
CAR-bench: Evaluating the Consistency and Limit-Awareness of LLM Agents under Real-World Uncertainty
Paper • 2601.22027 • Published • 85 -
ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development
Paper • 2601.11077 • Published • 67
-
Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis
Paper • 2505.13227 • Published • 45 -
facebook/natural_reasoning
Viewer • Updated • 1.15M • 1.42k • 562 -
nvidia/OpenMathReasoning
Viewer • Updated • 5.68M • 17.6k • 453 -
Search Arena: Analyzing Search-Augmented LLMs
Paper • 2506.05334 • Published • 18
-
Beyond Chain-of-Thought: A Survey of Chain-of-X Paradigms for LLMs
Paper • 2404.15676 • Published -
How faithful are RAG models? Quantifying the tug-of-war between RAG and LLMs' internal prior
Paper • 2404.10198 • Published • 8 -
RAFT: Adapting Language Model to Domain Specific RAG
Paper • 2403.10131 • Published • 72 -
FaaF: Facts as a Function for the evaluation of RAG systems
Paper • 2403.03888 • Published