Paper - a gigastaufan Collection

gigastaufan 's Collections

Paper

updated Mar 7

CoRAG: Collaborative Retrieval-Augmented Generation

Paper • 2504.01883 • Published Apr 2, 2025 • 9
SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning

Paper • 2504.08600 • Published Apr 11, 2025 • 33
Reasoning-SQL: Reinforcement Learning with SQL Tailored Partial Rewards for Reasoning-Enhanced Text-to-SQL

Paper • 2503.23157 • Published Mar 29, 2025 • 10
AI Agents: Evolution, Architecture, and Real-World Applications

Paper • 2503.12687 • Published Mar 16, 2025 • 2
OSUniverse: Benchmark for Multimodal GUI-navigation AI Agents

Paper • 2505.03570 • Published May 6, 2025 • 8
J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning

Paper • 2505.10320 • Published May 15, 2025 • 24
GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation

Paper • 2502.01113 • Published Feb 3, 2025 • 6
From Local to Global: A Graph RAG Approach to Query-Focused Summarization

Paper • 2404.16130 • Published Apr 24, 2024 • 7
Large Language Models are Locally Linear Mappings

Paper • 2505.24293 • Published May 30, 2025 • 14
Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning

Paper • 2506.07044 • Published Jun 8, 2025 • 114
The Diffusion Duality

Paper • 2506.10892 • Published Jun 12, 2025 • 37
VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning

Paper • 2506.09049 • Published Jun 10, 2025 • 37
OmniGen2: Exploration to Advanced Multimodal Generation

Paper • 2506.18871 • Published Jun 23, 2025 • 78
ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation

Paper • 2506.18095 • Published Jun 22, 2025 • 66
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Paper • 2507.01006 • Published Jul 1, 2025 • 254
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning

Paper • 2507.00432 • Published Jul 1, 2025 • 79
Fast and Simplex: 2-Simplicial Attention in Triton

Paper • 2507.02754 • Published Jul 3, 2025 • 25
Coding Triangle: How Does Large Language Model Understand Code?

Paper • 2507.06138 • Published Jul 8, 2025 • 22
KV Cache Steering for Inducing Reasoning in Small Language Models

Paper • 2507.08799 • Published Jul 11, 2025 • 40
MUR: Momentum Uncertainty guided Reasoning for Large Language Models

Paper • 2507.14958 • Published Jul 20, 2025 • 47
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens

Paper • 2508.01191 • Published Aug 2, 2025 • 240
Agent Lightning: Train ANY AI Agents with Reinforcement Learning

Paper • 2508.03680 • Published Aug 5, 2025 • 140
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Paper • 2508.06471 • Published Aug 8, 2025 • 211
Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning

Paper • 2508.08221 • Published Aug 11, 2025 • 50
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

Paper • 2508.07407 • Published Aug 10, 2025 • 99
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models

Paper • 2508.09834 • Published Aug 13, 2025 • 53
Provable Benefits of In-Tool Learning for Large Language Models

Paper • 2508.20755 • Published Aug 28, 2025 • 11
AWorld: Orchestrating the Training Recipe for Agentic AI

Paper • 2508.20404 • Published Aug 28, 2025 • 38
Think in Games: Learning to Reason in Games via Reinforcement Learning with Large Language Models

Paper • 2508.21365 • Published Aug 29, 2025 • 29
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2, 2025 • 238
LatticeWorld: A Multimodal Large Language Model-Empowered Framework for Interactive Complex World Generation

Paper • 2509.05263 • Published Sep 5, 2025 • 11
Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models

Paper • 2509.06949 • Published Sep 8, 2025 • 57
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2, 2025 • 84
Lost in Embeddings: Information Loss in Vision-Language Models

Paper • 2509.11986 • Published Sep 15, 2025 • 29
Regression Language Models for Code

Paper • 2509.26476 • Published Sep 30, 2025 • 17
Multi-Agent Tool-Integrated Policy Optimization

Paper • 2510.04678 • Published Oct 6, 2025 • 31
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use

Paper • 2510.05592 • Published Oct 7, 2025 • 110
UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning

Paper • 2510.13515 • Published Oct 15, 2025 • 12
RAG-Anything: All-in-One RAG Framework

Paper • 2510.12323 • Published Oct 14, 2025 • 73
LLM-guided Hierarchical Retrieval

Paper • 2510.13217 • Published Oct 15, 2025 • 21
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning

Paper • 2510.19338 • Published Oct 22, 2025 • 117
Guided Self-Evolving LLMs with Minimal Human Supervision

Paper • 2512.02472 • Published Dec 2, 2025 • 55
Who's Your Judge? On the Detectability of LLM-Generated Judgments

Paper • 2509.25154 • Published Sep 29, 2025 • 30
Code2World: A GUI World Model via Renderable Code Generation

Paper • 2602.09856 • Published Feb 10 • 202
DARE: Aligning LLM Agents with the R Statistical Ecosystem via Distribution-Aware Retrieval

Paper • 2603.04743 • Published Mar 5 • 53