gigastaufan's Collections
CoRAG: Collaborative Retrieval-Augmented Generation
Paper
• 2504.01883
• Published • 9
SQL-R1: Training Natural Language to SQL Reasoning Model By
Reinforcement Learning
Paper
• 2504.08600
• Published • 33
Reasoning-SQL: Reinforcement Learning with SQL Tailored Partial Rewards
for Reasoning-Enhanced Text-to-SQL
Paper
• 2503.23157
• Published • 10
AI Agents: Evolution, Architecture, and Real-World Applications
Paper
• 2503.12687
• Published • 2
OSUniverse: Benchmark for Multimodal GUI-navigation AI Agents
Paper
• 2505.03570
• Published • 8
J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning
Paper
• 2505.10320
• Published • 24
GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation
Paper
• 2502.01113
• Published • 6
From Local to Global: A Graph RAG Approach to Query-Focused
Summarization
Paper
• 2404.16130
• Published • 7
Large Language Models are Locally Linear Mappings
Paper
• 2505.24293
• Published • 14
Lingshu: A Generalist Foundation Model for Unified Multimodal Medical
Understanding and Reasoning
Paper
• 2506.07044
• Published • 114
Paper
• 2506.10892
• Published • 37
VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement
Learning
Paper
• 2506.09049
• Published • 37
OmniGen2: Exploration to Advanced Multimodal Generation
Paper
• 2506.18871
• Published • 78
ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image
Generation
Paper
• 2506.18095
• Published • 66
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable
Reinforcement Learning
Paper
• 2507.01006
• Published • 254
Does Math Reasoning Improve General LLM Capabilities? Understanding
Transferability of LLM Reasoning
Paper
• 2507.00432
• Published • 79
Fast and Simplex: 2-Simplicial Attention in Triton
Paper
• 2507.02754
• Published • 25
Coding Triangle: How Does Large Language Model Understand Code?
Paper
• 2507.06138
• Published • 22
KV Cache Steering for Inducing Reasoning in Small Language Models
Paper
• 2507.08799
• Published • 40
MUR: Momentum Uncertainty guided Reasoning for Large Language Models
Paper
• 2507.14958
• Published • 47
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens
Paper
• 2508.01191
• Published • 240
Agent Lightning: Train ANY AI Agents with Reinforcement Learning
Paper
• 2508.03680
• Published • 140
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper
• 2508.06471
• Published • 211
Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning
Paper
• 2508.08221
• Published • 50
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm
Bridging Foundation Models and Lifelong Agentic Systems
Paper
• 2508.07407
• Published • 99
Speed Always Wins: A Survey on Efficient Architectures for Large
Language Models
Paper
• 2508.09834
• Published • 53
Provable Benefits of In-Tool Learning for Large Language Models
Paper
• 2508.20755
• Published • 11
AWorld: Orchestrating the Training Recipe for Agentic AI
Paper
• 2508.20404
• Published • 38
Think in Games: Learning to Reason in Games via Reinforcement Learning
with Large Language Models
Paper
• 2508.21365
• Published • 29
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper
• 2509.02547
• Published • 238
LatticeWorld: A Multimodal Large Language Model-Empowered Framework for
Interactive Complex World Generation
Paper
• 2509.05263
• Published • 11
Revolutionizing Reinforcement Learning Framework for Diffusion Large
Language Models
Paper
• 2509.06949
• Published • 57
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn
Tool-Integrated Reasoning
Paper
• 2509.02479
• Published • 84
Lost in Embeddings: Information Loss in Vision-Language Models
Paper
• 2509.11986
• Published • 29
Regression Language Models for Code
Paper
• 2509.26476
• Published • 17
Multi-Agent Tool-Integrated Policy Optimization
Paper
• 2510.04678
• Published • 31
In-the-Flow Agentic System Optimization for Effective Planning and Tool
Use
Paper
• 2510.05592
• Published • 110
UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning
Paper
• 2510.13515
• Published • 12
RAG-Anything: All-in-One RAG Framework
Paper
• 2510.12323
• Published • 73
LLM-guided Hierarchical Retrieval
Paper
• 2510.13217
• Published • 21
Every Attention Matters: An Efficient Hybrid Architecture for
Long-Context Reasoning
Paper
• 2510.19338
• Published • 117
Guided Self-Evolving LLMs with Minimal Human Supervision
Paper
• 2512.02472
• Published • 55
Who's Your Judge? On the Detectability of LLM-Generated Judgments
Paper
• 2509.25154
• Published • 30
Code2World: A GUI World Model via Renderable Code Generation
Paper
• 2602.09856
• Published • 202
DARE: Aligning LLM Agents with the R Statistical Ecosystem via Distribution-Aware Retrieval
Paper
• 2603.04743
• Published • 53