KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance Paper • 2604.12627 • Published 3 days ago • 92
The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook Paper • 2604.02029 • Published 15 days ago • 140
EpochX: Building the Infrastructure for an Emergent Agent Civilization Paper • 2603.27304 • Published 19 days ago • 47
DeepImageSearch: Benchmarking Multimodal Agents for Context-Aware Image Retrieval in Visual Histories Paper • 2602.10809 • Published Feb 11 • 59
QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining Paper • 2602.07085 • Published Feb 6 • 190
CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding Paper • 2602.01785 • Published Feb 2 • 96
Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning Paper • 2601.06943 • Published Jan 11 • 216
DramaBench: A Six-Dimensional Evaluation Framework for Drama Script Continuation Paper • 2512.19012 • Published Dec 22, 2025 • 17
Finch: Benchmarking Finance & Accounting across Spreadsheet-Centric Enterprise Workflows Paper • 2512.13168 • Published Dec 15, 2025 • 53
The Consistency Critic: Correcting Inconsistencies in Generated Images via Reference-Guided Attentive Alignment Paper • 2511.20614 • Published Nov 25, 2025 • 38
PICABench: How Far Are We from Physically Realistic Image Editing? Paper • 2510.17681 • Published Oct 20, 2025 • 65
StableToken: A Noise-Robust Semantic Speech Tokenizer for Resilient SpeechLLMs Paper • 2509.22220 • Published Sep 26, 2025 • 66
SWE-QA: Can Language Models Answer Repository-level Code Questions? Paper • 2509.14635 • Published Sep 18, 2025 • 35
RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation Paper • 2509.16198 • Published Sep 19, 2025 • 129
MachineLearningLM: Continued Pretraining Language Models on Millions of Synthetic Tabular Prediction Tasks Scales In-Context ML Paper • 2509.06806 • Published Sep 8, 2025 • 63
Table-R1: Inference-Time Scaling for Table Reasoning Paper • 2505.23621 • Published May 29, 2025 • 93
Sherlock: Self-Correcting Reasoning in Vision-Language Models Paper • 2505.22651 • Published May 28, 2025 • 48
OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data Paper • 2505.18445 • Published May 24, 2025 • 63
PixelHacker: Image Inpainting with Structural and Semantic Consistency Paper • 2504.20438 • Published Apr 29, 2025 • 44