InCoder-32B-Thinking: Industrial Code World Model for Thinking Paper • 2604.03144 • Published 12 days ago • 229
T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Reasoning Paper • 2603.03790 • Published Mar 4 • 121
QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining Paper • 2602.07085 • Published Feb 6 • 190
Spider-Sense: Intrinsic Risk Sensing for Efficient Agent Defense with Hierarchical Adaptive Screening Paper • 2602.05386 • Published Feb 5 • 69
CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding Paper • 2602.01785 • Published Feb 2 • 96
Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives Paper • 2601.20833 • Published Jan 28 • 183
DramaBench: A Six-Dimensional Evaluation Framework for Drama Script Continuation Paper • 2512.19012 • Published Dec 22, 2025 • 17
Finch: Benchmarking Finance & Accounting across Spreadsheet-Centric Enterprise Workflows Paper • 2512.13168 • Published Dec 15, 2025 • 53
Sherlock: Self-Correcting Reasoning in Vision-Language Models Paper • 2505.22651 • Published May 28, 2025 • 48
SEAP: Training-free Sparse Expert Activation Pruning Unlock the Brainpower of Large Language Models Paper • 2503.07605 • Published Mar 10, 2025 • 66
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback Paper • 2406.00888 • Published Jun 2, 2024 • 33
Learning Temporally Consistent Video Depth from Video Diffusion Priors Paper • 2406.01493 • Published Jun 3, 2024 • 23