The Past Is Not Past: Memory-Enhanced Dynamic Reward Shaping Paper • 2604.11297 • Published 3 days ago • 91
QuanBench+: A Unified Multi-Framework Benchmark for LLM-Based Quantum Code Generation Paper • 2604.08570 • Published 22 days ago • 120
OpenVLThinkerV2: A Generalist Multimodal Reasoning Model for Multi-domain Visual Tasks Paper • 2604.08539 • Published 7 days ago • 48
ClawBench: Can AI Agents Complete Everyday Online Tasks? Paper • 2604.08523 • Published 7 days ago • 254
MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping Paper • 2604.08364 • Published 7 days ago • 95
Automating Database-Native Function Code Synthesis with LLMs Paper • 2604.06231 • Published 14 days ago • 17
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published 8 days ago • 310
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver Paper • 2604.08377 • Published 7 days ago • 276
Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning Paper • 2604.04746 • Published 8 days ago • 70
GPT-1900 Collection Pre-1900 LLMs for physics reasoning. RL models are physics-only; use the SFT model for general chat. Tune temperature (0.6-0.7). • 11 items • Updated 13 days ago • 6
MiroEval: Benchmarking Multimodal Deep Research Agents in Process and Outcome Paper • 2603.28407 • Published 16 days ago • 68
ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers Paper • 2603.24414 • Published 21 days ago • 183
LeWorldModel: Stable End-to-End Joint-Embedding Predictive Architecture from Pixels Paper • 2603.19312 • Published Mar 13 • 27
LeWM Collection Official checkpoints and datasets related to LeWM paper. • 9 items • Updated 19 days ago • 23