JacobHicks's Collections: read later
Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing
Paper
• 2509.08721
• Published • 665
A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code
Paper
• 2508.18106
• Published • 350
VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model
Paper
• 2509.09372
• Published • 254
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper
• 2509.02547
• Published • 238
A Survey of Reinforcement Learning for Large Reasoning Models
Paper
• 2509.08827
• Published • 193
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Paper
• 2509.03867
• Published • 213
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent
Paper
• 2508.05748
• Published • 142
ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization
Paper
• 2509.13313
• Published • 80
WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning
Paper
• 2509.13305
• Published • 91
CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning
Paper
• 2509.22647
• Published • 34
Scaling Agents via Continual Pre-training
Paper
• 2509.13310
• Published • 117
PaddleOCR 3.0 Technical Report
Paper
• 2507.05595
• Published • 22
Paper2Agent: Reimagining Research Papers As Interactive and Reliable AI Agents
Paper
• 2509.06917
• Published • 44
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications
Paper
• 2508.16279
• Published • 61
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
Paper
• 2403.13372
• Published • 183