SecureAgentBench: Benchmarking Secure Code Generation under Realistic Vulnerability Scenarios Paper • 2509.22097 • Published Sep 26, 2025
Reasoning Runtime Behavior of a Program with LLM: How Far Are We? Paper • 2403.16437 • Published Mar 25, 2024