test: add Level 3 adversarial pressure integration tests e83d409 unverified Jayant-Kernel Claude Sonnet 4.6 committed 13 days ago
test: add Level 2 integration tests (test_level2.py) 3380d3c unverified Jayant-Kernel Claude Sonnet 4.6 committed 13 days ago
feat: add 429 retry wrapper to grader semantic check b44d7b0 unverified Jayant-Kernel Claude Sonnet 4.6 committed 13 days ago
Phase 2.5: multi-turn episodes, bug fixes, dataset cleanup 9737348 unverified Jayant-Kernel Claude Sonnet 4.6 committed 14 days ago
Phase 2 complete: Level 1 env runs locally, tests green, 100-question dataset f577d1f unverified Jayant-Kernel Claude Sonnet 4.6 committed 14 days ago
Phase 1 complete: schemas, reward design, project scaffold 139d3d1 unverified Jayant-Kernel Claude Sonnet 4.6 committed 14 days ago