README + BLOG: link all 6 LoRAs in the ablation list to their Hub repos 2947218 helloAK96 Claude Opus 4.7 committed 13 days ago
README + BLOG: explicitly call out HF Jobs as our training infrastructure 1a6f7f1 helloAK96 Claude Opus 4.7 committed 13 days ago
README: add Training History section - 3,200 episodes across 6 GRPO runs adbc390 helloAK96 Claude Opus 4.7 committed 13 days ago
README: signpost the phase-wise judge demo notebook 7ea9030 helloAK96 Claude Opus 4.7 committed 13 days ago
Promote Phase 3A LoRA - Qwen 3B beats heuristic on HARD, 100% rogue catch 90452ca helloAK96 Claude Opus 4.7 committed 13 days ago
Promote Phase 2 LoRA (curriculum + LR=2e-5 + r=32) as the live trained lane f89a0e8 helloAK96 Claude Opus 4.7 committed 13 days ago
README: real before/after numbers from the 540-episode evaluation 8878953 helloAK96 Claude Opus 4.7 committed 14 days ago
README: add submission links, composable-rubric docs, plot embeds, package layout refresh 4ce0ada helloAK96 Claude Opus 4.7 committed 14 days ago