feat: implement dataset loader, environment, and GRPO training pipeline for undertrial bail prediction bf8f1ff Draken1606 committed 14 days ago
Fix 8 compliance gaps: repeat-action dedup+cache, min-steps hard block, criminal history tool (12th action), efficiency removed from training formula, circular import cleaned, yaml formula synced 898bc18 Draken1606 committed 14 days ago
Add 4 missing actions: read_submissions, assess_flight_risk, check_case_factors, apply_proportionality (fixes 4.3d/e/g/h/i) ce6728e Draken1606 committed 15 days ago
Fix all audit gaps: custody neutral, parity-first bias, skip penalty 0.40, statutory process reward, /observation endpoint, reset() timeout, drift determinism 2bc545f Draken1606 committed 15 days ago
Fix 5 audit gaps: conditional bail, action history, efficiency reward, train/val split, env API routing 6218d9a Draken1606 committed 15 days ago