Sprint 10A: shadow_eval.py — compare candidate vs baseline safely
bc30484 (verified) · Rohan03 committed 13 days ago