intent-translation-training / evaluate_v2.py

Commit History

Fix: flush stdout for nohup, log every sample, add timestamps
f34fb3a
verified

nraptisss committed on

Add evaluate_v2.py — standard-aware KPI checking (fixes 92% false negatives in reliability metric)
f1d77cf
verified

nraptisss committed on