Refactor grader to use openenv.core.rubrics.WeightedSum + Rubric subclasses f0ca22d InosLihka committed on 11 days ago
Fix max_new_tokens for CoT format + add eval-only HF Jobs script b9c9b8f InosLihka committed on 12 days ago
Algorithm Distillation: grader v2 with belief_accuracy + SFT pipeline ece0bbe InosLihka committed on 12 days ago
docs: iteration journal with hypothesis/result/root-cause/fix per iter e12fc69 InosLihka Claude Opus 4.7 (1M context) committed on 12 days ago