Tighten README: resolve GRPO contradiction, drop duplicate baseline table, remove internal mentor docs 0503beb InosLihka committed 9 days ago
Algorithm Distillation: grader v2 with belief_accuracy + SFT pipeline ece0bbe InosLihka committed 12 days ago
docs: add sim-to-real deployment architecture reference 24adee5 InosLihka Claude Sonnet 4.6 committed 13 days ago
docs: reorganize — 25 files → 4 focused docs 1a25a1a InosLihka Claude Sonnet 4.6 committed 13 days ago
Rebuild as Life Simulator: 5 meters, 3 hidden profiles, GRPO training pipeline cc6473a InosLihka Claude Sonnet 4.6 committed 13 days ago