Relocate training notebooks, add BLOG and Google Colab links (SFT + GRPO HF Job), dashboard updates, and eval artifacts 00a2188 sh4shv4t committed 12 days ago
Add GRPO HF job reward/loss curves, dashboard wiring, plot script, and fix grpo_train log_history unwrap bf9f882 sh4shv4t committed 12 days ago
Add OpenEnv client, compat layer, manifest, scripts, GRPO plot hook, and README 81b4b70 sh4shv4t committed 12 days ago
fix(sft): TRL 1.0+ uses max_length in SFTConfig, not max_seq_length 63e14b4 sh4shv4t committed 13 days ago
Add pre-training audit scripts, OpenEnv manifest, and tune Parlay training/env (GRPO 1.5B default, min-reward filters, weighted data gen, hiring ZOPA+drift, veteran/opponent prompts, Docker/docs) df724f2 sh4shv4t committed 13 days ago
fix: upgrade gemini model string to 2.5-flash-lite + add tom diagnostic script 3f61551 sh4shv4t committed 13 days ago
feat: streamline parlay for demo mode and add spectator negotiation mechanics 2568517 sh4shv4t committed 13 days ago
feat: split Gemini 2.5 Flash (demo) and Flash-Lite (data), SFT threshold 0.3, favicon + check_gemini 9d82eed sh4shv4t committed 14 days ago
build: add Windows PowerShell setup scripts and fix venv paths for Windows development 7183e08 sh4shv4t committed 16 days ago