docs: add multi-step training curves to README + BLOG_POST 125b737 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
README: use escaped \~ for single-tilde approximations 4a5fd24 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
docs: refresh BLOG_POST stale 87-task numbers; finish README ~ cleanup 802278c Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
Add training/TRAINING.md — end-to-end reproduction recipe 6206e8a Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
README: replace ~ with ≈ in intro to fix accidental strikethrough dec12b4 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
docs: add Prior Work section + replace Training/thinking-mode A/B with multi-step setup fc2c034 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
docs: surface multi-turn in How-it-works, defer training-process notes 4023371 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
docs: consolidate Results section — single master table + per-category + examples 31ce013 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
docs: add multi-step variant section to README + BLOG_POST e1e3cbe Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
README: add Scorers section (21 scorers grouped by family) 433bfad Don Rishabh committed 12 days ago
docs: drop the misleading 37× compression anecdote (0-accuracy task) 9867aa7 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
README + BLOG: drop multi-step + Llama-self mentions (in-progress runs) c3e14ba Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
remove untested Colab notebook + link training/ folder in README a56bede Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
trackio: post-hoc replay of train_metrics.jsonl into a HF Space dashboard 3724e90 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
demo CSVs: add reward_advantage_vs_verbose + accuracy_delta_vs_verbose 7dafc94 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
README: rewrite for hackathon submission — links-first, plots inline, kill verbose sections a1b7a09 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
v3: multi-turn env, thinking tokens, cross-family Qwen->Llama, multi-step GRPO 67509ac Don Rishabh Claude Opus 4.7 (1M context) committed 13 days ago
Initial commit: Prompt Golf environment for OpenEnv 6850dad Don Rishabh Claude Opus 4.7 (1M context) committed 14 days ago