Spaces: rishabh16196/prompt_golf_env

prompt_golf_env / README.md

Commit History

docs: add multi-step training curves to README + BLOG_POST

125b737

Don Rishabh, Claude Opus 4.7 (1M context) committed 12 days ago

README: use escaped \~ for single-tilde approximations

4a5fd24

Don Rishabh, Claude Opus 4.7 (1M context) committed 12 days ago

docs: refresh BLOG_POST stale 87-task numbers; finish README ~ cleanup

802278c

Don Rishabh, Claude Opus 4.7 (1M context) committed 12 days ago

Add training/TRAINING.md — end-to-end reproduction recipe

6206e8a

Don Rishabh, Claude Opus 4.7 (1M context) committed 12 days ago

README: replace ~ with ≈ in intro to fix accidental strikethrough

dec12b4

Don Rishabh, Claude Opus 4.7 (1M context) committed 12 days ago

README: stronger intro

8a2a589

Don Rishabh, Claude Opus 4.7 (1M context) committed 12 days ago

docs: add Prior Work section + replace Training/thinking-mode A/B with multi-step setup

fc2c034

Don Rishabh, Claude Opus 4.7 (1M context) committed 12 days ago

docs: surface multi-turn in How-it-works, defer training-process notes

4023371

Don Rishabh, Claude Opus 4.7 (1M context) committed 12 days ago

docs: consolidate Results section — single master table + per-category + examples

31ce013

Don Rishabh, Claude Opus 4.7 (1M context) committed 12 days ago

docs: add multi-step variant section to README + BLOG_POST

e1e3cbe

Don Rishabh, Claude Opus 4.7 (1M context) committed 12 days ago

README: add Scorers section (21 scorers grouped by family)

433bfad

Don Rishabh committed 12 days ago

docs: drop the misleading 37× compression anecdote (0-accuracy task)

9867aa7

Don Rishabh, Claude Opus 4.7 (1M context) committed 12 days ago

README + BLOG: drop multi-step + Llama-self mentions (in-progress runs)

c3e14ba

Don Rishabh, Claude Opus 4.7 (1M context) committed 12 days ago

remove untested Colab notebook + link training/ folder in README

a56bede

Don Rishabh, Claude Opus 4.7 (1M context) committed 12 days ago

trackio: post-hoc replay of train_metrics.jsonl into a HF Space dashboard

3724e90

Don Rishabh, Claude Opus 4.7 (1M context) committed 12 days ago

demo CSVs: add reward_advantage_vs_verbose + accuracy_delta_vs_verbose

7dafc94

Don Rishabh, Claude Opus 4.7 (1M context) committed 12 days ago

README: rewrite for hackathon submission — links-first, plots inline, kill verbose sections

a1b7a09

Don Rishabh, Claude Opus 4.7 (1M context) committed 12 days ago

v3: multi-turn env, thinking tokens, cross-family Qwen->Llama, multi-step GRPO

67509ac

Don Rishabh, Claude Opus 4.7 (1M context) committed 13 days ago

Initial commit: Prompt Golf environment for OpenEnv

6850dad

Don Rishabh, Claude Opus 4.7 (1M context) committed 14 days ago
