Space: rishabh16196/prompt_golf_env

prompt_golf_env / server

Commit History

tasks_policy: long-context policy-compression tasks

e8ef5c3

Don Rishabh and Claude Opus 4.7 (1M context) committed 13 days ago

v3: multi-turn env, thinking tokens, cross-family Qwen->Llama, multi-step GRPO

67509ac

Don Rishabh and Claude Opus 4.7 (1M context) committed 13 days ago

tasks_tough: add 42 more tough scenarios + baseline profiler

fe54c01

Don Rishabh and Claude Opus 4.7 (1M context) committed 13 days ago

tasks_tough: add 10 domain-classifier tough scenarios (seed batch)

25d9413

Don Rishabh and Claude Opus 4.7 (1M context) committed 13 days ago

Pre-launch fixes: disable Qwen3 thinking, strip think blocks, degenerate-short guard

5abc867

Don Rishabh and Claude Opus 4.7 (1M context) committed 13 days ago

target_model: padding_side='left' (fixes silent corruption on batched decoder-only generation)

e812066

Don Rishabh committed 13 days ago

Fall back from Qwen3.5 -> Qwen3 family (transformers==4.56.2 compat)

ade2f03

Don Rishabh and Claude Opus 4.7 (1M context) committed 13 days ago

v2 stack: Qwen3.5-2B agent/target, Qwen3.5-9B judge, hard tasks, additive reward

3889513

Don Rishabh and Claude Opus 4.7 (1M context) committed 13 days ago

Initial commit: Prompt Golf environment for OpenEnv

6850dad

Don Rishabh and Claude Opus 4.7 (1M context) committed 14 days ago