prompt_golf_env / server

Commit History

tasks_policy: long-context policy-compression tasks
e8ef5c3

Don Rishabh, Claude Opus 4.7 (1M context) committed on

v3: multi-turn env, thinking tokens, cross-family Qwen->Llama, multi-step GRPO
67509ac

Don Rishabh, Claude Opus 4.7 (1M context) committed on

tasks_tough: add 42 more tough scenarios + baseline profiler
fe54c01

Don Rishabh, Claude Opus 4.7 (1M context) committed on

tasks_tough: add 10 domain-classifier tough scenarios (seed batch)
25d9413

Don Rishabh, Claude Opus 4.7 (1M context) committed on

Pre-launch fixes: disable Qwen3 thinking, strip think blocks, degenerate-short guard
5abc867

Don Rishabh, Claude Opus 4.7 (1M context) committed on

target_model: padding_side='left' (fixes silent corruption on batched decoder-only generation)
e812066

Don Rishabh committed on

Fall back from Qwen3.5 -> Qwen3 family (transformers==4.56.2 compat)
ade2f03

Don Rishabh, Claude Opus 4.7 (1M context) committed on

v2 stack: Qwen3.5-2B agent/target, Qwen3.5-9B judge, hard tasks, additive reward
3889513

Don Rishabh, Claude Opus 4.7 (1M context) committed on

Initial commit: Prompt Golf environment for OpenEnv
6850dad

Don Rishabh, Claude Opus 4.7 (1M context) committed on