pratinavseth committed on
Commit c29cfbd · 1 Parent(s): 51b5ca2

docs: clarify HF Gemma runs and OpenEnv traces


Documents live versus cached opponent modes, records verbose OpenEnv run artifacts, and makes Gemma HF router defaults explicit for reproducible experiments.

README.md CHANGED
@@ -238,20 +238,32 @@ def r_coherence_stateless(prompt: str, completion: str) -> float:
 
 ### YAML config (recommended)
 
-Use one YAML config to control **both**:
+Use YAML configs to control **both**:
 - **server defaults** (opponent mode/model/cache, eval pack id), and
 - **runner defaults** (`inference.py` / `eval.py`: env URL, max overs, captain model/API).
 
-Example: `cricket_captain/configs/default.yaml`.
+Use `configs/default.yaml` when you want both the captain and live opponent to call HF router models. Use `configs/cached_eval.yaml` when you want a live captain model against replayed opponent decisions for reproducible comparison.
 
 ```bash
-# Start server with config
+# Start server with the live HF opponent config
 cd cricket_captain
 PYTHONPATH=. python server/app.py --port 8001 --config configs/default.yaml
 
-# Run a short baseline using config defaults
+# Run a short HF Gemma captain baseline using config defaults
 export CRICKET_CAPTAIN_ENV_URL="ws://localhost:8001"
-python inference.py --config configs/default.yaml --episodes 3
+export HF_TOKEN="hf_..."
+python inference.py --config configs/default.yaml --episodes 1
+```
+
+The default config uses the HF router-compatible model `google/gemma-4-26B-A4B-it` for captain-side inference and live opponent defaults. In `llm_live` mode, the opponent actually calls that model during the run. In `llm_cached` mode, the opponent does **not** call `model`; it replays `cache_path`.
+
+For fair, reproducible eval:
+
+```bash
+PYTHONPATH=. python server/app.py --port 8001 --config configs/cached_eval.yaml
+export CRICKET_CAPTAIN_ENV_URL="ws://localhost:8001"
+export HF_TOKEN="hf_..."
+python inference.py --config configs/cached_eval.yaml --episodes 1
 ```
 
 ```bash
@@ -311,6 +323,30 @@ For fast iteration, start with short 5-over runs before full 20-over evaluation:
 
 See [`docs/experiment_workflow.md`](docs/experiment_workflow.md) for exact commands and rationale.
 
+### Latest 5-Over Smoke Checks
+
+OpenEnv end-to-end runs are saved under [`illustrations/`](illustrations/).
+
+```text
+Random captain + cached LLM opponent:
+Score: 13.5 avg across 2 episodes
+Mean reward: 0.984
+Mean coherence: 0.555
+Parse errors: 0.0%
+
+HF Gemma 4 captain + cached LLM opponent:
+Model: google/gemma-4-26B-A4B-it via https://router.huggingface.co/v1
+Trace: 40 OpenEnv turns with reset/step/action logs
+Score: 7/0 after 2.2 overs
+Reward sum: 0.168
+Coherence: 0.657
+Adaptation: 0.502
+Opponent awareness: 0.750
+Parse errors: 0.0%
+```
+
+These are smoke-test numbers, not final benchmark numbers. They show that the OpenEnv websocket loop, HF model inference, cached opponent replay, tool parsing, observation updates, opponent plans, and reward metrics are all working.
+
 ### Baseline Results (Random Agent)
 
 ```
configs/cached_eval.yaml ADDED
@@ -0,0 +1,18 @@
+env:
+  # Used by server + runners for reproducible comparison runs.
+  eval_pack_id: adaptive_t20_v1
+  max_overs: 5
+  env_url: ws://localhost:8000
+
+opponent:
+  # llm_cached does not call `model` live. It replays pre-generated decisions
+  # from cache_path so every captain model faces the same opponent behavior.
+  mode: llm_cached
+  cache_path: data/opponent_cache/adaptive_t20_v1_official_gemma2b.jsonl
+
+captain:
+  # Captain still calls HF router live in this config.
+  model: google/gemma-4-26B-A4B-it
+  api_base: https://router.huggingface.co/v1
+  api_key_env: HF_TOKEN
+
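The `api_key_env` indirection above means the YAML never stores a secret; a runner reads the key from the named environment variable at startup. A minimal sketch of that resolution step (the helper name and error handling are assumptions for illustration, not the repo's actual code):

```python
import os

def resolve_api_key(captain_cfg: dict) -> str:
    """Read the API key from the environment variable named by `api_key_env`."""
    env_name = captain_cfg.get("api_key_env", "HF_TOKEN")
    key = os.environ.get(env_name)
    if not key:
        raise RuntimeError(f"Set {env_name} before running the captain model.")
    return key

# Mirrors the `captain:` block of configs/cached_eval.yaml.
captain = {
    "model": "google/gemma-4-26B-A4B-it",
    "api_base": "https://router.huggingface.co/v1",
    "api_key_env": "HF_TOKEN",
}
os.environ.setdefault("HF_TOKEN", "hf_dummy")  # illustration only
key = resolve_api_key(captain)
```

Keeping only the env var name in YAML lets the same config be committed and shared without leaking tokens.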
configs/default.yaml CHANGED
@@ -6,16 +6,17 @@ env:
 
 opponent:
   # heuristic | llm_live | llm_cached
-  mode: llm_cached
-  model: google/gemma-2-2b-it
+  # llm_live calls the model below during the run.
+  # For reproducible cached evaluation, use configs/cached_eval.yaml instead.
+  mode: llm_live
+  model: google/gemma-4-26B-A4B-it
   api_base: https://router.huggingface.co/v1
   api_key_env: HF_TOKEN
-  cache_path: data/opponent_cache/adaptive_t20_v1_official_gemma2b.jsonl
 
 captain:
   # For inference/eval runner when using an API model (OpenAI-compatible).
   # You can still pass --model random for baseline runs.
-  model: google/gemma-2-2b-it
+  model: google/gemma-4-26B-A4B-it
   api_base: https://router.huggingface.co/v1
   api_key_env: HF_TOKEN
 
docs/benchmark_explainer.md CHANGED
@@ -226,6 +226,14 @@ Calls an OpenAI-compatible LLM with a fixed prompt. Useful for:
 - realistic opponent behavior,
 - self-play-style experiments.
 
+The current default live opponent/captain model is:
+
+```text
+google/gemma-4-26B-A4B-it via https://router.huggingface.co/v1
+```
+
+In this mode, the opponent actually calls the configured model during the run.
+
 ### `llm_cached`
 
 Reads pre-recorded opponent decisions from JSONL. Useful for:
@@ -234,6 +242,8 @@ Reads pre-recorded opponent decisions from JSONL. Useful for:
 - reproducibility,
 - preventing eval randomness.
 
+In this mode, the opponent does **not** call the configured model live. It replays the JSONL cache so every compared captain faces the same opponent decisions.
+
 The key idea:
 
 > Teams can change their agent however they want, but the evaluation opponent should be frozen.
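The frozen-opponent idea is straightforward to picture in code. A minimal sketch of an `llm_cached`-style replay policy (the class name and JSONL schema are assumptions for illustration; the repo's real implementation lives in `server/opponent_policy.py`):

```python
import json
import os
import tempfile

class CachedOpponentPolicy:
    """Replays pre-recorded opponent decisions from a JSONL file, in order."""

    def __init__(self, cache_path: str):
        with open(cache_path) as f:
            self._decisions = [json.loads(line) for line in f if line.strip()]
        self._i = 0

    def decide(self, observation: dict) -> dict:
        # The observation is ignored: the cache, not a live model, drives play,
        # so every compared captain faces the same opponent decisions.
        decision = self._decisions[self._i % len(self._decisions)]
        self._i += 1
        return decision

# Demo with a tiny two-decision cache (fields echo the `opponent_plan`
# entries visible in the run traces).
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as f:
    f.write('{"shot_intent": "rotate", "aggression": 0.7}\n')
    f.write('{"shot_intent": "six", "aggression": 0.7}\n')
    path = f.name

policy = CachedOpponentPolicy(path)
first, second, third = policy.decide({}), policy.decide({}), policy.decide({})
os.unlink(path)
```

Because replay is deterministic, any difference between two captains' results comes from the captains themselves, not from opponent randomness.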
docs/experiment_workflow.md CHANGED
@@ -26,7 +26,7 @@ Do not start with full 20-over training unless the 5-over loop is stable.
 
 ## 2. Current Opponent Modes
 
-Opponent behavior is controlled by `CRICKET_OPPONENT_MODE` or `--opponent-mode`.
+Opponent behavior is controlled by YAML (`configs/default.yaml` / `configs/cached_eval.yaml`), `CRICKET_OPPONENT_MODE`, or `--opponent-mode`.
 
 The code supports three modes in `server/opponent_policy.py`.
 
@@ -63,14 +63,15 @@ Cons:
 
 ```bash
 export CRICKET_OPPONENT_MODE=llm_live
-export CRICKET_OPPONENT_MODEL=gpt-4o-mini
-export CRICKET_OPPONENT_API_KEY=...
+export CRICKET_OPPONENT_MODEL=google/gemma-4-26B-A4B-it
+export CRICKET_OPPONENT_API_BASE=https://router.huggingface.co/v1
+export HF_TOKEN=...
 ```
 
-The default live opponent model in code is:
+The default live opponent model in code/config is:
 
 ```text
-gpt-4o-mini
+google/gemma-4-26B-A4B-it
 ```
 
 This mode calls an OpenAI-compatible API from `LLMOpponentPolicy`.
@@ -98,10 +99,10 @@ Cons:
 
 ```bash
 export CRICKET_OPPONENT_MODE=llm_cached
-export CRICKET_OPPONENT_CACHE=data/opponent_cache/adaptive_t20_v1.jsonl
+export CRICKET_OPPONENT_CACHE=data/opponent_cache/adaptive_t20_v1_official_gemma2b.jsonl
 ```
 
-This mode replays pre-recorded opponent decisions.
+This mode replays pre-recorded opponent decisions. It does **not** call `CRICKET_OPPONENT_MODEL` live during the run; `cache_path` is the source of opponent behavior.
 
 Use it for:
 
@@ -119,28 +120,31 @@ This gives the benefit of an LLM opponent while ensuring every model faces the s
 
 ## 3. What Model Is The Opposite Team?
 
-Currently:
+Currently, the default live opponent is:
 
 ```text
-default opponent = heuristic policy
+google/gemma-4-26B-A4B-it via https://router.huggingface.co/v1
 ```
 
-If `llm_live` is enabled:
+Important distinction:
 
 ```text
-default LLM opponent = gpt-4o-mini
+llm_live   -> calls the configured model during the run
+llm_cached -> ignores live model calls and replays cache_path
+heuristic  -> uses local rule-based cricket policy
 ```
 
 This can be changed with:
 
 ```bash
+export CRICKET_OPPONENT_MODE=llm_live
export CRICKET_OPPONENT_MODEL=<model-name>
 ```
 
 For example:
 
 ```bash
-export CRICKET_OPPONENT_MODEL=gpt-4o-mini
+export CRICKET_OPPONENT_MODEL=google/gemma-4-26B-A4B-it
 ```
 
 or with another OpenAI-compatible server:
@@ -312,16 +316,33 @@ Use for:
 - comparing before/after quickly,
 - cheap experiments.
 
-At the moment, the environment supports `max_overs`, but `inference.py` and `eval.py` do not expose a `--max-overs` flag yet. That should be added next.
+Both `inference.py` and `eval.py` support `--max-overs`, and the YAML configs set `max_overs: 5` by default for quick iteration.
 
-Target command after adding the flag:
+Random captain sanity check:
 
 ```bash
 PYTHONPATH=. python inference.py \
   --model random \
   --episodes 5 \
   --max-overs 5 \
-  --env-url "$CRICKET_CAPTAIN_ENV_URL"
+  --env-url "$CRICKET_CAPTAIN_ENV_URL" \
+  --eval-pack-id adaptive_t20_v1 \
+  --opponent-mode llm_cached
+```
+
+HF Gemma captain with live HF inference:
+
+```bash
+export HF_TOKEN="hf_..."
+PYTHONPATH=. python inference.py \
+  --model google/gemma-4-26B-A4B-it \
+  --api-base https://router.huggingface.co/v1 \
+  --api-key "$HF_TOKEN" \
+  --episodes 1 \
+  --max-overs 5 \
+  --env-url "$CRICKET_CAPTAIN_ENV_URL" \
+  --eval-pack-id adaptive_t20_v1 \
+  --opponent-mode llm_cached
 ```
 
 ### 20-Over Evaluation
@@ -402,26 +423,34 @@ Minimum final numbers to report:
 - score/wickets,
 - chase or defense success rate.
 
-## 14. Immediate Next Engineering Improvement
+## 14. Latest Smoke-Test Evidence
 
-Add these CLI flags:
+The current reproducible run artifacts live under `illustrations/`.
 
 ```text
-inference.py --max-overs 5
-eval.py --max-overs 5
-inference.py --use-eval-pack-start
-eval.py --use-eval-pack-start
+Random captain + cached LLM opponent:
+mean score: 13.5 across 2 episodes
+mean reward: 0.984
+mean coherence: 0.555
+parse errors: 0.0%
+
+HF Gemma 4 captain + cached LLM opponent:
+model: google/gemma-4-26B-A4B-it
+trace: 40 OpenEnv turns with reset/step/action logs
+score: 7/0 after 2.2 overs
+reward sum: 0.168
+coherence: 0.657
+adaptation: 0.502
+opponent awareness: 0.750
+parse errors: 0.0%
 ```
 
-This will make quick 5-over experiments clean and repeatable.
-
-After that, run:
-
-```text
-random 5-over baseline
-base LLM 5-over baseline
-short GRPO
-trained 5-over eval
-```
-
-Then move to full 20-over evaluation.
+These are smoke checks, not final leaderboard numbers. They demonstrate that OpenEnv websocket interaction, HF router inference, tool-call parsing, cached opponent replay, observation updates, opponent plans, and reward metrics are all functioning.
+
+## 15. Immediate Next Engineering Improvement
+
+Next useful work:
+
+- Generate a fresh cached-opponent file using `google/gemma-4-26B-A4B-it` in `llm_live` mode.
+- Run a 5-over base-model comparison across random, Gemma 4, and one trained checkpoint.
+- Move the strongest setup to 20-over evaluation.
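The "tool-call parsing" step checked by the smoke tests can be sketched briefly. The run traces show the captain model emitting a single-key JSON object such as `{"call_toss": {"call": "heads", "decision": "bat"}}`, which becomes a `{'tool': ..., 'arguments': ...}` action. A minimal parser for that shape (the function name is hypothetical, and the repo's real parser may accept more formats):

```python
import json

def parse_tool_call(raw: str) -> dict:
    """Turn a single-key JSON tool call into a {'tool', 'arguments'} action."""
    try:
        obj = json.loads(raw)
        tool, arguments = next(iter(obj.items()))
        return {"metadata": {}, "tool": tool, "arguments": arguments,
                "parse_error": False}
    except (json.JSONDecodeError, AttributeError, StopIteration):
        # Count malformed model output as a parse error instead of crashing.
        return {"metadata": {}, "tool": None, "arguments": {},
                "parse_error": True}

action = parse_tool_call('{"call_toss": {"call": "heads", "decision": "bat"}}')
bad = parse_tool_call("not json at all")
```

Tracking the error branch is what makes the "parse errors: 0.0%" metric in the smoke-test output meaningful.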
illustrations/README.md ADDED
@@ -0,0 +1,12 @@
+## Illustrations (reproducible experiments)
+
+Each subfolder here is a self-contained experiment snapshot:
+- **how to run** (exact command)
+- **what changed** (config/opponent/eval pack)
+- **results** (captured stdout + notes)
+
+### Index
+
+- `exp_2026-04-25_5over_random_llm_cached/`: sanity check run (5-over), random captain, cached LLM opponent.
+- `exp_2026-04-25_5over_gemma4_hf/`: successful HF router run with Gemma 4 captain, cached LLM opponent, including a verbose OpenEnv reset/step trace.
+
illustrations/exp_2026-04-25_5over_gemma4_hf/README.md ADDED
@@ -0,0 +1,59 @@
+## Experiment: 5-over HF Gemma 4 captain, cached opponent
+
+### Goal
+
+Run a real Hugging Face-hosted Gemma captain model through the OpenAI-compatible HF router, instead of the random baseline.
+
+### Model
+
+- **Captain model**: `google/gemma-4-26B-A4B-it`
+- **API base**: `https://router.huggingface.co/v1`
+- **Opponent mode**: `llm_cached`
+- **Eval pack**: `adaptive_t20_v1`
+- **Max overs**: `5`
+
+Note: HF router listed this Gemma 4 model as available. The earlier `google/gemma-2-2b-it` id was rejected by the router/provider configuration.
+
+### Required token
+
+HF router inference requires a token:
+
+```bash
+export HF_TOKEN="hf_..."
+```
+
+### Run
+
+From `cricket_captain/`:
+
+```bash
+export CRICKET_CAPTAIN_ENV_URL="ws://localhost:8001"
+export HF_TOKEN="hf_..."
+
+python inference.py \
+  --model google/gemma-4-26B-A4B-it \
+  --api-base https://router.huggingface.co/v1 \
+  --api-key "$HF_TOKEN" \
+  --episodes 1 \
+  --task stage2_full \
+  --max-overs 5 \
+  --env-url "$CRICKET_CAPTAIN_ENV_URL" \
+  --eval-pack-id adaptive_t20_v1 \
+  --opponent-mode llm_cached
+```
+
+### Results
+
+See `run_output.txt`.
+
+The file is intentionally verbose. It logs:
+
+- OpenEnv websocket connection and `reset(options=...)`
+- observation fields returned by the server
+- raw HF Gemma model responses
+- parsed `CricketAction` objects
+- every OpenEnv `step(action)` call
+- reward after each step
+- updated match context, opponent plan, and last-ball result
+- final state metrics
+
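Under the hood, `--api-base https://router.huggingface.co/v1` means the captain speaks the OpenAI-compatible chat completions protocol. A sketch of the request payload (the helper name, system prompt, and `max_tokens` value are illustrative assumptions, not the repo's exact code):

```python
def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-compatible /chat/completions payload for the HF router."""
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "You are a cricket captain. Reply with one JSON tool call."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": max_tokens,
    }

payload = build_chat_request(
    "google/gemma-4-26B-A4B-it",
    "[CricketCaptain] TOSS | FIRST INNINGS ...",
)

# Sending it with any OpenAI-compatible client then looks like:
#   from openai import OpenAI
#   client = OpenAI(base_url="https://router.huggingface.co/v1",
#                   api_key=os.environ["HF_TOKEN"])
#   response = client.chat.completions.create(**payload)
```

Because the protocol is OpenAI-compatible, swapping the captain to another provider only means changing `base_url`, the model id, and the token env var.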
illustrations/exp_2026-04-25_5over_gemma4_hf/run_output.txt ADDED
@@ -0,0 +1,385 @@
+ # 5-over OpenEnv trace: HF Gemma captain vs cached LLM opponent
+ timestamp_utc: 2026-04-25T08:47:40.021678+00:00
+ env_url: ws://localhost:8001
+ captain_model: google/gemma-4-26B-A4B-it
+ api_base: https://router.huggingface.co/v1
+ reset_options: {'task': 'stage2_full', 'random_start': False, 'eval_pack_id': 'adaptive_t20_v1', 'opponent_mode': 'llm_cached', 'max_overs': 5}
+ token: [REDACTED]
+
+ [openenv] connecting websocket client...
+ [openenv] connected
+ [openenv] reset(options=...)
+ [openenv] reset -> done=False reward=0.0
+ [observation] game_state=toss strategic_phase=toss eval_pack_id=adaptive_t20_v1
+ [observation] available_tools=['call_toss']
+ [observation] prompt_preview="[CricketCaptain] TOSS | FIRST INNINGS\nOver 0.0 | Score: 0/0\nPhase: POWERPLAY | Strategic turn: TOSS\nBowler: PACE | Field: Balanced\nCurrent batter: Opener (balanced)\nCurrent bowler: Strike Pacer (pace)\n\nTool: call_toss(call: 'heads'|'tails', decision: 'bat'|'bowl')\n\nLast ball: Match setup. Use 'call_toss' to begin."
+
+ --- turn 0 ---
+ [before] game_state=toss strategic_phase=toss last_ball="Match setup. Use 'call_toss' to begin."
+ [model.raw] {"call_toss": {"call": "heads", "decision": "bat"}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'call_toss', 'arguments': {'call': 'heads', 'decision': 'bat'}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.000
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 0, 'ball': 0, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'pace', 'field_setting': 'Balanced', 'strategic_phase': 'pre_over', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Toss result: tails. Opponent won and chose to bat. You are Bowling.'
+
+ --- turn 1 ---
+ [before] game_state=bowling strategic_phase=pre_over last_ball='Toss result: tails. Opponent won and chose to bat. You are Bowling.'
+ [model.raw] {"set_field_setting": {"setting": "Aggressive"}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'set_field_setting', 'arguments': {'setting': 'Aggressive'}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.000
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 0, 'ball': 0, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'pace', 'field_setting': 'Aggressive', 'strategic_phase': 'pre_over', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Field set to Aggressive.'
+
+ --- turn 2 ---
+ [before] game_state=bowling strategic_phase=pre_over last_ball='Field set to Aggressive.'
+ [model.raw] {"set_bowling_strategy": {"bowler_type": "Pace", "line": "Outside off stump", "length": "Full/Good length", "delivery_type": "Fast/Swing", "rationale": "Exploit the powerplay with aggressive line and length to induce an edge or drive against the aggressive field."}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'set_bowling_strategy', 'arguments': {'bowler_type': 'Pace', 'line': 'Outside off stump', 'length': 'Full/Good length', 'delivery_type': 'Fast/Swing', 'rationale': 'Exploit the powerplay with aggressive line and length to induce an edge or drive against the aggressive field.'}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.000
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 0, 'ball': 0, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'pre_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Bowling strategy updated.'
+
+ --- turn 3 ---
+ [before] game_state=bowling strategic_phase=pre_ball last_ball='Bowling strategy updated.'
+ [model.raw] {"plan_delivery": {"bowler_type": "Pace", "line": "Outside off stump", "length": "Full", "delivery_type": "Outswinger", "rationale": "Aiming for the corridor of uncertainty to test the opener's technique early in the innings."}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'plan_delivery', 'arguments': {'bowler_type': 'Pace', 'line': 'Outside off stump', 'length': 'Full', 'delivery_type': 'Outswinger', 'rationale': "Aiming for the corridor of uncertainty to test the opener's technique early in the innings."}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.000
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 0, 'ball': 0, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'ball_resolution', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Delivery plan set: Outswinger.'
+
+ --- turn 4 ---
+ [before] game_state=bowling strategic_phase=ball_resolution last_ball='Delivery plan set: Outswinger.'
+ [model.raw] {"bowl_delivery": {}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'bowl_delivery', 'arguments': {}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.007
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 0, 'ball': 1, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'post_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'rotate', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Worked off the hips — dot ball.'
+
+ --- turn 5 ---
+ [before] game_state=bowling strategic_phase=post_ball last_ball='Worked off the hips — dot ball.'
+ [model.raw] {"reflect_after_ball": {"reflection": "Good start with a dot ball. The outswinger tested the batter's technique. I will maintain the same line and length to build pressure."}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'reflect_after_ball', 'arguments': {'reflection': "Good start with a dot ball. The outswinger tested the batter's technique. I will maintain the same line and length to build pressure."}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.005
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 0, 'ball': 1, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'pre_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'rotate', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Reflection recorded.'
+
+ --- turn 6 ---
+ [before] game_state=bowling strategic_phase=pre_ball last_ball='Reflection recorded.'
+ [model.raw] {"plan_delivery": {"bowler_type": "Pace", "line": "Outside off stump", "length": "Full", "delivery_type": "Outswinger", "rationale": "Continuing the pressure by targeting the same corridor of uncertainty to exploit the aggressive field."}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'plan_delivery', 'arguments': {'bowler_type': 'Pace', 'line': 'Outside off stump', 'length': 'Full', 'delivery_type': 'Outswinger', 'rationale': 'Continuing the pressure by targeting the same corridor of uncertainty to exploit the aggressive field.'}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.000
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 0, 'ball': 1, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'ball_resolution', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'rotate', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Delivery plan set: Outswinger.'
+
+ --- turn 7 ---
+ [before] game_state=bowling strategic_phase=ball_resolution last_ball='Delivery plan set: Outswinger.'
+ [model.raw] {"bowl_delivery": {}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'bowl_delivery', 'arguments': {}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.007
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 0, 'ball': 2, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'post_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'six', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Launched over long-on — dot ball!'
+
+ --- turn 8 ---
+ [before] game_state=bowling strategic_phase=post_ball last_ball='Launched over long-on — dot ball!'
+ [model.raw] {"reflect_after_ball": {"reflection": "The batter attempted a big shot but failed to connect, resulting in a dot. The pressure is mounting. I'll stick to the plan of targeting the off-stump to keep them searching for runs and prevent them from settling."}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'reflect_after_ball', 'arguments': {'reflection': "The batter attempted a big shot but failed to connect, resulting in a dot. The pressure is mounting. I'll stick to the plan of targeting the off-stump to keep them searching for runs and prevent them from settling."}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.005
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 0, 'ball': 2, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'pre_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'six', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Reflection recorded.'
+
+ --- turn 9 ---
+ [before] game_state=bowling strategic_phase=pre_ball last_ball='Reflection recorded.'
+ [model.raw] {"bowl_delivery": {}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'bowl_delivery', 'arguments': {}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.007
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 0, 'ball': 3, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'post_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'rotate', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Worked off the hips — dot ball.'
+
+ --- turn 10 ---
+ [before] game_state=bowling strategic_phase=post_ball last_ball='Worked off the hips — dot ball.'
+ [model.raw] {"reflect_after_ball": {"reflection": "Two dot balls in a row. The batter is struggling to find timing against the outswinger. I will maintain the same line and length to keep the pressure high and force a mistake."}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'reflect_after_ball', 'arguments': {'reflection': 'Two dot balls in a row. The batter is struggling to find timing against the outswinger. I will maintain the same line and length to keep the pressure high and force a mistake.'}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.005
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 0, 'ball': 3, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'pre_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'rotate', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Reflection recorded.'
+
+ --- turn 11 ---
+ [before] game_state=bowling strategic_phase=pre_ball last_ball='Reflection recorded.'
+ [model.raw] {"bowl_delivery": {}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'bowl_delivery', 'arguments': {}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.007
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 0, 'ball': 4, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'post_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'boundary', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Driven through the covers — dot ball!'
+
+ --- turn 12 ---
+ [before] game_state=bowling strategic_phase=post_ball last_ball='Driven through the covers — dot ball!'
+ [model.raw] {"reflect_after_ball": {"reflection": "Three dot balls in the opening over. The batter is trying to play through the covers, but the outswinger is keeping them from finding the boundary. I'll maintain the full length and outside off-stump line to continue testing their technique and keep the pressure on."}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'reflect_after_ball', 'arguments': {'reflection': "Three dot balls in the opening over. The batter is trying to play through the covers, but the outswinger is keeping them from finding the boundary. I'll maintain the full length and outside off-stump line to continue testing their technique and keep the pressure on."}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.005
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 0, 'ball': 4, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'pre_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'boundary', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Reflection recorded.'
+
+ --- turn 13 ---
+ [before] game_state=bowling strategic_phase=pre_ball last_ball='Reflection recorded.'
+ [model.raw] {"bowl_delivery": {}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'bowl_delivery', 'arguments': {}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.007
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 0, 'ball': 5, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'post_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'six', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Launched over long-on — dot ball!'
+
+ --- turn 14 ---
+ [before] game_state=bowling strategic_phase=post_ball last_ball='Launched over long-on — dot ball!'
+ [model.raw] {"reflect_after_ball": {"reflection": "Four dot balls in the over. The batter is attempting high-risk shots like the lofted drive over long-on, but the length is keeping them from getting under the ball. I will stick to the plan of targeting the outside edge to induce a mistake or a mistimed shot."}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'reflect_after_ball', 'arguments': {'reflection': 'Four dot balls in the over. The batter is attempting high-risk shots like the lofted drive over long-on, but the length is keeping them from getting under the ball. I will stick to the plan of targeting the outside edge to induce a mistake or a mistimed shot.'}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.005
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 0, 'ball': 5, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'pre_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'six', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Reflection recorded.'
+
+ --- turn 15 ---
+ [before] game_state=bowling strategic_phase=pre_ball last_ball='Reflection recorded.'
+ [model.raw] {"bowl_delivery": {}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'bowl_delivery', 'arguments': {}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.007
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 1, 'ball': 0, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'post_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'rotate', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Worked off the hips — dot ball.'
+
+ --- turn 16 ---
+ [before] game_state=bowling strategic_phase=post_ball last_ball='Worked off the hips — dot ball.'
+ [model.raw] {"reflect_after_ball": {"reflection": "Five dot balls in the first over. The batter is struggling to find gaps or timing, playing mostly defensively or mistiming shots. I'll maintain the same line and length to keep the pressure high and see if they lose patience."}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'reflect_after_ball', 'arguments': {'reflection': "Five dot balls in the first over. The batter is struggling to find gaps or timing, playing mostly defensively or mistiming shots. I'll maintain the same line and length to keep the pressure high and see if they lose patience."}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.005
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 1, 'ball': 0, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'pre_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'rotate', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Reflection recorded.'
+
+ --- turn 17 ---
+ [before] game_state=bowling strategic_phase=pre_ball last_ball='Reflection recorded.'
+ [model.raw] {"bowl_delivery": {}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'bowl_delivery', 'arguments': {}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.007
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 1, 'ball': 1, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'post_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'rotate', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Worked off the hips — dot ball.'
+
+ --- turn 18 ---
+ [before] game_state=bowling strategic_phase=post_ball last_ball='Worked off the hips — dot ball.'
+ [model.raw] {"reflect_after_ball": {"reflection": "Six dot balls in the opening over. The batter has failed to score or rotate strike. The pressure is immense. I will maintain the aggressive field and continue with the outswingers to see if the batter attempts a reckless shot to break the shackles in the next over."}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'reflect_after_ball', 'arguments': {'reflection': 'Six dot balls in the opening over. The batter has failed to score or rotate strike. The pressure is immense. I will maintain the aggressive field and continue with the outswingers to see if the batter attempts a reckless shot to break the shackles in the next over.'}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.005
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 1, 'ball': 1, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'pre_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'rotate', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Reflection recorded.'
+
+ --- turn 19 ---
+ [before] game_state=bowling strategic_phase=pre_ball last_ball='Reflection recorded.'
+ [model.raw] {"bowl_delivery": {}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'bowl_delivery', 'arguments': {}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.007
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 1, 'ball': 2, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'post_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'six', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Launched over long-on — dot ball!'
+
+ --- turn 20 ---
+ [before] game_state=bowling strategic_phase=post_ball last_ball='Launched over long-on — dot ball!'
+ [model.raw] {"reflect_after_ball": {"reflection": "The batter attempted a massive shot but failed to connect cleanly, resulting in another dot ball. They are clearly looking to dominate the powerplay early. I will stick to the plan of targeting the outside edge with outswingers to induce an error, as the aggressive field is working to restrict scoring."}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'reflect_after_ball', 'arguments': {'reflection': 'The batter attempted a massive shot but failed to connect cleanly, resulting in another dot ball. They are clearly looking to dominate the powerplay early. I will stick to the plan of targeting the outside edge with outswingers to induce an error, as the aggressive field is working to restrict scoring.'}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.007
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 1, 'ball': 2, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'pre_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'six', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Reflection recorded.'
+
+ --- turn 21 ---
+ [before] game_state=bowling strategic_phase=pre_ball last_ball='Reflection recorded.'
+ [model.raw] {"plan_delivery": {"bowler_type": "Pace", "line": "Outside off stump", "length": "Full", "delivery_type": "Outswinger", "rationale": "The batter is looking to hit big; I will continue to target the corridor of uncertainty with full-length outswingers to induce an edge or a mistimed drive."}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'plan_delivery', 'arguments': {'bowler_type': 'Pace', 'line': 'Outside off stump', 'length': 'Full', 'delivery_type': 'Outswinger', 'rationale': 'The batter is looking to hit big; I will continue to target the corridor of uncertainty with full-length outswingers to induce an edge or a mistimed drive.'}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.000
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 1, 'ball': 2, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'ball_resolution', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'six', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Delivery plan set: Outswinger.'
+
+ --- turn 22 ---
+ [before] game_state=bowling strategic_phase=ball_resolution last_ball='Delivery plan set: Outswinger.'
+ [model.raw] {"bowl_delivery": {}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'bowl_delivery', 'arguments': {}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.007
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 1, 'ball': 3, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'post_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'boundary', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Driven through the covers — dot ball!'
+
+ --- turn 23 ---
+ [before] game_state=bowling strategic_phase=post_ball last_ball='Driven through the covers — dot ball!'
+ [model.raw] {"reflect_after_ball": {"reflection": "Another dot ball. The batter is finding the gaps but not getting the boundary. They are playing with high intent. I will maintain the pressure by sticking to the outswingers, but I'll keep an eye on the length to ensure I don't become too predictable and get milked for singles."}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'reflect_after_ball', 'arguments': {'reflection': "Another dot ball. The batter is finding the gaps but not getting the boundary. They are playing with high intent. I will maintain the pressure by sticking to the outswingers, but I'll keep an eye on the length to ensure I don't become too predictable and get milked for singles."}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.005
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 1, 'ball': 3, 'score': 0, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'pre_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'boundary', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Reflection recorded.'
+
+ --- turn 24 ---
+ [before] game_state=bowling strategic_phase=pre_ball last_ball='Reflection recorded.'
+ [model.raw] {"bowl_delivery": {}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'bowl_delivery', 'arguments': {}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.000
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 1, 'ball': 3, 'score': 1, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'post_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'boundary', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Wide delivery — extra run added. Ball to be replayed.'
+
+ --- turn 25 ---
+ [before] game_state=bowling strategic_phase=post_ball last_ball='Wide delivery — extra run added. Ball to be replayed.'
+ [model.raw] {"reflect_after_ball": {"reflection": "The wide was a lapse in control. I need to tighten up the line and ensure the outswinger is hitting the target without straying too far down the leg side or too wide to be easily punished. I'll stick to the plan but focus on execution accuracy."}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'reflect_after_ball', 'arguments': {'reflection': "The wide was a lapse in control. I need to tighten up the line and ensure the outswinger is hitting the target without straying too far down the leg side or too wide to be easily punished. I'll stick to the plan but focus on execution accuracy."}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.005
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 1, 'ball': 3, 'score': 1, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'pre_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'boundary', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Reflection recorded.'
+
+ --- turn 26 ---
+ [before] game_state=bowling strategic_phase=pre_ball last_ball='Reflection recorded.'
+ [model.raw] {"plan_delivery": {"bowler_type": "Pace", "line": "Outside off stump", "length": "Full", "delivery_type": "Outswinger", "rationale": "Need to regain control after the wide. I will stick to the outswinger plan but focus on a tighter line just outside off stump to force the batter to play at it without providing width for easy scoring."}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'plan_delivery', 'arguments': {'bowler_type': 'Pace', 'line': 'Outside off stump', 'length': 'Full', 'delivery_type': 'Outswinger', 'rationale': 'Need to regain control after the wide. I will stick to the outswinger plan but focus on a tighter line just outside off stump to force the batter to play at it without providing width for easy scoring.'}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.000
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 1, 'ball': 3, 'score': 1, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'ball_resolution', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'boundary', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Delivery plan set: Outswinger.'
+
+ --- turn 27 ---
+ [before] game_state=bowling strategic_phase=ball_resolution last_ball='Delivery plan set: Outswinger.'
+ [model.raw] {"bowl_delivery": {}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'bowl_delivery', 'arguments': {}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.006
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 1, 'ball': 4, 'score': 2, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'post_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'boundary', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Driven through the covers — a single!'
+
+ --- turn 28 ---
+ [before] game_state=bowling strategic_phase=post_ball last_ball='Driven through the covers — a single!'
+ [model.raw] {"reflect_after_ball": {"reflection": "The batter is using the pace and finding the gaps easily. The single through covers shows they are comfortable playing the full length. I need to vary the length slightly or tighten the line even more to prevent them from getting into a rhythm and milking the singles."}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'reflect_after_ball', 'arguments': {'reflection': 'The batter is using the pace and finding the gaps easily. The single through covers shows they are comfortable playing the full length. I need to vary the length slightly or tighten the line even more to prevent them from getting into a rhythm and milking the singles.'}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.005
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 1, 'ball': 4, 'score': 2, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'pre_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'boundary', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Reflection recorded.'
+
+ --- turn 29 ---
+ [before] game_state=bowling strategic_phase=pre_ball last_ball='Reflection recorded.'
+ [model.raw] {"plan_delivery": {"bowler_type": "Pace", "line": "On off stump", "length": "Good length", "delivery_type": "Back of a length cutter", "rationale": "The batter is finding the covers easily with full deliveries. I'll change the pace and length to a back-of-a-length cutter on the off stump to disrupt their rhythm and prevent them from driving through the covers."}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'plan_delivery', 'arguments': {'bowler_type': 'Pace', 'line': 'On off stump', 'length': 'Good length', 'delivery_type': 'Back of a length cutter', 'rationale': "The batter is finding the covers easily with full deliveries. I'll change the pace and length to a back-of-a-length cutter on the off stump to disrupt their rhythm and prevent them from driving through the covers."}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.000
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 1, 'ball': 4, 'score': 2, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'ball_resolution', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'boundary', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Delivery plan set: Back of a length cutter.'
+
+ --- turn 30 ---
+ [before] game_state=bowling strategic_phase=ball_resolution last_ball='Delivery plan set: Back of a length cutter.'
+ [model.raw] {"bowl_delivery": {}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'bowl_delivery', 'arguments': {}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.007
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 1, 'ball': 5, 'score': 6, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'post_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'rotate', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Worked off the hips — a FOUR.'
+
+ --- turn 31 ---
+ [before] game_state=bowling strategic_phase=post_ball last_ball='Worked off the hips — a FOUR.'
+ [model.raw] {"reflect_after_ball": {"reflection": "The cutter was misjudged or played too easily off the hips, resulting in a boundary. The batter is clearly looking to punish anything that isn't perfectly executed. I need to return to a more standard length or perhaps a quicker delivery to prevent them from playing them off the hips so easily. I'll try to bring the line back slightly wider to force them to reach, or go for a quicker, fuller delivery to see if I can get them playing late."}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'reflect_after_ball', 'arguments': {'reflection': "The cutter was misjudged or played too easily off the hips, resulting in a boundary. The batter is clearly looking to punish anything that isn't perfectly executed. I need to return to a more standard length or perhaps a quicker delivery to prevent them from playing them off the hips so easily. I'll try to bring the line back slightly wider to force them to reach, or go for a quicker, fuller delivery to see if I can get them playing late."}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.005
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 1, 'ball': 5, 'score': 6, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'pre_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'rotate', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Reflection recorded.'
+
+ --- turn 32 ---
+ [before] game_state=bowling strategic_phase=pre_ball last_ball='Reflection recorded.'
+ [model.raw] {"plan_delivery": {"bowler_type": "Pace", "line": "Wide of off stump", "length": "Full", "delivery_type": "Fast yorker length", "rationale": "The batter just hit a boundary off the cutter. I need to stop them from playing off the hips. By moving the line wider and aiming for a fuller, quicker delivery, I'll force them to reach and prevent them from using the pace or the cutter's grip to work it through the leg side or covers."}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'plan_delivery', 'arguments': {'bowler_type': 'Pace', 'line': 'Wide of off stump', 'length': 'Full', 'delivery_type': 'Fast yorker length', 'rationale': "The batter just hit a boundary off the cutter. I need to stop them from playing off the hips. By moving the line wider and aiming for a fuller, quicker delivery, I'll force them to reach and prevent them from using the pace or the cutter's grip to work it through the leg side or covers."}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.000
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 1, 'ball': 5, 'score': 6, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'ball_resolution', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'rotate', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Delivery plan set: Fast yorker length.'
+
+ --- turn 33 ---
+ [before] game_state=bowling strategic_phase=ball_resolution last_ball='Delivery plan set: Fast yorker length.'
+ [model.raw] {"bowl_delivery": {}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'bowl_delivery', 'arguments': {}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.007
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 2, 'ball': 0, 'score': 6, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'post_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'boundary', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
+ [after] last_ball='Driven through the covers — dot ball!'
+
+ --- turn 34 ---
+ [before] game_state=bowling strategic_phase=post_ball last_ball='Driven through the covers — dot ball!'
+ [model.raw] {"reflect_after_ball": {"reflection": "The full, wide delivery worked well to prevent the boundary and resulted in a dot ball. The batter was forced to reach, which disrupted their ability to play off the hips. I have successfully regained control. I will maintain this pressure by sticking to a similar line and length, perhaps slightly more attacking to see if I can induce an edge or a mistake while they are adjusting to the width."}}
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'reflect_after_ball', 'arguments': {'reflection': 'The full, wide delivery worked well to prevent the boundary and resulted in a dot ball. The batter was forced to reach, which disrupted their ability to play off the hips. I have successfully regained control. I will maintain this pressure by sticking to a similar line and length, perhaps slightly more attacking to see if I can induce an edge or a mistake while they are adjusting to the width.'}}
+ [openenv] step(action)
+ [openenv] step -> done=False reward=0.007
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 2, 'ball': 0, 'score': 6, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'pre_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'boundary', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
330
+ [after] last_ball='Reflection recorded.'
331
+
332
+ --- turn 35 ---
333
+ [before] game_state=bowling strategic_phase=pre_ball last_ball='Reflection recorded.'
334
+ [model.raw] {"plan_delivery": {"bowler_type": "Pace", "line": "Wide of off stump", "length": "Full", "delivery_type": "Fast yorker length", "rationale": "The previous delivery was a dot ball, successfully forcing the batter to reach. I will stick to this plan to maintain pressure and exploit the width, aiming to keep the batter from settling into a rhythm."}}
335
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'plan_delivery', 'arguments': {'bowler_type': 'Pace', 'line': 'Wide of off stump', 'length': 'Full', 'delivery_type': 'Fast yorker length', 'rationale': 'The previous delivery was a dot ball, successfully forcing the batter to reach. I will stick to this plan to maintain pressure and exploit the width, aiming to keep the batter from settling into a rhythm.'}}
336
+ [openenv] step(action)
337
+ [openenv] step -> done=False reward=0.000
338
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 2, 'ball': 0, 'score': 6, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'ball_resolution', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'boundary', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
339
+ [after] last_ball='Delivery plan set: Fast yorker length.'
340
+
341
+ --- turn 36 ---
342
+ [before] game_state=bowling strategic_phase=ball_resolution last_ball='Delivery plan set: Fast yorker length.'
343
+ [model.raw] {"bowl_delivery": {}}
344
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'bowl_delivery', 'arguments': {}}
345
+ [openenv] step(action)
346
+ [openenv] step -> done=False reward=0.006
347
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 2, 'ball': 1, 'score': 6, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'post_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'six', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
348
+ [after] last_ball='Launched over long-on — dot ball!'
349
+
350
+ --- turn 37 ---
351
+ [before] game_state=bowling strategic_phase=post_ball last_ball='Launched over long-on — dot ball!'
352
+ [model.raw] {"reflect_after_ball": {"reflection": "The batter attempted a high-risk launch over long-on but failed to connect for a boundary, resulting in a dot ball. This shows they are trying to force the pace against my wide line. While I've kept them from scoring, the intent to attack is high. I need to stay disciplined with my line to prevent them from getting under the ball, but I might consider slightly varying the length to a good length to see if they're over-committing to the drive/loft."}}
353
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'reflect_after_ball', 'arguments': {'reflection': "The batter attempted a high-risk launch over long-on but failed to connect for a boundary, resulting in a dot ball. This shows they are trying to force the pace against my wide line. While I've kept them from scoring, the intent to attack is high. I need to stay disciplined with my line to prevent them from getting under the ball, but I might consider slightly varying the length to a good length to see if they're over-committing to the drive/loft."}}
354
+ [openenv] step(action)
355
+ [openenv] step -> done=False reward=0.007
356
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 2, 'ball': 1, 'score': 6, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'pre_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'six', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
357
+ [after] last_ball='Reflection recorded.'
358
+
359
+ --- turn 38 ---
360
+ [before] game_state=bowling strategic_phase=pre_ball last_ball='Reflection recorded.'
361
+ [model.raw] {"bowl_delivery": {}}
362
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'bowl_delivery', 'arguments': {}}
363
+ [openenv] step(action)
364
+ [openenv] step -> done=False reward=0.006
365
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 2, 'ball': 2, 'score': 7, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'post_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'six', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
366
+ [after] last_ball='Launched over long-on — a single!'
367
+
368
+ --- turn 39 ---
369
+ [before] game_state=bowling strategic_phase=post_ball last_ball='Launched over long-on — a single!'
370
+ [model.raw] {"reflect_after_ball": {"reflection": "The batter managed to get a single by launching the ball over long-on, breaking the dot ball pressure. They are clearly looking to exploit the width and power through the air. I need to adjust my length slightly to avoid being too predictable or too full, which allows them to get under the ball. I'll move from a yorker length to a slightly fuller 'good length' to make it harder to loft, while maintaining the wide line to keep them away from their hitting zone."}}
371
+ [parser] parse_error=False action={'metadata': {}, 'tool': 'reflect_after_ball', 'arguments': {'reflection': "The batter managed to get a single by launching the ball over long-on, breaking the dot ball pressure. They are clearly looking to exploit the width and power through the air. I need to adjust my length slightly to avoid being too predictable or too full, which allows them to get under the ball. I'll move from a yorker length to a slightly fuller 'good length' to make it harder to loft, while maintaining the wide line to keep them away from their hitting zone."}}
372
+ [openenv] step(action)
373
+ [openenv] step -> done=False reward=0.007
374
+ [after] score_context={'game_state': 'bowling', 'innings': 'first', 'over': 2, 'ball': 2, 'score': 7, 'wickets': 0, 'target': None, 'phase': 'powerplay', 'bowler_type': 'Pace', 'field_setting': 'Aggressive', 'strategic_phase': 'pre_ball', 'current_batter': {'name': 'Opener', 'style': 'balanced', 'aggression': 0.55}, 'current_bowler': {'name': 'Strike Pacer', 'type': 'pace', 'style': 'swing'}, 'opponent_plan': {'shot_intent': 'six', 'aggression': 0.7, 'rationale': 'Opponent bats with powerplay phase risk and 0 wickets down.'}, 'eval_pack_id': 'adaptive_t20_v1'}
375
+ [after] last_ball='Reflection recorded.'
376
+
377
+ [openenv] state()
378
+ [final] score=7/0 over=2.2
379
+ [final] game_state=bowling target=None tool_calls=40
380
+ [final] rewards_sum=0.168
381
+ [final] mean_coherence=0.657
382
+ [final] mean_adaptation=0.502
383
+ [final] mean_opponent_awareness=0.750
384
+ [final] parse_error_rate=0.0%
385
+ [final] transcript_events=85
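The `strategic_phase` values in the trace cycle deterministically with the tool calls. A minimal sketch of the transitions as observed in this single trace (the table is inferred from the log above, not taken from the server source; note that `plan_delivery` appears optional, since turn 38 bowls directly from `pre_ball`):

```python
# Phase transitions observed in the trace: plan_delivery moves pre_ball ->
# ball_resolution, bowl_delivery ends at post_ball, reflect_after_ball
# returns to pre_ball. Inferred from this run only.
TRANSITIONS = {
    ("pre_ball", "plan_delivery"): "ball_resolution",
    ("pre_ball", "bowl_delivery"): "post_ball",
    ("ball_resolution", "bowl_delivery"): "post_ball",
    ("post_ball", "reflect_after_ball"): "pre_ball",
}


def next_phase(phase: str, tool: str) -> str:
    """Look up the phase that follows a tool call in the observed cycle."""
    return TRANSITIONS[(phase, tool)]
```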
illustrations/exp_2026-04-25_5over_random_llm_cached/README.md ADDED
@@ -0,0 +1,32 @@
+ ## Experiment: 5-over sanity check (random captain, cached opponent)
+
+ ### Goal
+
+ Verify end-to-end wiring works (server ↔ runner ↔ tools/metrics) for a short innings.
+
+ ### Setup
+
+ - **Server**: must be reachable at `ws://localhost:8001/ws` (or set `CRICKET_CAPTAIN_ENV_URL`)
+ - **Eval pack**: `adaptive_t20_v1`
+ - **Opponent mode**: `llm_cached`
+
+ ### Run
+
+ From `cricket_captain/`:
+
+ ```bash
+ export CRICKET_CAPTAIN_ENV_URL="ws://localhost:8001"
+ python inference.py \
+ --model random \
+ --episodes 2 \
+ --task stage2_full \
+ --max-overs 5 \
+ --env-url "$CRICKET_CAPTAIN_ENV_URL" \
+ --eval-pack-id adaptive_t20_v1 \
+ --opponent-mode llm_cached
+ ```
+
+ ### Results
+
+ See `run_output.txt`.
+
illustrations/exp_2026-04-25_5over_random_llm_cached/run_output.txt ADDED
@@ -0,0 +1,10 @@
+ Using RandomAgent baseline
+ Episode 1/2 | Score: 14/2 (5 ov) | Reward: 0.817 | Coherence: 0.548 | Adapt: 0.615 | ParseErr: 0.0%
+ Episode 2/2 | Score: 13/0 (3 ov) | Reward: 1.151 | Coherence: 0.562 | Adapt: 0.580 | ParseErr: 0.0%
+
+ === Summary ===
+ total_score : mean=13.500 std=0.707
+ wickets_lost : mean=1.000 std=1.414
+ total_reward : mean=0.984 std=0.236
+ mean_coherence : mean=0.555 std=0.010
+ parse_error_rate : mean=0.000 std=0.000
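The `=== Summary ===` rows are consistent with per-episode means plus sample (n-1) standard deviation; a quick check against the `total_score` row (the ddof=1 choice is inferred from the reported numbers, not from the runner source):

```python
import statistics

# Episode scores from run_output.txt: 14 and 13.
scores = [14, 13]
mean = statistics.mean(scores)
std = statistics.stdev(scores)  # sample standard deviation (n-1 denominator)

# Matches the reported "total_score : mean=13.500 std=0.707".
assert mean == 13.5
assert round(std, 3) == 0.707
```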
inference.py CHANGED
@@ -64,6 +64,46 @@ Always respond with exactly one JSON object on a single line, no markdown."""
 SHOT_AGGRESSION_ORDER = ["leave", "defensive", "single", "rotate", "boundary", "six"]
 
 
+ def _coerce_aggression(value: Any, default: float = 0.5) -> float:
+     if isinstance(value, (int, float)):
+         return max(0.0, min(1.0, float(value)))
+     text = str(value).strip().lower()
+     word_map = {
+         "very low": 0.15,
+         "low": 0.25,
+         "conservative": 0.25,
+         "defensive": 0.25,
+         "moderate": 0.5,
+         "medium": 0.5,
+         "balanced": 0.5,
+         "normal": 0.5,
+         "high": 0.75,
+         "aggressive": 0.75,
+         "very high": 0.9,
+         "attack": 0.8,
+         "attacking": 0.8,
+     }
+     try:
+         return max(0.0, min(1.0, float(text)))
+     except ValueError:
+         return word_map.get(text, default)
+
+
+ def _normalize_action_args(tool: str, args: dict[str, Any]) -> dict[str, Any]:
+     """Normalize common LLM variants before sending to the server."""
+     normalized = dict(args)
+     if tool in ("set_strategy", "select_batter") and "aggression" in normalized:
+         normalized["aggression"] = _coerce_aggression(normalized["aggression"])
+     if tool == "plan_shot" and str(normalized.get("risk", "")).lower() == "moderate":
+         normalized["risk"] = "balanced"
+     if tool == "call_toss":
+         call = str(normalized.get("call", "heads")).lower()
+         decision = str(normalized.get("decision", "bat")).lower()
+         normalized["call"] = call if call in ("heads", "tails") else "heads"
+         normalized["decision"] = decision if decision in ("bat", "bowl") else "bat"
+     return normalized
+
+
 class RandomAgent:
     """Baseline: random valid tool calls based on availability."""
 
@@ -191,8 +231,8 @@ class OpenAIAgent:
         resp = self._client.chat.completions.create(
             model=self._model,
             messages=messages,
-             temperature=0.7,
-             max_tokens=200,
+             temperature=0.2,
+             max_tokens=300,
         )
         return resp.choices[0].message.content.strip()
 
@@ -203,16 +243,24 @@ def _parse_action(raw: str) -> tuple[CricketAction | None, bool]:
     lines = raw.split("\n")
     raw = "\n".join(lines[1:-1]) if len(lines) > 2 else raw
     try:
-         data = json.loads(raw)
-         tool = data.get("tool", "")
+         if not raw.startswith("{"):
+             start = raw.find("{")
+             if start >= 0:
+                 raw = raw[start:]
+         data, _ = json.JSONDecoder().raw_decode(raw)
         valid_tools = (
             "set_strategy", "analyze_situation", "play_delivery",
             "call_toss", "bowl_delivery", "set_bowling_strategy", "set_field_setting",
             "choose_bowler", "select_batter", "plan_delivery", "plan_shot", "reflect_after_ball"
         )
+         if "tool" not in data and len(data) == 1:
+             maybe_tool, maybe_args = next(iter(data.items()))
+             if maybe_tool in valid_tools and isinstance(maybe_args, dict):
+                 data = {"tool": maybe_tool, "arguments": maybe_args}
+         tool = data.get("tool", "")
         if tool not in valid_tools:
             return None, True
-         return CricketAction(tool=tool, arguments=data.get("arguments", {})), False
+         return CricketAction(tool=tool, arguments=_normalize_action_args(tool, data.get("arguments", {}))), False
     except Exception:
         return None, True
server/opponent_policy.py CHANGED
@@ -164,7 +164,7 @@ class LLMOpponentPolicy:
         self._fallback = fallback
         # Default to an HF-hosted model via HF router unless overridden.
         # This mirrors the OpenAI-compatible Hugging Face router pattern used by other OpenEnv submissions.
-         self._model = model or os.environ.get("CRICKET_OPPONENT_MODEL", "google/gemma-2-2b-it")
+         self._model = model or os.environ.get("CRICKET_OPPONENT_MODEL", "google/gemma-4-26B-A4B-it")
         self._temperature = temperature
         try:
             import openai