pratinavseth Claude Sonnet 4.6 committed on
Commit ae081e6 · 1 Parent(s): b9ce72f

feat: CricsheetOpponentPolicy — real match data as opponent decisions

Adds a new opponent mode `cricsheet` that samples real Cricsheet ball-by-ball
deliveries as opponent batting/bowling plans. Deliveries are indexed by
(phase, wickets_band, innings_type) for context-aware sampling with progressive
fallback widening (drop innings_type → drop wickets band) when buckets are thin.

- server/opponent_policy.py: CricsheetOpponentPolicy class; runs_batter →
  shot_intent mapping; uses format_mapper for bowling line/length; heuristic
  fallback for reflect_after_ball; create_opponent_policy accepts cricsheet mode
  + cricsheet_data_path kwarg
- server/cricket_environment.py: pass cricsheet_data_path from options / env var
  CRICKET_CRICSHEET_DATA through to create_opponent_policy
- inference.py / train.py: add "cricsheet" to --opponent-mode choices
- illustrations: add 5-over cricsheet baseline run (coherence=0.572, 0% parse errors)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
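
The bucketed sampling with progressive widening described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the committed code: `build_index`, `sample_with_widening`, and the toy records are hypothetical names and values, and the first widening step here pools every innings type, which is one reading of "drop innings_type".

```python
import random
from collections import defaultdict


def wickets_band(wickets: int) -> str:
    """Three coarse bands, mirroring the thresholds used in the commit."""
    if wickets <= 2:
        return "0-2"
    if wickets <= 5:
        return "3-5"
    return "6-9"


def build_index(records):
    """Group delivery records by (phase, wickets_band, innings_type)."""
    index = defaultdict(list)
    for rec in records:
        key = (
            rec.get("phase", "middle"),
            wickets_band(rec.get("wickets_before", 0)),
            rec.get("innings_type", "first"),
        )
        index[key].append(rec)
    return index


def sample_with_widening(index, rng, phase, wickets, innings_type):
    """Try the exact bucket, then ignore innings_type, then ignore the band."""
    band = wickets_band(wickets)
    matchers = [
        lambda k: k == (phase, band, innings_type),  # exact context
        lambda k: k[:2] == (phase, band),            # any innings type
        lambda k: k[0] == phase,                     # any wickets band
    ]
    for matches in matchers:
        pool = [rec for key, recs in index.items() if matches(key) for rec in recs]
        if pool:
            return rng.choice(pool)
    return None  # the caller would hand over to the heuristic fallback


# Toy records with made-up values (not real Cricsheet data).
records = [
    {"phase": "powerplay", "wickets_before": 0, "innings_type": "first", "runs_batter": 4},
    {"phase": "death", "wickets_before": 7, "innings_type": "second", "runs_batter": 1},
]
index = build_index(records)
rng = random.Random(0)
hit = sample_with_widening(index, rng, "powerplay", 1, "second")  # found after widening
miss = sample_with_widening(index, rng, "middle", 3, "first")     # no data for this phase
print(hit, miss)
```

Returning `None` when every level is empty keeps the decision about the fallback policy with the caller, which matches how `batting_plan` and `bowling_plan` in this commit treat a missing sample.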

illustrations/exp_2026-04-25_12-29_inference_5ov_cricsheet_random/README.md ADDED
@@ -0,0 +1,24 @@
+## Inference Run: exp_2026-04-25_12-29_inference_5ov_cricsheet_random
+
+**Date**: 2026-04-25 12:29
+
+| Setting | Value |
+|---|---|
+| Model | `random` |
+| API base | `N/A` |
+| Opponent mode | `cricsheet` |
+| Max overs | 5 |
+| Episodes | 3 |
+| Task | `stage2_full` |
+
+### Results
+
+```
+total_score : mean=10.667 std=1.528
+wickets_lost : mean=1.667 std=0.577
+total_reward : mean=-0.789 std=1.041
+mean_coherence : mean=0.572 std=0.063
+parse_error_rate : mean=0.000 std=0.000
+```
+
+See `run_output.txt` for full verbose episode log.
illustrations/exp_2026-04-25_12-29_inference_5ov_cricsheet_random/run_output.txt ADDED
@@ -0,0 +1,181 @@
+# Inference run: exp_2026-04-25_12-29_inference_5ov_cricsheet_random
+timestamp_utc: 2026-04-25T12:29:13.411706
+model: random
+api_base: None
+opponent_mode: cricsheet
+max_overs: 5
+episodes: 3
+task: stage2_full
+eval_pack_id: default
+
+--- Episode 1/3 ---
+over=0.1 score=0/0 tool=play_delivery r=0.000 | Driven through the covers — dot ball!
+over=0.2 score=3/0 tool=play_delivery r=0.032 | Played toward midwicket; misfield — three runs.
+over=0.3 score=3/1 tool=play_delivery r=-0.097 | Lofted toward cover — fielder settles under it. OUT!
+over=0.3 score=4/1 tool=play_delivery r=0.020 | No-ball called — extra run added and the ball must be replay
+over=0.4 score=5/1 tool=play_delivery r=0.013 | Driven through the covers — a single!
+over=0.5 score=5/1 tool=play_delivery r=0.003 | Nudged into the gap — dot ball.
+over=1.0 score=5/1 tool=play_delivery r=-0.047 | Edge runs toward straight — dot ball.
+over=1.1 score=6/2 tool=play_delivery r=-0.088 | A thin edge carries behind the wicket — taken cleanly. OUT!
+over=1.2 score=6/2 tool=play_delivery r=0.003 | Worked off the hips — dot ball.
+over=1.3 score=7/2 tool=play_delivery r=0.013 | Launched over long-on — a single!
+over=1.4 score=8/2 tool=play_delivery r=0.013 | Nudged into the gap — a single.
+over=1.4 score=9/2 tool=play_delivery r=0.020 | Wide delivery — extra run added. Ball to be replayed.
+over=1.5 score=9/2 tool=play_delivery r=0.003 | Driven through the covers — dot ball!
+over=2.0 score=9/2 tool=play_delivery r=0.002 | Launched over long-on — dot ball!
+over=2.0 score=10/2 tool=play_delivery r=0.020 | Wide delivery — extra run added. Ball to be replayed.
+over=2.1 score=10/2 tool=play_delivery r=0.004 | Worked off the hips — dot ball.
+over=2.2 score=11/2 tool=play_delivery r=0.013 | Nudged into the gap — a single.
+over=2.2 score=12/2 tool=play_delivery r=0.020 | Wide delivery — extra run added. Ball to be replayed.
+over=2.3 score=13/2 tool=play_delivery r=0.013 | Played toward midwicket; gap in midwicket allows a quick sin
+over=2.4 score=13/2 tool=play_delivery r=0.003 | Defended solidly — dot ball.
+over=2.5 score=13/3 tool=play_delivery r=-0.097 | A thin edge carries behind the wicket — taken cleanly. OUT!
+over=3.0 score=13/3 tool=play_delivery r=-0.147 | Left outside off — dot ball.
+over=3.1 score=19/3 tool=play_delivery r=0.082 | Left outside off — a SIX.
+over=3.2 score=23/3 tool=play_delivery r=0.063 | Nudged into the gap — a FOUR.
+over=3.3 score=23/3 tool=play_delivery r=0.002 | Defended solidly — dot ball.
+over=3.4 score=23/3 tool=play_delivery r=0.003 | Played toward cover; inner fielder at cover saves one — dot
+over=3.5 score=25/3 tool=play_delivery r=0.023 | Driven through the covers — two runs!
+over=4.0 score=26/3 tool=play_delivery r=-0.137 | Defended solidly — a single.
+over=4.1 score=26/3 tool=play_delivery r=0.003 | Defended solidly — dot ball.
+over=4.1 score=27/3 tool=play_delivery r=0.020 | Wide delivery — extra run added. Ball to be replayed.
+over=4.1 score=28/3 tool=play_delivery r=0.020 | Wide delivery — extra run added. Ball to be replayed.
+over=4.2 score=29/3 tool=play_delivery r=0.013 | Nudged into the gap — a single.
+over=4.3 score=29/3 tool=play_delivery r=0.003 | Defended solidly — dot ball.
+over=4.4 score=32/3 tool=play_delivery r=0.033 | Driven through the covers — three runs!
+over=4.5 score=32/3 tool=play_delivery r=0.003 | Left outside off — dot ball.
+over=4.5 score=33/3 tool=play_delivery r=0.020 | Wide delivery — extra run added. Ball to be replayed.
+over=0.0 score=0/0 tool=play_delivery r=-0.060 | Nudged into the gap — dot ball. Innings over. First innings
+over=0.1 score=0/0 tool=bowl_delivery r=0.024 | Nudged into the gap — dot ball.
+over=0.2 score=0/0 tool=bowl_delivery r=0.024 | Launched over long-on — dot ball!
+over=0.3 score=0/0 tool=bowl_delivery r=0.024 | Launched over long-on — dot ball!
+over=0.4 score=0/0 tool=bowl_delivery r=0.024 | Launched over long-on — dot ball!
+over=0.5 score=0/1 tool=bowl_delivery r=0.144 | Pushed into midwicket; sharp fielding creates a run-out. OUT
+over=1.0 score=1/1 tool=bowl_delivery r=-0.058 | Played toward midwicket; gap in midwicket allows a quick sin
+over=1.1 score=1/1 tool=bowl_delivery r=0.022 | Defended solidly — dot ball.
+over=1.2 score=3/1 tool=bowl_delivery r=-0.018 | Worked off the hips — two runs.
+over=1.3 score=3/1 tool=bowl_delivery r=0.022 | Defended solidly — dot ball.
+over=1.3 score=4/1 tool=bowl_delivery r=-0.040 | Wide delivery — extra run added. Ball to be replayed.
+over=1.4 score=4/2 tool=bowl_delivery r=0.143 | Pushed into midwicket; sharp fielding creates a run-out. OUT
+over=1.5 score=5/2 tool=bowl_delivery r=-0.006 | Driven through the covers — a single!
+over=2.0 score=5/2 tool=bowl_delivery r=-0.076 | Nudged into the gap — dot ball.
+over=2.0 score=6/2 tool=bowl_delivery r=-0.040 | Wide delivery — extra run added. Ball to be replayed.
+over=2.1 score=6/2 tool=bowl_delivery r=0.024 | Worked off the hips — dot ball.
+over=2.2 score=6/2 tool=bowl_delivery r=0.024 | Driven through the covers — dot ball!
+over=2.3 score=6/2 tool=bowl_delivery r=0.023 | Worked off the hips — dot ball.
+over=2.4 score=6/2 tool=bowl_delivery r=0.023 | Driven through the covers — dot ball!
+over=2.4 score=7/2 tool=bowl_delivery r=-0.040 | Wide delivery — extra run added. Ball to be replayed.
+over=2.5 score=7/2 tool=bowl_delivery r=0.024 | Launched over long-on — dot ball!
+over=3.0 score=7/2 tool=bowl_delivery r=-0.126 | Worked off the hips — dot ball.
+over=3.1 score=7/2 tool=bowl_delivery r=0.024 | Played toward cover; inner fielder at cover saves one — dot
+over=3.2 score=7/2 tool=bowl_delivery r=0.024 | Launched over long-on — dot ball!
+over=3.3 score=7/2 tool=bowl_delivery r=0.024 | Worked off the hips — dot ball.
+over=3.4 score=8/2 tool=bowl_delivery r=-0.006 | Played toward midwicket; misfield — a single.
+over=3.5 score=8/2 tool=bowl_delivery r=0.023 | Worked off the hips — dot ball.
+over=4.0 score=9/2 tool=bowl_delivery r=-0.157 | Launched over long-on — a single!
+over=4.1 score=9/2 tool=bowl_delivery r=0.023 | Played toward cover; inner fielder at cover saves one — dot
+over=4.2 score=9/2 tool=bowl_delivery r=0.023 | Driven through the covers — dot ball!
+over=4.3 score=9/2 tool=bowl_delivery r=0.023 | Driven through the covers — dot ball!
+over=4.4 score=10/2 tool=bowl_delivery r=-0.006 | Launched over long-on — a single!
+over=4.4 score=11/2 tool=bowl_delivery r=-0.040 | Wide delivery — extra run added. Ball to be replayed.
+over=4.4 score=12/2 tool=bowl_delivery r=-0.040 | Wide delivery — extra run added. Ball to be replayed.
+over=4.5 score=12/2 tool=bowl_delivery r=0.024 | Driven through the covers — dot ball!
+over=5.0 score=12/2 tool=bowl_delivery r=1.213 | Driven through the covers — dot ball! Match over. Result: WI
+Episode 1/3 | Score: 12/2 (5 ov) | Reward: -1.748 | Coherence: 0.606 | Adapt: 0.614 | ParseErr: 0.0%
+
+--- Episode 2/3 ---
+over=0.1 score=1/0 tool=bowl_delivery r=-0.010 | Nudged into the gap — a single.
+over=0.2 score=1/0 tool=bowl_delivery r=0.023 | Defended solidly — dot ball.
+over=0.2 score=2/0 tool=bowl_delivery r=-0.040 | Wide delivery — extra run added. Ball to be replayed.
+over=0.3 score=2/0 tool=bowl_delivery r=0.023 | Driven through the covers — dot ball!
+over=0.4 score=2/0 tool=bowl_delivery r=0.023 | Driven through the covers — dot ball!
+over=0.5 score=3/0 tool=bowl_delivery r=-0.007 | Played toward cover; deep fielder at deep_cover cuts off bou
+over=1.0 score=4/0 tool=bowl_delivery r=-0.056 | Defended solidly — a single.
+over=1.1 score=6/0 tool=bowl_delivery r=-0.016 | Launched over long-on — two runs!
+over=1.2 score=7/0 tool=bowl_delivery r=-0.007 | Played toward point; inner fielder at point saves one — a si
+over=1.3 score=7/0 tool=bowl_delivery r=0.023 | Edge runs toward midwicket — dot ball.
+over=1.4 score=8/0 tool=bowl_delivery r=-0.007 | Defended solidly — a single.
+over=1.5 score=8/1 tool=bowl_delivery r=0.144 | Yorker at the stumps beats the swing and crashes into the ba
+over=2.0 score=8/1 tool=bowl_delivery r=0.024 | Driven through the covers — dot ball!
+over=2.1 score=8/1 tool=bowl_delivery r=0.024 | Nudged into the gap — dot ball.
+over=2.2 score=8/1 tool=bowl_delivery r=0.024 | Worked off the hips — dot ball.
+over=2.3 score=8/1 tool=bowl_delivery r=0.024 | Nudged into the gap — dot ball.
+over=2.4 score=8/1 tool=bowl_delivery r=0.024 | Driven through the covers — dot ball!
+over=2.5 score=8/2 tool=bowl_delivery r=0.144 | Yorker at the stumps beats the swing and crashes into the ba
+over=3.0 score=9/3 tool=bowl_delivery r=-0.037 | Lofted toward long_on — fielder settles under it. OUT!
+over=3.1 score=9/3 tool=bowl_delivery r=0.023 | Edge runs toward midwicket — dot ball.
+over=3.2 score=9/3 tool=bowl_delivery r=0.023 | Launched over long-on — dot ball!
+over=3.3 score=9/3 tool=bowl_delivery r=0.023 | Launched over long-on — dot ball!
+over=3.4 score=9/3 tool=bowl_delivery r=0.022 | Nudged into the gap — dot ball.
+over=3.5 score=9/4 tool=bowl_delivery r=0.143 | Lofted toward long_on — fielder settles under it. OUT!
+over=4.0 score=9/4 tool=bowl_delivery r=-0.127 | Defended solidly — dot ball.
+over=4.1 score=9/4 tool=bowl_delivery r=0.023 | Driven through the covers — dot ball!
+over=4.2 score=9/4 tool=bowl_delivery r=0.024 | Launched over long-on — dot ball!
+over=4.3 score=10/4 tool=bowl_delivery r=-0.008 | Worked off the hips — a single.
+over=4.4 score=10/4 tool=bowl_delivery r=0.023 | Driven through the covers — dot ball!
+over=4.5 score=10/4 tool=bowl_delivery r=0.024 | Driven through the covers — dot ball!
+over=0.0 score=0/0 tool=bowl_delivery r=0.250 | Worked off the hips — dot ball. Innings over. First innings
+over=0.1 score=0/0 tool=play_delivery r=0.003 | Launched over long-on — dot ball!
+over=0.2 score=4/0 tool=play_delivery r=0.063 | Launched over long-on — a FOUR!
+over=0.3 score=4/0 tool=play_delivery r=0.003 | Launched over long-on — dot ball!
+over=0.4 score=4/1 tool=play_delivery r=-0.097 | A thin edge carries behind the wicket — taken cleanly. OUT!
+over=0.5 score=4/1 tool=play_delivery r=0.003 | Driven through the covers — dot ball!
+over=0.5 score=5/1 tool=play_delivery r=0.020 | No-ball called — extra run added and the ball must be replay
+over=1.0 score=9/1 tool=play_delivery r=0.013 | Launched over long-on — a FOUR!
+over=1.1 score=9/1 tool=play_delivery r=0.003 | Launched over long-on — dot ball!
+over=1.2 score=10/1 tool=play_delivery r=0.013 | Driven through the covers — a single!
+over=1.3 score=10/1 tool=play_delivery r=0.003 | Launched over long-on — dot ball!
+over=1.4 score=11/2 tool=play_delivery r=1.279 | Turned to leg — sharp catch at short fine leg. OUT! Match ov
+Episode 2/3 | Score: 11/2 (1 ov) | Reward: 0.318 | Coherence: 0.612 | Adapt: 0.617 | ParseErr: 0.0%
+
+--- Episode 3/3 ---
+over=0.1 score=0/0 tool=play_delivery r=0.000 | Worked off the hips — dot ball.
+over=0.2 score=0/0 tool=play_delivery r=0.000 | Nudged into the gap — dot ball.
+over=0.2 score=1/0 tool=play_delivery r=0.020 | Wide delivery — extra run added. Ball to be replayed.
+over=0.3 score=1/0 tool=play_delivery r=0.000 | Edge runs toward midwicket — dot ball.
+over=0.4 score=1/0 tool=play_delivery r=0.003 | Driven through the covers — dot ball!
+over=0.5 score=1/0 tool=play_delivery r=0.002 | Nudged into the gap — dot ball.
+over=1.0 score=1/1 tool=play_delivery r=-0.147 | Pushed at it — inside edge onto stumps. OUT!
+over=1.1 score=1/1 tool=play_delivery r=0.001 | Launched over long-on — dot ball!
+over=1.2 score=1/1 tool=play_delivery r=0.003 | Defended solidly — dot ball.
+over=1.3 score=2/1 tool=play_delivery r=0.011 | Played toward midwicket; gap in midwicket allows a quick sin
+over=1.4 score=3/1 tool=play_delivery r=0.013 | Nudged into the gap — a single.
+over=1.5 score=3/1 tool=play_delivery r=0.003 | Nudged into the gap — dot ball.
+over=2.0 score=3/1 tool=play_delivery r=0.002 | Driven through the covers — dot ball!
+over=2.1 score=3/1 tool=play_delivery r=0.002 | Driven through the covers — dot ball!
+over=2.2 score=3/2 tool=play_delivery r=-0.097 | Turned to leg — sharp catch at short fine leg. OUT!
+over=2.3 score=3/2 tool=play_delivery r=0.003 | Driven through the covers — dot ball!
+over=2.4 score=3/2 tool=play_delivery r=0.003 | Driven through the covers — dot ball!
+over=2.5 score=3/2 tool=play_delivery r=0.003 | Nudged into the gap — dot ball.
+over=3.0 score=3/2 tool=play_delivery r=0.003 | Launched over long-on — dot ball!
+over=3.1 score=3/2 tool=play_delivery r=0.002 | Left outside off — dot ball.
+over=3.2 score=7/2 tool=play_delivery r=0.062 | Defended solidly — a FOUR.
+over=3.3 score=7/2 tool=play_delivery r=0.003 | Nudged into the gap — dot ball.
+over=3.4 score=7/2 tool=play_delivery r=0.002 | Left outside off — dot ball.
+over=3.5 score=7/2 tool=play_delivery r=0.002 | Defended solidly — dot ball.
+over=4.0 score=7/2 tool=play_delivery r=0.003 | Launched over long-on — dot ball!
+over=4.1 score=7/2 tool=play_delivery r=0.002 | Defended solidly — dot ball.
+over=4.2 score=8/2 tool=play_delivery r=0.012 | Left outside off — a single.
+over=4.3 score=8/2 tool=play_delivery r=0.003 | Worked off the hips — dot ball.
+over=4.4 score=8/2 tool=play_delivery r=0.002 | Left outside off — dot ball.
+over=4.5 score=8/2 tool=play_delivery r=0.003 | Worked off the hips — dot ball.
+over=0.0 score=0/0 tool=play_delivery r=-0.217 | Edge runs toward straight — dot ball. Innings over. First in
+over=0.1 score=2/1 tool=bowl_delivery r=0.104 | Lofted toward point — fielder settles under it. OUT!
+over=0.2 score=3/1 tool=bowl_delivery r=-0.007 | Played toward point; deep fielder at deep_cover cuts off bou
+over=0.3 score=3/1 tool=bowl_delivery r=0.023 | Driven through the covers — dot ball!
+over=0.4 score=3/1 tool=bowl_delivery r=0.023 | Played toward point; inner fielder at point saves one — dot
+over=0.4 score=4/1 tool=bowl_delivery r=-0.040 | Wide delivery — extra run added. Ball to be replayed.
+over=0.4 score=5/1 tool=bowl_delivery r=-0.040 | Wide delivery — extra run added. Ball to be replayed.
+over=0.5 score=5/1 tool=bowl_delivery r=0.023 | Driven through the covers — dot ball!
+over=1.0 score=7/1 tool=bowl_delivery r=-0.066 | Driven through the covers — two runs!
+over=1.1 score=7/1 tool=bowl_delivery r=0.023 | Driven through the covers — dot ball!
+over=1.1 score=8/1 tool=bowl_delivery r=-0.040 | Wide delivery — extra run added. Ball to be replayed.
+over=1.2 score=9/1 tool=bowl_delivery r=0.180 | Played toward cover; misfield — a single. Match over. Result
+Episode 3/3 | Score: 9/1 (1 ov) | Reward: -0.937 | Coherence: 0.500 | Adapt: 0.521 | ParseErr: 0.0%
+
+=== Summary ===
+total_score : mean=10.667 std=1.528
+wickets_lost : mean=1.667 std=0.577
+total_reward : mean=-0.789 std=1.041
+mean_coherence : mean=0.572 std=0.063
+parse_error_rate : mean=0.000 std=0.000
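
The summary rows can be cross-checked against the three per-episode lines. The sketch below assumes the script reports the sample standard deviation (n - 1 denominator), which is what reproduces the logged figures; it is a verification aid, not code from `inference.py`. Coherence is left out because the per-episode values in the log are already rounded, so recomputing its mean can differ in the last digit.

```python
from statistics import mean, stdev

# Per-episode values read off the three "Episode n/3" lines in the log.
episodes = {
    "total_score": [12, 11, 9],
    "wickets_lost": [2, 2, 1],
    "total_reward": [-1.748, 0.318, -0.937],
}

# statistics.stdev is the sample standard deviation (divides by n - 1).
summary = {name: (mean(vals), stdev(vals)) for name, vals in episodes.items()}

for name, (m, s) in summary.items():
    print(f"{name:<13}: mean={m:.3f} std={s:.3f}")
```

With only three episodes the standard deviations are rough, so these figures are better read as a smoke-test baseline than as a stable estimate.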
inference.py CHANGED
@@ -475,7 +475,7 @@ def main():
     parser.add_argument("--env-url", default=os.environ.get("CRICKET_CAPTAIN_ENV_URL", "ws://localhost:8000"))
     parser.add_argument("--eval-pack-id", default=os.environ.get("CRICKET_EVAL_PACK_ID", "default"))
     parser.add_argument("--opponent-mode", default=os.environ.get("CRICKET_OPPONENT_MODE", "llm_live"),
-                        choices=["heuristic", "llm_live", "llm_cached"])
+                        choices=["heuristic", "llm_live", "llm_cached", "cricsheet"])
     parser.add_argument("--api-base", default=None)
     parser.add_argument("--api-key", default=None)
     parser.add_argument("--verbose", action="store_true")
server/cricket_environment.py CHANGED
@@ -146,6 +146,7 @@ class CricketEnvironment(Environment):
         self._eval_pack_id = eval_pack_id
         self._opponent_mode = options.get("opponent_mode", os.environ.get("CRICKET_OPPONENT_MODE", "llm_live"))
         opponent_cache_path = options.get("opponent_cache_path", os.environ.get("CRICKET_OPPONENT_CACHE"))
+        cricsheet_data_path = options.get("cricsheet_data_path", os.environ.get("CRICKET_CRICSHEET_DATA"))
 
         start_state = options.get("start_state", GameState.TOSS)
 
@@ -199,7 +200,9 @@ class CricketEnvironment(Environment):
         self._current_batter = dict(DEFAULT_BATTERS[0]) # striker
         self._non_striker = dict(DEFAULT_BATTERS[1]) # non-striker
         self._current_bowler = _default_bowler_for_type(self._bowler_type)
-        self._opponent = create_opponent_policy(self._opponent_mode, self._rng, opponent_cache_path)
+        self._opponent = create_opponent_policy(
+            self._opponent_mode, self._rng, opponent_cache_path, cricsheet_data_path
+        )
         # Load roster for the agent's team
         agent_team = options.get("agent_team", os.environ.get("CRICKET_AGENT_TEAM", "india"))
         self._agent_roster = load_team_roster(agent_team)
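
The `options.get(key, os.environ.get(VAR))` pattern used for `cricsheet_data_path` gives the options dict priority over the environment variable. A small standalone sketch of that precedence follows; `resolve_cricsheet_path` and the example paths are hypothetical and exist only for illustration.

```python
import os


def resolve_cricsheet_path(options, environ=None):
    """Mirror the lookup order in the diff: options first, then the env var."""
    environ = os.environ if environ is None else environ
    return options.get("cricsheet_data_path", environ.get("CRICKET_CRICSHEET_DATA"))


fake_env = {"CRICKET_CRICSHEET_DATA": "/data/from_env.pkl"}  # made-up path

from_options = resolve_cricsheet_path({"cricsheet_data_path": "/data/from_options.pkl"}, fake_env)
from_env = resolve_cricsheet_path({}, fake_env)
unset = resolve_cricsheet_path({}, {})
explicit_none = resolve_cricsheet_path({"cricsheet_data_path": None}, fake_env)
print(from_options, from_env, unset, explicit_none)
```

One consequence of `dict.get` worth keeping in mind: a key that is present with the value `None` suppresses the environment variable, because the default is only used when the key is missing. In this commit a `None` path then leads `CricsheetOpponentPolicy` to use its built-in default pickle location.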
server/opponent_policy.py CHANGED
@@ -9,7 +9,9 @@ from __future__ import annotations
 
 import json
 import os
+import pickle
 import random
+from collections import defaultdict
 from dataclasses import dataclass
 from pathlib import Path
 from typing import Any, Protocol
@@ -245,12 +247,132 @@ class LLMOpponentPolicy:
         return None
 
 
-def create_opponent_policy(mode: str, rng: random.Random, cache_path: str | None = None) -> OpponentPolicy:
+_RUNS_TO_INTENT = {0: "defensive", 1: "single", 2: "rotate", 3: "rotate", 4: "boundary", 6: "six"}
+_DEFAULT_OUTCOMES_PATH = os.path.join(
+    os.path.dirname(__file__), "..", "data", "processed", "ball_outcomes_odi_v1.pkl"
+)
+
+
+class CricsheetOpponentPolicy:
+    """Opponent policy driven by real Cricsheet ball-by-ball data.
+
+    Deliveries are indexed by (phase, wickets_band, innings_type) — loose enough
+    to always find matches, specific enough to respect match context. A random
+    sample from the matching bucket drives the batting/bowling plan; the heuristic
+    fallback covers gaps.
+    """
+
+    mode = "cricsheet"
+
+    def __init__(
+        self,
+        fallback: OpponentPolicy,
+        rng: random.Random,
+        data_path: str | os.PathLike[str] | None = None,
+    ):
+        self._fallback = fallback
+        self._rng = rng
+        self._index: dict[tuple, list[dict]] = defaultdict(list)
+        path = Path(data_path or _DEFAULT_OUTCOMES_PATH)
+        if path.exists():
+            with path.open("rb") as f:
+                records: list[dict] = pickle.load(f)
+            for rec in records:
+                key = (
+                    rec.get("phase", "middle"),
+                    _wickets_band(rec.get("wickets_before", 0)),
+                    rec.get("innings_type", "first"),
+                )
+                self._index[key].append(rec)
+
+    def _sample(self, context: dict[str, Any]) -> dict[str, Any] | None:
+        phase = context.get("phase", "middle").lower()
+        wickets = int(context.get("wickets", 0))
+        innings_type = context.get("innings_type", "first")
+        key = (phase, _wickets_band(wickets), innings_type)
+        bucket = self._index.get(key)
+        if not bucket:
+            # Widen: drop innings_type
+            bucket = self._index.get((phase, _wickets_band(wickets), "first"))
+        if not bucket:
+            # Widen further: drop wickets band
+            all_phase = [r for k, v in self._index.items() if k[0] == phase for r in v]
+            bucket = all_phase or None
+        return self._rng.choice(bucket) if bucket else None
+
+    def batting_plan(self, context: dict[str, Any]) -> dict[str, Any]:
+        rec = self._sample(context)
+        if rec is None:
+            return self._fallback.batting_plan(context)
+        runs = rec.get("runs_batter", 0)
+        wicket = rec.get("wicket", False)
+        if wicket:
+            intent = self._rng.choice(["boundary", "six"]) # risky shot that got out
+        else:
+            intent = _RUNS_TO_INTENT.get(runs, "single" if runs > 0 else "defensive")
+        aggression = _intent_aggression(intent)
+        return {
+            "shot_intent": intent,
+            "target_area": self._rng.choice(_zones_for_intent(intent)),
+            "risk": "high" if intent in {"boundary", "six"} else "balanced",
+            "aggression": aggression,
+            "batter_role": f"cricsheet_{rec.get('phase','mid')}",
+            "rationale": (
+                f"Cricsheet sample: {runs}r wicket={wicket} "
+                f"phase={rec.get('phase')} wkts={rec.get('wickets_before')}."
+            ),
+        }
+
+    def bowling_plan(self, context: dict[str, Any]) -> dict[str, Any]:
+        rec = self._sample(context)
+        if rec is None:
+            return self._fallback.bowling_plan(context)
+        over = int(context.get("over", 0))
+        max_overs = context.get("max_overs")
+        bowler_type = rec.get("bowler_type") or "pace"
+        strategy = get_bowling_strategy(over, max_overs)
+        return {
+            "bowler_type": bowler_type,
+            "line": strategy.get("line", "outside off"),
+            "length": strategy.get("length", "good length"),
+            "delivery_type": strategy.get("delivery_type", "stock"),
+            "field_setting": strategy.get("field_setting", "Balanced"),
+            "bowler_role": f"cricsheet_{bowler_type}",
+            "rationale": (
+                f"Cricsheet sample: {bowler_type} bowler "
+                f"phase={rec.get('phase')} wkts={rec.get('wickets_before')}."
+            ),
+        }
+
+    def reflect_after_ball(self, context: dict[str, Any], outcome: dict[str, Any]) -> dict[str, Any]:
+        return self._fallback.reflect_after_ball(context, outcome)
+
+
+def _wickets_band(wickets: int) -> str:
+    if wickets <= 2:
+        return "0-2"
+    if wickets <= 5:
+        return "3-5"
+    return "6-9"
+
+
+def _intent_aggression(intent: str) -> float:
+    return {"leave": 0.05, "defensive": 0.15, "single": 0.35, "rotate": 0.45, "boundary": 0.75, "six": 0.90}.get(intent, 0.40)
+
+
+def create_opponent_policy(
+    mode: str,
+    rng: random.Random,
+    cache_path: str | None = None,
+    cricsheet_data_path: str | None = None,
+) -> OpponentPolicy:
     fallback = HeuristicOpponentPolicy(rng=rng)
     if mode == "llm_live":
         return LLMOpponentPolicy(fallback=fallback)
     if mode == "llm_cached" and cache_path:
         return CachedOpponentPolicy(cache_path=cache_path, fallback=fallback)
+    if mode == "cricsheet":
+        return CricsheetOpponentPolicy(fallback=fallback, rng=rng, data_path=cricsheet_data_path)
     return fallback
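
To make the `runs_batter` to `shot_intent` mapping in `batting_plan` concrete, here is a standalone sketch that copies the two lookup tables from the diff and wraps the branch logic in a hypothetical helper, `intent_for`. It leaves out `_zones_for_intent` and the rest of the plan dict, which are not reproduced here.

```python
import random

# Lookup tables copied from the diff.
RUNS_TO_INTENT = {0: "defensive", 1: "single", 2: "rotate", 3: "rotate", 4: "boundary", 6: "six"}
AGGRESSION = {"leave": 0.05, "defensive": 0.15, "single": 0.35,
              "rotate": 0.45, "boundary": 0.75, "six": 0.90}


def intent_for(runs: int, wicket: bool, rng: random.Random) -> str:
    """Hypothetical helper reproducing the intent branch of batting_plan."""
    if wicket:
        # A dismissal is treated as a risky attacking shot.
        return rng.choice(["boundary", "six"])
    # Unlisted run counts (for example 5) fall back to "single" when positive.
    return RUNS_TO_INTENT.get(runs, "single" if runs > 0 else "defensive")


rng = random.Random(42)
table = {runs: intent_for(runs, False, rng) for runs in range(7)}
for runs, intent in table.items():
    print(f"runs={runs} intent={intent:<9} aggression={AGGRESSION.get(intent, 0.40):.2f}")

dismissal_intent = intent_for(0, True, rng)
print("wicket ->", dismissal_intent)
```

Note that the mapping infers intent from the outcome of a real delivery, so a defended ball that ran away for four is replayed as a "boundary" intent; that is a property of the approach rather than of this sketch.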
train.py CHANGED
@@ -909,7 +909,7 @@ def main():
     smoke.add_argument("--max-steps", type=int, default=240, dest="max_steps")
     smoke.add_argument("--log-steps", type=int, default=30, dest="log_steps")
     smoke.add_argument("--eval-pack-id", default=None, dest="eval_pack_id")
-    smoke.add_argument("--opponent-mode", default=None, choices=["heuristic", "llm_live", "llm_cached"], dest="opponent_mode")
+    smoke.add_argument("--opponent-mode", default=None, choices=["heuristic", "llm_live", "llm_cached", "cricsheet"], dest="opponent_mode")
     smoke.add_argument("--opponent-cache-path", default=None, dest="opponent_cache_path")
     smoke.add_argument("--output", default=None)
     smoke.add_argument("--seed", type=int, default=42)