t22000t committed on
Commit
16eaadc
·
0 Parent(s):

Initial commit: optcg-deck-builder Gradio Space

Auto-generates legal 50-card OPTCG decks anchored on a chosen Leader.
Same color-legality + family-bonus synergy ranker as the explorer
Space, layered with three cost-curve presets (aggro/midrange/control)
and a 4-copies-per-card cap. Always produces exactly 50 cards: cost
buckets fill first, then a backfill pass picks up any deficit from the
remaining top-synergy candidates. No encoder loaded - the leader's
vector comes from the precomputed corpus matrix, so cold start is
seconds rather than tens of seconds. 30 hermetic tests cover the size
invariant, color legality, copy cap, leader exclusion, style
sensitivity, plain-text export, and the breakdown plots.

.gitignore ADDED
@@ -0,0 +1,36 @@
+ # Python
+ __pycache__/
+ *.py[cod]
+ *.so
+
+ # Virtual environments
+ .venv/
+ venv/
+
+ # Testing
+ .pytest_cache/
+ .coverage
+
+ # Linters
+ .ruff_cache/
+ .mypy_cache/
+
+ # IDEs
+ .vscode/
+ .idea/
+ *.swp
+
+ # OS
+ .DS_Store
+
+ # Environment
+ .env
+ .env.local
+
+ # HuggingFace
+ .cache/huggingface/
+
+ # Claude Code
+ CLAUDE.md
+ .claude/
+ .reliability-mode
README.md ADDED
@@ -0,0 +1,65 @@
+ ---
+ title: OPTCG Deck Builder
+ colorFrom: red
+ colorTo: blue
+ sdk: gradio
+ sdk_version: 5.49.1
+ python_version: "3.12"
+ app_file: app.py
+ pinned: false
+ license: mit
+ short_description: Auto-build legal 50-card One Piece TCG decks
+ ---
+
+ # OPTCG Deck Builder
+
+ Pick any **Leader** from the [optcg-en-card-embeddings](https://huggingface.co/datasets/t22000t/optcg-en-card-embeddings) dataset and the Space generates a legal 50-card OPTCG deck around it. The ranker layers OPTCG deckbuilding rules on top of the Qwen3-Embedding-derived synergy score:
+
+ - **Color legality** - every card shares at least one color with the Leader.
+ - **Family bonus** - cards in the Leader's family (Straw Hat Crew, Marines, etc.) get a +0.10 synergy boost so they outrank merely-similar off-archetype options.
+ - **Copy cap** - each card appears at most 4 times (the OPTCG standard).
+ - **Cost curve targeting** - one of three style presets shapes how slots are distributed across costs.
+
+ Sister Space: [OPTCG Card Explorer](https://huggingface.co/spaces/t22000t/optcg-explorer) - semantic search, UMAP scatter, similar-cards browser.
+
+ ## Style presets
+
+ Each preset is a target distribution that sums to 50 cards. The builder fills cost buckets in order, and any deficit at the end spills into a backfill pass over remaining top-synergy cards (so the total is *always* 50).
+
+ | Cost   | aggro | midrange | control |
+ |--------|------:|---------:|--------:|
+ | 1      |     4 |        0 |       0 |
+ | 2      |    12 |        6 |       4 |
+ | 3      |    12 |       10 |       8 |
+ | 4      |     8 |       10 |       8 |
+ | 5      |     6 |        8 |       8 |
+ | 6      |     4 |        8 |       8 |
+ | 7      |     2 |        4 |       6 |
+ | 8+     |     2 |        4 |       8 |
+
+ ## What the Space does *not* do
+
+ - No archetype/strategy detection (it does not know that, say, "Monkey.D.Luffy / OP01-001" is the aggro leader).
+ - No banlist or meta awareness.
+ - No card art (this is a structured-data / text-only project by design - see [parent project](https://github.com/timothy22000/optcg-cards) for the IP rationale).
+ - The result is a sketch you tweak, not a tournament-ready list.
+
+ ## Configuration
+
+ - `HF_TOKEN` (Space secret) - required only while the source dataset stays private.
+
+ ## Development
+
+ ```bash
+ pip install -r requirements.txt
+ export HF_TOKEN=hf_...  # only if the source dataset is private
+ python app.py
+ ```
+
+ ```bash
+ pytest -v  # hermetic tests; network-marked tests are deselected by default
+ ```
+
+ ## License
+
+ MIT. Card data via the [vegapull](https://github.com/Coko7/vegapull) scraper. Not affiliated with Bandai or the One Piece Card Game.
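Note on the README preset table: the claim that each preset sums to 50 can be checked mechanically. The sketch below copies the three target distributions from the table and asserts the totals; `curve_total` and `mean_target_cost` are illustrative helper names that do not appear in the commit, and treating the 8+ bucket as cost 8 is an assumption made only for the average.

```python
# Preset targets copied from the README table (cost bucket -> card count).
COST_CURVES = {
    "aggro":    {1: 4, 2: 12, 3: 12, 4: 8, 5: 6, 6: 4, 7: 2, 8: 2},
    "midrange": {1: 0, 2: 6, 3: 10, 4: 10, 5: 8, 6: 8, 7: 4, 8: 4},
    "control":  {1: 0, 2: 4, 3: 8, 4: 8, 5: 8, 6: 8, 7: 6, 8: 8},
}
DECK_SIZE = 50


def curve_total(curve):
    """Number of deck slots a preset asks for."""
    return sum(curve.values())


def mean_target_cost(curve):
    """Average cost implied by a preset (assumption: the 8+ bucket counts as 8)."""
    return sum(cost * n for cost, n in curve.items()) / curve_total(curve)


for style, curve in COST_CURVES.items():
    # Every preset must account for exactly DECK_SIZE slots.
    assert curve_total(curve) == DECK_SIZE, style
    print(f"{style:8s} total={curve_total(curve)} mean_cost={mean_target_cost(curve):.2f}")
```

A check like this is cheap to keep next to the presets, since a typo in a single bucket would otherwise only show up as an unexpectedly large backfill pass.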
app.py ADDED
@@ -0,0 +1,391 @@
+ """OPTCG Deck Builder - Gradio Space.
+
+ Auto-generates a 50-card OPTCG deck anchored on a chosen Leader. The
+ ranker is the same color-legality + family-bonus synergy used in the
+ explorer Space; the deck builder layers a target cost curve on top
+ (aggro/midrange/control presets) and a 4-copies-per-card cap.
+
+ No embedding model is loaded - the leader's vector is read directly
+ from the precomputed corpus matrix. That keeps cold start to seconds
+ and the Space lightweight on a free CPU runner.
+
+ Data source: https://huggingface.co/datasets/t22000t/optcg-en-card-embeddings
+ """
+
+ from __future__ import annotations
+
+ import logging
+ import os
+ from typing import Any
+
+ import gradio as gr
+
+ from spaceutil.data import load_corpus
+ from spaceutil.deck import COST_CURVES, Deck, build_deck, deck_to_text
+ from spaceutil.plot import (
+     build_color_breakdown_figure,
+     build_cost_curve_figure,
+     build_type_breakdown_figure,
+ )
+
+ logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(name)s: %(message)s")
+ logger = logging.getLogger("optcg-deck-builder")
+
+
+ # ----------------------------------------------------------------------------
+ # Startup
+ # ----------------------------------------------------------------------------
+
+ logger.info("Loading corpus from HF Hub...")
+ CARDS, MATRIX, EMBED_PROV, ID_TO_IDX = load_corpus(token=os.environ.get("HF_TOKEN"))
+ logger.info("Corpus loaded: %d cards, matrix shape %s", len(CARDS), MATRIX.shape)
+
+ LEADER_CHOICES = sorted(
+     f"{c['name']} ({c['id']})"
+     for c in CARDS
+     if c.get("card_type") == "Leader"
+ )
+
+ N_CARDS = len(CARDS)
+ N_SETS = len({c.get("set_code") for c in CARDS if c.get("set_code")})
+ N_LEADERS = sum(1 for c in CARDS if c.get("card_type") == "Leader")
+ LATEST_SET = max((c.get("set_code") or "" for c in CARDS), default="?")
+ EMBEDDING_DIM = MATRIX.shape[1]
+ STYLE_OPTIONS = sorted(COST_CURVES.keys())
+
+
+ # ----------------------------------------------------------------------------
+ # Display helpers
+ # ----------------------------------------------------------------------------
+
+ DECK_HEADERS = [
+     "Qty", "ID", "Name", "Cost", "Type", "Synergy", "Family", "Colors", "Set",
+ ]
+
+
+ def deck_to_rows(deck: Deck | None) -> list[list[Any]]:
+     if deck is None:
+         return []
+     return [
+         [
+             dc.quantity,
+             dc.card_id,
+             dc.name,
+             dc.cost if dc.cost is not None else "-",
+             dc.card_type,
+             round(dc.synergy_score, 4),
+             "yes" if dc.family_match else "no",
+             ", ".join(dc.colors),
+             dc.set_code,
+         ]
+         for dc in deck.cards
+     ]
+
+
+ LEADER_DETAIL_FIELDS = (
+     ("card_type", "Type"),
+     ("colors", "Colors"),
+     ("life", "Life"),
+     ("attribute", "Attribute"),
+     ("family", "Family"),
+     ("rarity", "Rarity"),
+     ("set_name", "Set"),
+     ("effect_text", "Effect"),
+ )
+
+
+ def _fmt_value(value: Any) -> str:
+     if value is None or value == "":
+         return "-"
+     if isinstance(value, list):
+         return ", ".join(str(v) for v in value) if value else "-"
+     return str(value)
+
+
+ def format_leader_detail(card: dict[str, Any]) -> str:
+     lines = [f"### {card.get('name', '?')}\n`{card.get('id', '?')}`\n"]
+     for key, label in LEADER_DETAIL_FIELDS:
+         lines.append(f"**{label}:** {_fmt_value(card.get(key))}")
+     return "\n\n".join(lines)
+
+
+ def format_summary(deck: Deck | None) -> str:
+     if deck is None:
+         return "*Pick a leader and click Build deck.*"
+     lines = [
+         f"**Total cards:** {deck.total_quantity} / 50",
+         f"**Average cost:** {deck.avg_cost:.2f}",
+         f"**Style:** {deck.style}",
+         f"**Family-match cards:** {deck.family_match_count} / {deck.total_quantity}",
+         f"**Unique cards:** {len(deck.cards)}",
+     ]
+     return "\n\n".join(lines)
+
+
+ # ----------------------------------------------------------------------------
+ # Event handlers
+ # ----------------------------------------------------------------------------
+
+
+ def _selection_to_idx(selection: str) -> int | None:
+     if not selection:
+         return None
+     if "(" in selection and selection.endswith(")"):
+         card_id = selection.rsplit("(", 1)[1][:-1]
+     else:
+         card_id = selection
+     return ID_TO_IDX.get(card_id)
+
+
+ def on_leader_change(selection: str):
+     idx = _selection_to_idx(selection)
+     if idx is None:
+         return "*Pick a Leader to see its details.*"
+     return format_leader_detail(CARDS[idx])
+
+
+ def on_build(selection: str, style: str, max_copies: int):
+     idx = _selection_to_idx(selection)
+     if idx is None:
+         empty_fig = build_cost_curve_figure(None)
+         return (
+             "*Pick a Leader first.*",
+             "*Pick a Leader first.*",
+             empty_fig,
+             build_type_breakdown_figure(None),
+             build_color_breakdown_figure(None),
+             [],
+             "",
+         )
+     deck = build_deck(idx, CARDS, MATRIX, style=style, max_copies=int(max_copies))
+     return (
+         format_leader_detail(CARDS[idx]),
+         format_summary(deck),
+         build_cost_curve_figure(deck),
+         build_type_breakdown_figure(deck),
+         build_color_breakdown_figure(deck),
+         deck_to_rows(deck),
+         deck_to_text(deck),
+     )
+
+
+ # ----------------------------------------------------------------------------
+ # UI
+ # ----------------------------------------------------------------------------
+
+ CUSTOM_CSS = """
+ .gradio-container { max-width: 1280px !important; margin: 0 auto !important; }
+ #header-row h1 { margin-bottom: 0.25em; }
+ #header-row .subtitle { color: var(--body-text-color-subdued); margin-top: 0; }
+ .stats-pill {
+     display: inline-block;
+     padding: 4px 10px;
+     margin: 2px 4px 2px 0;
+     border-radius: 12px;
+     background: var(--background-fill-secondary);
+     border: 1px solid var(--border-color-primary);
+     font-size: 0.85em;
+ }
+ .muted { color: var(--body-text-color-subdued); font-size: 0.9em; }
+ """
+
+ INSTRUCTIONS_MD = f"""
+ **How to build a deck**
+
+ 1. **Pick a Leader** from the dropdown (~{N_LEADERS} leaders in the corpus). The Leader anchors color identity, archetype, and the synergy scoring.
+ 2. **Choose a style**:
+    - `aggro` - cost curve weighted to 1-3, flood the early board.
+    - `midrange` - 3-6 cost dominant, the safe default.
+    - `control` - 4-8 cost weighted, bigger threats and fewer turns to defend.
+ 3. **Set max copies** (1-4). Standard OPTCG rules cap at 4. Lowering the cap forces more variety.
+ 4. Click **Build deck**. You get a 50-card list, a cost-curve check against the target preset, and type/color breakdowns.
+ 5. **Export** the plain-text deck list at the bottom and paste into your sim of choice.
+
+ **What the builder does**
+
+ It scores every color-legal candidate by `cosine_similarity(leader, card) + family_bonus`, then walks the chosen cost curve bucket by bucket, taking top-synergy cards (up to `max_copies` each) until each bucket is filled. If a bucket is short, the deficit spills into a backfill pass that takes the remaining best-synergy cards regardless of cost - the deck total is *always* exactly 50.
+
+ **What it does not do (yet)**
+
+ - No archetype/strategy detection (it doesn't know whether your leader is "the rush leader" or "the control leader").
+ - No banlist or competitive-meta awareness.
+ - No DON!! deck (always 10, hardcoded across the game).
+ - The result is a *starting point* for tweaking, not a tournament-ready list.
+ """
+
+ ABOUT_MD = f"""
+ ### How synergy is scored
+
+ Each card gets a score of `cosine_similarity(leader_vector, card_vector) + (0.10 if same_family else 0)`. Vectors come from `Qwen/Qwen3-Embedding-0.6B` ({EMBEDDING_DIM}-dim, L2-normalized) on the published [optcg-en-card-embeddings](https://huggingface.co/datasets/t22000t/optcg-en-card-embeddings) dataset.
+
+ Color legality is a hard filter (you must share at least one color with the leader). Other Leader cards are dropped.
+
+ ### Why styles matter
+
+ Two decks built around the same leader can play very differently depending on cost distribution. The presets are deliberately blunt - they're starting shapes, not optimized curves:
+
+ - aggro: 4-12-12-8-6-4-2-2 (sum 50)
+ - midrange: 0-6-10-10-8-8-4-4
+ - control: 0-4-8-8-8-8-6-8
+
+ ### Source
+
+ Card data from [vegapull](https://github.com/Coko7/vegapull) scraping the official One Piece Card Game site. Pipeline: [github.com/timothy22000/optcg-cards](https://github.com/timothy22000/optcg-cards). Sister demo: [OPTCG Card Explorer](https://huggingface.co/spaces/t22000t/optcg-explorer) (semantic search + UMAP + similar-cards browser).
+
+ Not affiliated with Bandai or the One Piece Card Game.
+ """
+
+
+ with gr.Blocks(
+     title="OPTCG Deck Builder",
+     theme=gr.themes.Soft(primary_hue="red", secondary_hue="blue"),
+     css=CUSTOM_CSS,
+ ) as demo:
+     # ----- Header -----
+     with gr.Row(elem_id="header-row"):
+         gr.Markdown(
+             f"""# OPTCG Deck Builder
+ <p class="subtitle">Auto-generate a legal 50-card One Piece Card Game deck anchored on any Leader.</p>
+
+ <div>
+   <span class="stats-pill"><b>{N_CARDS}</b> cards</span>
+   <span class="stats-pill"><b>{N_LEADERS}</b> leaders</span>
+   <span class="stats-pill"><b>{N_SETS}</b> sets</span>
+   <span class="stats-pill">latest <b>{LATEST_SET}</b></span>
+   <span class="stats-pill">3 styles</span>
+ </div>
+
+ <p class="muted">Dataset: <a href="https://huggingface.co/datasets/t22000t/optcg-en-card-embeddings" target="_blank">t22000t/optcg-en-card-embeddings</a> &nbsp;&middot;&nbsp; Code: <a href="https://github.com/timothy22000/optcg-cards" target="_blank">github.com/timothy22000/optcg-cards</a> &nbsp;&middot;&nbsp; Sister: <a href="https://huggingface.co/spaces/t22000t/optcg-explorer" target="_blank">OPTCG Card Explorer</a></p>
+ """
+         )
+
+     # ----- Instructions -----
+     with gr.Accordion("How to use this Space", open=True):
+         gr.Markdown(INSTRUCTIONS_MD)
+
+     # ----- Controls -----
+     with gr.Row():
+         leader_picker = gr.Dropdown(
+             choices=LEADER_CHOICES,
+             label="Leader",
+             value=None,
+             allow_custom_value=False,
+             filterable=True,
+             info="Type to filter by name or card ID.",
+             scale=4,
+         )
+         style_picker = gr.Dropdown(
+             choices=STYLE_OPTIONS,
+             value="midrange",
+             label="Style",
+             scale=1,
+         )
+         max_copies_slider = gr.Slider(
+             minimum=1, maximum=4, value=4, step=1,
+             label="Max copies",
+             scale=1,
+         )
+         build_btn = gr.Button("Build deck", variant="primary", scale=1)
+
+     # ----- Leader detail + summary -----
+     with gr.Row():
+         leader_detail_md = gr.Markdown(
+             "*Pick a Leader to see its details.*",
+             label="Leader",
+         )
+         summary_md = gr.Markdown(
+             "*Pick a Leader and click Build deck.*",
+             label="Deck summary",
+         )
+
+     # ----- Charts -----
+     with gr.Row():
+         cost_curve_plot = gr.Plot(
+             value=build_cost_curve_figure(None),
+             label="Cost curve",
+         )
+     with gr.Row():
+         with gr.Column():
+             type_plot = gr.Plot(
+                 value=build_type_breakdown_figure(None),
+                 label="Type mix",
+             )
+         with gr.Column():
+             color_plot = gr.Plot(
+                 value=build_color_breakdown_figure(None),
+                 label="Color mix",
+             )
+
+     # ----- Deck list -----
+     gr.Markdown("### Deck list (sorted by cost, then synergy)")
+     deck_df = gr.Dataframe(
+         headers=DECK_HEADERS,
+         value=[],
+         label="Cards",
+         interactive=False,
+         wrap=True,
+         column_widths=["6%", "11%", "26%", "6%", "11%", "10%", "8%", "14%", "8%"],
+     )
+
+     # ----- Export -----
+     with gr.Accordion("Plain-text export", open=False):
+         gr.Markdown(
+             "Copy this and paste into your sim of choice. Format: one card per line, "
+             "`<qty>x <ID> <Name>`."
+         )
+         export_text = gr.Textbox(
+             value="",
+             label="Deck text",
+             lines=12,
+             max_lines=20,
+             show_copy_button=True,
+             interactive=False,
+         )
+
+     # ----- About -----
+     with gr.Accordion("About this Space", open=False):
+         gr.Markdown(ABOUT_MD)
+
+     gr.Markdown(
+         '<div class="muted" style="text-align:center; padding:12px 0;">'
+         'Built with <a href="https://gradio.app" target="_blank">Gradio</a>. '
+         'Embeddings: Qwen/Qwen3-Embedding-0.6B. '
+         'Card data via <a href="https://github.com/Coko7/vegapull" target="_blank">vegapull</a>. '
+         'Not affiliated with Bandai or the One Piece Card Game.'
+         '</div>'
+     )
+
+     # ----- Wiring -----
+     leader_picker.change(
+         on_leader_change,
+         inputs=[leader_picker],
+         outputs=[leader_detail_md],
+     )
+     build_outputs = [
+         leader_detail_md,
+         summary_md,
+         cost_curve_plot,
+         type_plot,
+         color_plot,
+         deck_df,
+         export_text,
+     ]
+     build_btn.click(
+         on_build,
+         inputs=[leader_picker, style_picker, max_copies_slider],
+         outputs=build_outputs,
+     )
+     style_picker.change(
+         on_build,
+         inputs=[leader_picker, style_picker, max_copies_slider],
+         outputs=build_outputs,
+     )
+     max_copies_slider.change(
+         on_build,
+         inputs=[leader_picker, style_picker, max_copies_slider],
+         outputs=build_outputs,
+     )
+
+
+ if __name__ == "__main__":
+     demo.launch()
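Note on the scoring formula quoted in `ABOUT_MD`: the score is a cosine similarity plus a flat family bonus, with color legality applied as a hard filter. The actual ranker (`recommend_synergy` in `spaceutil/synergy.py`) is not part of the lines shown here, so the following is only a minimal stdlib sketch of that description. The function names, the toy card dicts, and the rule that "same family" means sharing at least one family entry are all assumptions.

```python
import math

FAMILY_BONUS = 0.10  # value quoted in the README and the About text


def cosine_similarity(a, b):
    """Plain cosine similarity; for L2-normalized vectors this is just the dot product."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def synergy_score(leader, card, leader_vec, card_vec):
    """Return None for color-illegal cards, otherwise cosine + family bonus."""
    if not set(leader["colors"]) & set(card["colors"]):
        return None  # hard filter: the card must share a color with the leader
    # Assumption: a family match means at least one shared family entry.
    same_family = bool(set(leader["family"]) & set(card["family"]))
    # The conditional is parenthesized so the bonus is added to the cosine
    # term; without parentheses Python would apply `if/else` to the whole sum.
    return cosine_similarity(leader_vec, card_vec) + (FAMILY_BONUS if same_family else 0.0)


# Toy data, not taken from the real corpus.
leader = {"colors": ["Red"], "family": ["Straw Hat Crew"]}
ally = {"colors": ["Red", "Green"], "family": ["Straw Hat Crew"]}
off_color = {"colors": ["Blue"], "family": ["Straw Hat Crew"]}

print(synergy_score(leader, ally, [1.0, 0.0], [1.0, 0.0]))
print(synergy_score(leader, off_color, [1.0, 0.0], [1.0, 0.0]))
```

Because the bonus is a constant, it only reorders cards whose cosine scores are within 0.10 of each other, which matches the README's description of family cards outranking "merely-similar" options.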
pyproject.toml ADDED
@@ -0,0 +1,14 @@
+ [tool.pytest.ini_options]
+ testpaths = ["tests"]
+ addopts = "-ra -q --strict-markers -m 'not network'"
+ markers = [
+     "network: tests that hit the network (opt-in via `-m network`)",
+ ]
+
+ [tool.ruff]
+ line-length = 100
+ target-version = "py310"
+
+ [tool.ruff.lint]
+ select = ["E", "F", "W", "I", "B", "UP", "SIM"]
+ ignore = ["E501"]
requirements.txt ADDED
@@ -0,0 +1,4 @@
+ gradio>=5.0,<6
+ huggingface_hub>=0.30
+ plotly>=5.20
+ optcg-cards @ git+https://github.com/timothy22000/optcg-cards@v0.1.0
spaceutil/__init__.py ADDED
File without changes
spaceutil/data.py ADDED
@@ -0,0 +1,86 @@
+ """Load the published OPTCG embeddings corpus from HF Hub.
+
+ Pulls `cards_with_embeddings.parquet` and `provenance.json` from the
+ configured dataset repo, applies the same numpy-array-to-list coercion
+ that the upstream CLI uses, and stacks the embedding column into a
+ single float32 matrix that downstream code reuses without restacking.
+ """
+
+ from __future__ import annotations
+
+ import logging
+ from pathlib import Path
+ from typing import Any
+
+ import numpy as np
+ import pandas as pd
+ from huggingface_hub import hf_hub_download
+ from optcg_cards.provenance import EmbedProvenance, read_provenance
+
+ logger = logging.getLogger(__name__)
+
+ REPO_ID = "t22000t/optcg-en-card-embeddings"
+ PARQUET_FILE = "cards_with_embeddings.parquet"
+ PROVENANCE_FILE = "provenance.json"
+
+
+ def load_corpus(
+     token: str | None,
+ ) -> tuple[list[dict[str, Any]], np.ndarray, EmbedProvenance, dict[str, int]]:
+     """Return `(cards, matrix, embed_provenance, id_to_idx)` for the
+     published embeddings corpus.
+
+     The `embedding` column is stripped from `cards` after stacking into
+     `matrix`. All list-typed columns are coerced to plain Python lists.
+     The token is passed to `hf_hub_download` but never written to logs.
+     """
+     logger.info(
+         "Loading corpus from %s (authenticated=%s)",
+         REPO_ID,
+         "yes" if token else "no",
+     )
+     parquet_path = hf_hub_download(
+         repo_id=REPO_ID,
+         filename=PARQUET_FILE,
+         repo_type="dataset",
+         token=token,
+     )
+     prov_path = hf_hub_download(
+         repo_id=REPO_ID,
+         filename=PROVENANCE_FILE,
+         repo_type="dataset",
+         token=token,
+     )
+
+     cards = _read_parquet_records(Path(parquet_path))
+     if not cards:
+         raise RuntimeError("Embeddings parquet returned 0 rows")
+
+     matrix = np.stack(
+         [np.asarray(c["embedding"], dtype=np.float32) for c in cards],
+         axis=0,
+     )
+
+     for card in cards:
+         card.pop("embedding", None)
+
+     id_to_idx = {card["id"]: i for i, card in enumerate(cards)}
+
+     _, embed_prov = read_provenance(Path(prov_path))
+     if embed_prov is None:
+         raise RuntimeError("Embeddings provenance is missing the `embed` block")
+
+     return cards, matrix, embed_prov, id_to_idx
+
+
+ def _read_parquet_records(path: Path) -> list[dict[str, Any]]:
+     # Mirrors the coercion loop in optcg_cards.cli._read_parquet
+     # (cli.py:429-443). Pandas materializes list-typed parquet columns
+     # as ndarrays; downstream code expects plain Python lists.
+     df = pd.read_parquet(str(path))
+     records = df.to_dict(orient="records")
+     for record in records:
+         for key, value in record.items():
+             if isinstance(value, np.ndarray):
+                 record[key] = value.tolist()
+     return records
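Note on `load_corpus`: the core of the function is a "stack, strip, index" step that separates the embedding column from the card metadata and builds the id-to-row lookup used by the UI. The sketch below restates that step with plain Python lists so it runs without numpy, pandas, or network access. `split_embeddings` is a hypothetical name, the records are toy data, and unlike the committed code it copies the records instead of popping the key in place.

```python
def split_embeddings(records):
    """Split card records into (cards, matrix, id_to_idx).

    `matrix[i]` is the embedding of `cards[i]`, so one integer index
    addresses both the metadata and the vector.
    """
    if not records:
        raise RuntimeError("corpus returned 0 rows")
    # Stand-in for np.stack(..., dtype=float32): one row of floats per card.
    matrix = [[float(x) for x in record["embedding"]] for record in records]
    # Drop the bulky embedding from the metadata once it lives in the matrix.
    cards = [{k: v for k, v in record.items() if k != "embedding"} for record in records]
    id_to_idx = {card["id"]: i for i, card in enumerate(cards)}
    return cards, matrix, id_to_idx


# Toy records; ids and vectors are made up for illustration.
records = [
    {"id": "X-001", "name": "Example Leader", "embedding": [1, 0]},
    {"id": "X-002", "name": "Example Character", "embedding": [0, 1]},
]
cards, matrix, id_to_idx = split_embeddings(records)
print(cards[id_to_idx["X-002"]]["name"], matrix[id_to_idx["X-002"]])
```

Keeping the row order of `cards` and `matrix` identical is what lets `app.py` read the leader's vector straight from the matrix by index instead of loading an encoder.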
spaceutil/deck.py ADDED
@@ -0,0 +1,263 @@
1
+ """Auto-build a 50-card OPTCG deck around a chosen Leader.
2
+
3
+ Algorithm
4
+ ---------
5
+
6
+ 1. Score every color-legal candidate via `recommend_synergy`
7
+ (cosine_similarity to leader + family bonus). Other Leader cards
8
+ and the chosen leader itself are excluded by the synergy step.
9
+
10
+ 2. Walk a *target cost curve* (chosen by `style`) bucket by bucket.
11
+ For each bucket, take top-synergy cards in that cost slot, assigning
12
+ up to `max_copies` per card, until the bucket is filled.
13
+
14
+ 3. If a cost bucket has insufficient candidates (rare in real corpora,
15
+ common in narrow synthetic fixtures), the deficit spills into a
16
+ final backfill pass that consumes the highest-synergy remaining
17
+ cards regardless of cost. Backfill respects the per-card copy cap.
18
+
19
+ The result is always exactly 50 cards. Color legality and the copy cap
20
+ are hard invariants enforced at every step.
21
+
22
+ Cost curve presets
23
+ ------------------
24
+
25
+ - aggro: weighted to 1-3 cost - flood the early board.
26
+ - midrange: 3-6 cost dominant - the safe default.
27
+ - control: 4-8 cost weighted - bigger threats, less early presence.
28
+
29
+ Each preset is a dict[int, int] summing to exactly 50.
30
+ """
31
+
32
+ from __future__ import annotations
33
+
34
+ from collections import defaultdict
35
+ from dataclasses import dataclass, field
36
+ from typing import Any
37
+
38
+ import numpy as np
39
+
40
+ from spaceutil.synergy import recommend_synergy
41
+
42
+ DEFAULT_DECK_SIZE = 50
43
+ DEFAULT_MAX_COPIES = 4
44
+
45
+ # Each preset must sum to exactly DEFAULT_DECK_SIZE. The cost-8 bucket
46
+ # captures everything 8+ (8, 9, 10, ...).
47
+ COST_CURVES: dict[str, dict[int, int]] = {
48
+ "aggro": {1: 4, 2: 12, 3: 12, 4: 8, 5: 6, 6: 4, 7: 2, 8: 2},
49
+ "midrange": {1: 0, 2: 6, 3: 10, 4: 10, 5: 8, 6: 8, 7: 4, 8: 4},
50
+ "control": {1: 0, 2: 4, 3: 8, 4: 8, 5: 8, 6: 8, 7: 6, 8: 8},
51
+ }
52
+
53
+
54
+ @dataclass(frozen=True)
55
+ class DeckCard:
56
+ card_id: str
57
+ name: str
58
+ quantity: int
59
+ cost: int | None
60
+ card_type: str
61
+ colors: list[str]
62
+ family: list[str]
63
+ rarity: str
64
+ set_code: str
65
+ synergy_score: float
66
+ family_match: bool
67
+
68
+
69
+ @dataclass(frozen=True)
70
+ class Deck:
71
+ leader: dict[str, Any]
72
+ cards: list[DeckCard]
73
+ style: str
74
+ target_curve: dict[int, int] = field(default_factory=dict)
75
+
76
+ @property
77
+ def total_quantity(self) -> int:
78
+ return sum(c.quantity for c in self.cards)
79
+
80
+ @property
81
+ def total_cost(self) -> int:
82
+ return sum((c.cost or 0) * c.quantity for c in self.cards)
83
+
84
+ @property
85
+ def avg_cost(self) -> float:
86
+ # Average is over the cards that actually have a cost (Stages
87
+ # without cost are excluded from the denominator). For typical
88
+ # OPTCG decks ~all cards have cost, so this matches intuition.
89
+ priced = [c for c in self.cards if c.cost is not None]
90
+ total_qty = sum(c.quantity for c in priced)
91
+ if total_qty == 0:
92
+ return 0.0
93
+ return sum((c.cost or 0) * c.quantity for c in priced) / total_qty
94
+
95
+ @property
96
+ def cost_distribution(self) -> dict[int, int]:
97
+ dist: dict[int, int] = {}
98
+ for c in self.cards:
99
+ if c.cost is None:
100
+ continue
101
+ bucket = min(int(c.cost), 8)
102
+ dist[bucket] = dist.get(bucket, 0) + c.quantity
103
+ return dist
104
+
105
+ @property
106
+ def type_distribution(self) -> dict[str, int]:
107
+ dist: dict[str, int] = {}
108
+ for c in self.cards:
109
+ dist[c.card_type] = dist.get(c.card_type, 0) + c.quantity
110
+ return dist
111
+
112
+ @property
113
+ def color_distribution(self) -> dict[str, int]:
114
+ dist: dict[str, int] = {}
115
+ for c in self.cards:
116
+ for color in c.colors or ["?"]:
117
+ dist[color] = dist.get(color, 0) + c.quantity
118
+ return dist
119
+
120
+ @property
121
+ def family_match_count(self) -> int:
122
+ return sum(c.quantity for c in self.cards if c.family_match)
123
+
124
+
125
+ def build_deck(
126
+ leader_idx: int,
127
+ cards: list[dict[str, Any]],
128
+ matrix: np.ndarray,
129
+ style: str = "midrange",
130
+ max_copies: int = DEFAULT_MAX_COPIES,
131
+ deck_size: int = DEFAULT_DECK_SIZE,
132
+ ) -> Deck:
133
+ if style not in COST_CURVES:
134
+ raise ValueError(
135
+ f"Unknown style {style!r}. Available: {sorted(COST_CURVES)}"
136
+ )
137
+
138
+ leader = cards[leader_idx]
139
+ if leader.get("card_type") != "Leader":
140
+ raise ValueError(
141
+ f"Card at index {leader_idx} ({leader.get('id')!r}) is not a Leader"
142
+ )
143
+
144
+ # Pull every color-legal candidate (synergy-ranked, leaders/self excluded).
145
+ all_hits = recommend_synergy(leader_idx, cards, matrix, k=len(cards))
146
+
147
+ # Group by cost bucket; cost None goes into a separate "no-cost" pile
148
+ # and is only used during backfill (most OPTCG cards have a cost).
149
+ by_bucket: dict[int, list] = defaultdict(list)
150
+ no_cost: list = []
151
+ for hit in all_hits:
152
+ if hit.cost is None:
153
+ no_cost.append(hit)
154
+ else:
155
+ by_bucket[min(int(hit.cost), 8)].append(hit)
156
+
157
+ target = COST_CURVES[style]
158
+ deck: list[DeckCard] = []
159
+ copies_used: dict[str, int] = defaultdict(int)
160
+ total = 0
161
+
162
+ # Pass 1: fill each bucket from its top-synergy candidates.
163
+ for cost in sorted(target.keys()):
164
+ want = min(target[cost], deck_size - total)
165
+ taken = 0
166
+ for hit in by_bucket.get(cost, []):
167
+ if taken >= want:
168
+ break
169
+ available = max_copies - copies_used[hit.card_id]
170
+ if available <= 0:
171
+ continue
172
+ qty = min(available, want - taken)
173
+ deck.append(_to_deck_card(hit, qty, cards))
174
+ copies_used[hit.card_id] += qty
175
+ taken += qty
176
+ total += qty
177
+
178
+ # Pass 2: backfill any remainder from the highest-synergy candidates
179
+ # not yet at their copy cap, regardless of cost. This is what makes
180
+ # the size invariant hold even when the target curve is unfillable
181
+ # at exact cost slots.
182
+ if total < deck_size:
183
+ for hit in all_hits:
184
+ if total >= deck_size:
185
+ break
186
+ available = max_copies - copies_used[hit.card_id]
187
+ if available <= 0:
188
+ continue
189
+ qty = min(available, deck_size - total)
190
+ # If we've already added this card, bump its quantity rather
191
+ # than appending a duplicate row.
192
+ existing = next((dc for dc in deck if dc.card_id == hit.card_id), None)
193
+ if existing is None:
194
+ deck.append(_to_deck_card(hit, qty, cards))
195
+ else:
196
+ deck[deck.index(existing)] = _bump_quantity(existing, qty)
197
+ copies_used[hit.card_id] += qty
198
+ total += qty
199
+
200
+ # Sort the final deck by cost (asc), then synergy (desc) for nice display.
201
+ deck.sort(key=lambda dc: (dc.cost if dc.cost is not None else 99, -dc.synergy_score))
202
+
203
+ return Deck(
204
+ leader=leader,
205
+ cards=deck,
206
+ style=style,
207
+ target_curve=dict(target),
208
+ )
209
+
210
+
211
+ def _to_deck_card(hit, qty: int, cards: list[dict[str, Any]]) -> DeckCard:
212
+ # `hit` carries most fields; we look up family/rarity from the source
213
+ # card by id since SynergyHit doesn't carry them.
214
+ full = next((c for c in cards if c.get("id") == hit.card_id), None) or {}
215
+ return DeckCard(
216
+ card_id=hit.card_id,
217
+ name=hit.name,
218
+ quantity=qty,
219
+ cost=hit.cost,
220
+ card_type=hit.card_type,
221
+ colors=hit.colors,
222
+ family=list(full.get("family") or []),
223
+ rarity=str(full.get("rarity") or ""),
224
+ set_code=hit.set_code,
225
+ synergy_score=hit.total_score,
226
+ family_match=hit.family_match,
227
+ )
228
+
229
+
230
+ def _bump_quantity(dc: DeckCard, extra: int) -> DeckCard:
231
+ return DeckCard(
232
+ card_id=dc.card_id,
233
+ name=dc.name,
234
+ quantity=dc.quantity + extra,
235
+ cost=dc.cost,
236
+ card_type=dc.card_type,
237
+ colors=dc.colors,
238
+ family=dc.family,
239
+ rarity=dc.rarity,
240
+ set_code=dc.set_code,
241
+ synergy_score=dc.synergy_score,
242
+ family_match=dc.family_match,
243
+ )
244
+
245
+
246
+ def deck_to_text(deck: Deck) -> str:
247
+ """Plain-text deck list. Format mirrors common OPTCG-Sim conventions:
248
+ one card per line, `<qty>x <ID> <Name>`, plus a summary header.
249
+ """
250
+ lines: list[str] = []
251
+ leader = deck.leader
252
+ lines.append("# OPTCG deck")
253
+ lines.append(f"# Style: {deck.style}")
254
+ lines.append(f"# Total: {deck.total_quantity} cards")
255
+ lines.append(f"# Avg cost: {deck.avg_cost:.2f}")
256
+ lines.append("")
257
+ lines.append("## Leader")
258
+ lines.append(f"1x {leader.get('id', '?')} {leader.get('name', '?')}")
259
+ lines.append("")
260
+ lines.append("## Main deck")
261
+ for dc in deck.cards:
262
+ lines.append(f"{dc.quantity}x {dc.card_id} {dc.name}")
263
+ return "\n".join(lines)
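The export layout produced by `deck_to_text` can be sketched in isolation. This is a minimal illustration, not the module's code: `Entry` and `export_lines` are hypothetical names, the card IDs are made up in the style of the test fixtures, and the average-cost header line is left out because its computation is not shown in this file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for DeckCard, holding only the exported fields."""

    quantity: int
    card_id: str
    name: str


def export_lines(leader, entries, style):
    """Build a `<qty>x <ID> <Name>` listing with a summary header."""
    total = sum(e.quantity for e in entries)
    lines = [
        "# OPTCG deck",
        f"# Style: {style}",
        f"# Total: {total} cards",
        "",
        "## Leader",
        # Missing keys fall back to "?" just like the original export.
        f"1x {leader.get('id', '?')} {leader.get('name', '?')}",
        "",
        "## Main deck",
    ]
    lines.extend(f"{e.quantity}x {e.card_id} {e.name}" for e in entries)
    return "\n".join(lines)


# Made-up sample data for illustration only.
sample_entries = [Entry(4, "OP01-004", "Card 4"), Entry(2, "OP01-009", "Card 9")]
sample_text = export_lines({"id": "OP01-003", "name": "Card 3"}, sample_entries, "midrange")
```

Keeping the header lines prefixed with `#` means a consumer that only wants card rows can skip any line that does not start with a digit.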
spaceutil/plot.py ADDED
@@ -0,0 +1,112 @@
1
+ """Plotly figures for the deck-builder Space.
2
+
3
+ Three breakdown charts displayed alongside a generated deck:
4
+ - cost curve (bar): quantities by cost bucket vs. the target curve
5
+ - type breakdown (bar): Character/Event/Stage counts
6
+ - color breakdown (bar): cards per color
7
+
8
+ Color hex values come from the upstream `optcg_cards.visualize._first_color_hex`
9
+ helper, so palettes stay consistent across all OPTCG-related Spaces.
10
+ """
11
+
12
+ from __future__ import annotations
13
+
14
+ from typing import TYPE_CHECKING
15
+
16
+ from optcg_cards.visualize import _first_color_hex
17
+
18
+ if TYPE_CHECKING:
19
+ from spaceutil.deck import Deck
20
+
21
+
22
+ def _empty_fig(message: str, height: int = 240):
23
+ import plotly.graph_objects as go
24
+
25
+ fig = go.Figure()
26
+ fig.add_annotation(
27
+ text=message,
28
+ xref="paper", yref="paper", x=0.5, y=0.5,
29
+ showarrow=False, font=dict(color="gray"),
30
+ )
31
+ fig.update_layout(
32
+ xaxis=dict(visible=False),
33
+ yaxis=dict(visible=False),
34
+ plot_bgcolor="white",
35
+ margin=dict(l=40, r=40, t=40, b=40),
36
+ height=height,
37
+ )
38
+ return fig
39
+
40
+
41
+ def build_cost_curve_figure(deck: Deck | None, height: int = 280):
42
+ """Bar chart comparing the deck's actual cost curve vs. the style target."""
43
+ import plotly.graph_objects as go
44
+
45
+ if deck is None or deck.total_quantity == 0:
46
+ return _empty_fig("Build a deck to see its cost curve.", height=height)
47
+
48
+ actual = deck.cost_distribution
49
+ target = deck.target_curve or {}
50
+ buckets = sorted(set(actual) | set(target))
51
+ labels = [("8+" if b == 8 else str(b)) for b in buckets]
52
+ actual_y = [actual.get(b, 0) for b in buckets]
53
+ target_y = [target.get(b, 0) for b in buckets]
54
+
55
+ fig = go.Figure(data=[
56
+ go.Bar(name="Actual", x=labels, y=actual_y, marker_color="#dc3545"),
57
+ go.Bar(name="Target", x=labels, y=target_y, marker_color="#1f77b4", opacity=0.5),
58
+ ])
59
+ fig.update_layout(
60
+ title="Cost curve: actual vs. target",
61
+ xaxis=dict(title="Cost"),
62
+ yaxis=dict(title="Cards"),
63
+ barmode="group",
64
+ plot_bgcolor="white",
65
+ margin=dict(l=40, r=40, t=60, b=40),
66
+ height=height,
67
+ legend=dict(orientation="h", y=1.0, yanchor="bottom"),
68
+ )
69
+ return fig
70
+
71
+
72
+ def build_type_breakdown_figure(deck: Deck | None, height: int = 240):
73
+ import plotly.graph_objects as go
74
+
75
+ if deck is None or deck.total_quantity == 0:
76
+ return _empty_fig("Build a deck to see its type mix.", height=height)
77
+
78
+ dist = deck.type_distribution
79
+ types = sorted(dist.keys())
80
+ counts = [dist[t] for t in types]
81
+ fig = go.Figure(data=[go.Bar(x=types, y=counts, marker_color="#6c757d")])
82
+ fig.update_layout(
83
+ title="Card type mix",
84
+ xaxis=dict(title=""),
85
+ yaxis=dict(title="Cards"),
86
+ plot_bgcolor="white",
87
+ margin=dict(l=40, r=40, t=60, b=40),
88
+ height=height,
89
+ )
90
+ return fig
91
+
92
+
93
+ def build_color_breakdown_figure(deck: Deck | None, height: int = 240):
94
+ import plotly.graph_objects as go
95
+
96
+ if deck is None or deck.total_quantity == 0:
97
+ return _empty_fig("Build a deck to see its color mix.", height=height)
98
+
99
+ dist = deck.color_distribution
100
+ colors = sorted(dist.keys())
101
+ counts = [dist[c] for c in colors]
102
+ bar_colors = [_first_color_hex([c]) for c in colors]
103
+ fig = go.Figure(data=[go.Bar(x=colors, y=counts, marker_color=bar_colors)])
104
+ fig.update_layout(
105
+ title="Color mix (per copy)",
106
+ xaxis=dict(title=""),
107
+ yaxis=dict(title="Cards (multicolor cards count once per color)"),
108
+ plot_bgcolor="white",
109
+ margin=dict(l=40, r=40, t=60, b=40),
110
+ height=height,
111
+ )
112
+ return fig
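The bucket alignment inside `build_cost_curve_figure` is easier to see without Plotly. The sketch below assumes both curves are plain `dict[int, int]` mappings, as the figure code implies; `align_curves` and its `top_bucket` parameter are hypothetical (the original hard-codes bucket 8 as "8+"), and the sample numbers are invented rather than taken from any real style preset.

```python
def align_curves(actual, target, top_bucket=8):
    """Return (labels, actual_y, target_y) over the union of cost buckets."""
    # Union of keys so a bucket present in only one curve still gets a slot.
    buckets = sorted(set(actual) | set(target))
    labels = [f"{b}+" if b == top_bucket else str(b) for b in buckets]
    # Missing buckets count as zero cards.
    actual_y = [actual.get(b, 0) for b in buckets]
    target_y = [target.get(b, 0) for b in buckets]
    return labels, actual_y, target_y


# Invented sample curves for illustration only.
sample_actual = {1: 10, 2: 14, 8: 2}
sample_target = {1: 8, 2: 12, 3: 10}
labels, actual_y, target_y = align_curves(sample_actual, sample_target)
```

Because both series share one label list, the grouped bars line up even when the deck has no cards in a bucket the target expects.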
spaceutil/synergy.py ADDED
@@ -0,0 +1,126 @@
1
+ """Synergy recommendations anchored on a Leader card.
2
+
3
+ Synergy is not the same as raw cosine similarity. The "Browse / find
4
+ similar" tab gives you mechanically-similar cards (good for swap
5
+ candidates), but two near-identical "low cost red blockers" both
6
+ compete for the same deck slot. Synergy here means: cards that are
7
+ *color-legal* under the chosen leader and that *amplify the leader's
8
+ strategy*, with a small thumb on the scale for cards in the leader's
9
+ family or archetype.
10
+
11
+ Score = cosine_similarity(leader, card) + family_bonus(card)
12
+
13
+ - color_overlap(leader, card) is required (filtered out otherwise).
14
+ Every card in an OPTCG deck must share at least one color with the leader.
15
+ - family_overlap(leader, card) adds a fixed bonus (default 0.10) -
16
+ enough to outrank a marginally-more-similar but off-archetype card,
17
+ not so much that it drowns the embedding signal.
18
+ - Other Leader cards are excluded - decks have exactly one Leader.
19
+ - The leader itself is excluded.
20
+ """
21
+
22
+ from __future__ import annotations
23
+
24
+ from collections.abc import Iterable
25
+ from dataclasses import dataclass
26
+ from typing import Any
27
+
28
+ import numpy as np
29
+
30
+ FAMILY_BONUS = 0.10
31
+
32
+
33
+ @dataclass(frozen=True)
34
+ class SynergyHit:
35
+ rank: int
36
+ card_id: str
37
+ name: str
38
+ base_score: float
39
+ family_bonus: float
40
+ total_score: float
41
+ family_match: bool
42
+ card_type: str
43
+ colors: list[str]
44
+ cost: int | None
45
+ set_code: str
46
+
47
+
48
+ def color_overlap(a: Iterable[str] | None, b: Iterable[str] | None) -> bool:
49
+ if not a or not b:
50
+ return False
51
+ return bool(set(a) & set(b))
52
+
53
+
54
+ def family_overlap(a: Iterable[str] | None, b: Iterable[str] | None) -> bool:
55
+ if not a or not b:
56
+ return False
57
+ return bool(set(a) & set(b))
58
+
59
+
60
+ def recommend_synergy(
61
+ leader_idx: int,
62
+ cards: list[dict[str, Any]],
63
+ matrix: np.ndarray,
64
+ k: int = 30,
65
+ family_bonus: float = FAMILY_BONUS,
66
+ ) -> list[SynergyHit]:
67
+ leader = cards[leader_idx]
68
+ if leader.get("card_type") != "Leader":
69
+ raise ValueError(
70
+ f"Card at index {leader_idx} ({leader.get('id')!r}) is not a Leader"
71
+ )
72
+
73
+ leader_colors = leader.get("colors") or []
74
+ leader_family = leader.get("family") or []
75
+
76
+ leader_vec = matrix[leader_idx]
77
+ base_scores = matrix @ leader_vec # (N,); equals cosine similarity because rows are L2-normalized
78
+
79
+ candidates: list[tuple[int, float, float, bool]] = []
80
+ for idx, card in enumerate(cards):
81
+ if idx == leader_idx:
82
+ continue
83
+ if card.get("card_type") == "Leader":
84
+ continue
85
+ if not color_overlap(leader_colors, card.get("colors")):
86
+ continue
87
+
88
+ base = float(base_scores[idx])
89
+ f_match = family_overlap(leader_family, card.get("family"))
90
+ bonus = family_bonus if f_match else 0.0
91
+ candidates.append((idx, base, bonus, f_match))
92
+
93
+ candidates.sort(key=lambda x: -(x[1] + x[2]))
94
+
95
+ hits: list[SynergyHit] = []
96
+ for rank, (idx, base, bonus, f_match) in enumerate(candidates[:k], start=1):
97
+ card = cards[idx]
98
+ cost = card.get("cost")
99
+ if isinstance(cost, float):
100
+ cost = int(cost) if not np.isnan(cost) else None
101
+ hits.append(
102
+ SynergyHit(
103
+ rank=rank,
104
+ card_id=str(card.get("id", "")),
105
+ name=str(card.get("name", "")),
106
+ base_score=base,
107
+ family_bonus=bonus,
108
+ total_score=base + bonus,
109
+ family_match=f_match,
110
+ card_type=str(card.get("card_type", "")),
111
+ colors=list(card.get("colors") or []),
112
+ cost=cost,
113
+ set_code=str(card.get("set_code", "")),
114
+ )
115
+ )
116
+ return hits
117
+
118
+
119
+ def cost_curve(hits: list[SynergyHit], max_cost: int = 10) -> dict[int, int]:
120
+ counts: dict[int, int] = {}
121
+ for h in hits:
122
+ if h.cost is None:
123
+ continue
124
+ c = min(int(h.cost), max_cost)
125
+ counts[c] = counts.get(c, 0) + 1
126
+ return counts
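The scoring rule in the module docstring (cosine similarity plus a fixed family bonus, gated on color legality) can be reproduced on toy data. This is a sketch in pure Python, not the module's NumPy path: `normalize` and `synergy_score` are hypothetical helpers, and the 2-dim vectors stand in for the real 1024-dim embeddings.

```python
import math

FAMILY_BONUS = 0.10  # same default as the module above


def normalize(vec):
    """Scale a vector to unit length so the dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def synergy_score(leader, card, leader_vec, card_vec, family_bonus=FAMILY_BONUS):
    """Return the total score, or None when the card is not a legal candidate."""
    if card["card_type"] == "Leader":
        return None  # a deck has exactly one Leader
    if not set(leader["colors"]) & set(card["colors"]):
        return None  # no shared color with the leader
    base = sum(a * b for a, b in zip(leader_vec, card_vec))
    shares_family = bool(set(leader["family"]) & set(card["family"]))
    return base + (family_bonus if shares_family else 0.0)


# Invented toy cards for illustration only.
leader = {"card_type": "Leader", "colors": ["Red"], "family": ["Straw Hat Crew"]}
on_family = {"card_type": "Character", "colors": ["Red"], "family": ["Straw Hat Crew"]}
off_family = {"card_type": "Character", "colors": ["Red", "Green"], "family": ["Marines"]}
off_color = {"card_type": "Event", "colors": ["Blue"], "family": ["Straw Hat Crew"]}

leader_vec = normalize([1.0, 0.0])
on_vec = normalize([0.8, 0.6])      # cosine 0.80 with the leader
off_vec = normalize([0.85, 0.527])  # slightly more similar, but off-archetype

score_on = synergy_score(leader, on_family, leader_vec, on_vec)
score_off = synergy_score(leader, off_family, leader_vec, off_vec)
score_illegal = synergy_score(leader, off_color, leader_vec, on_vec)
```

Here the on-family card wins despite its lower raw similarity, which is the "small thumb on the scale" the docstring describes: the bonus reorders near ties without overriding a large similarity gap.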
tests/__init__.py ADDED
File without changes
tests/conftest.py ADDED
@@ -0,0 +1,174 @@
1
+ """Test fixtures for the deck-builder Space.
2
+
3
+ 200 synthetic cards with 1024-dim L2-normalized embeddings. The volume
4
+ matters here: a 50-card deck with up to 4 copies per card needs at
5
+ minimum 13 unique color-legal candidates to fill cleanly. 200 cards
6
+ across 4 types and 12 color groupings give enough variety for
7
+ deck-builder tests to exercise both the cost-curve targeting and the
8
+ backfill paths.
9
+ """
10
+
11
+ from __future__ import annotations
12
+
13
+ from pathlib import Path
14
+ from typing import Any
15
+
16
+ import numpy as np
17
+ import pandas as pd
18
+ import pytest
19
+ from optcg_cards.provenance import (
20
+ EmbedProvenance,
21
+ FetchProvenance,
22
+ write_provenance,
23
+ )
24
+
25
+ EMBEDDING_DIM = 1024
26
+ N_CARDS = 200
27
+
28
+ # 6 base colors + 6 adjacent bi-color combos. Bi-colors widen the
29
+ # candidate pool for any chosen leader (a Red leader can also draft
30
+ # Red/Green and Yellow/Red cards), so the deck builder has room to fill
31
+ # 50 slots at <=4 copies each.
32
+ _COLORS_POOL = [
33
+ ["Red"], ["Green"], ["Blue"], ["Purple"], ["Black"], ["Yellow"],
34
+ ["Red", "Green"], ["Green", "Blue"], ["Blue", "Purple"],
35
+ ["Purple", "Black"], ["Black", "Yellow"], ["Yellow", "Red"],
36
+ ]
37
+ _CARD_TYPES = ["Character", "Event", "Stage", "Leader"]
38
+ _RARITIES = ["C", "UC", "R", "SR", "L"]
39
+ _FAMILIES = [
40
+ ["Straw Hat Crew"],
41
+ ["Animal Kingdom Pirates"],
42
+ ["Marines"],
43
+ ["Worst Generation"],
44
+ ["Big Mom Pirates"],
45
+ ]
46
+
47
+
48
+ def _color_for(i: int) -> list[str]:
49
+ # 5 is coprime with 12, so the color index steps through all 12 groupings before repeating.
50
+ return _COLORS_POOL[(i * 5 + 1) % len(_COLORS_POOL)]
51
+
52
+
53
+ def _unit_vector(rng: np.random.Generator, dim: int) -> list[float]:
54
+ v = rng.standard_normal(dim).astype(np.float32)
55
+ v /= np.linalg.norm(v)
56
+ return v.tolist()
57
+
58
+
59
+ @pytest.fixture
60
+ def synthetic_cards() -> list[dict[str, Any]]:
61
+ rng = np.random.default_rng(seed=42)
62
+ cards: list[dict[str, Any]] = []
63
+ for i in range(N_CARDS):
64
+ ctype = _CARD_TYPES[i % len(_CARD_TYPES)]
65
+ cards.append(
66
+ {
67
+ "id": f"OP01-{i:03d}",
68
+ "code": f"OP01-{i:03d}",
69
+ "name": f"Card {i}",
70
+ "card_type": ctype,
71
+ "colors": _color_for(i),
72
+ # Spread costs 1-9, with cost None for Stages where i % 8 == 2
73
+ "cost": None if (ctype == "Stage" and i % 8 == 2) else (1 + i % 9),
74
+ "power": 1000 * (1 + i % 9),
75
+ "counter": (i % 3) * 1000 if (i % 3) else None,
76
+ "life": 5 if ctype == "Leader" else None,
77
+ "attribute": "Slash" if i % 2 else "Strike",
78
+ "family": _FAMILIES[i % len(_FAMILIES)],
79
+ "effect_text": f"Effect for card {i}.",
80
+ "trigger_text": "",
81
+ "rarity": _RARITIES[i % len(_RARITIES)],
82
+ "pack_id": "OP01",
83
+ "set_code": "OP01",
84
+ "set_name": "Romance Dawn",
85
+ "language": "en",
86
+ "umap_x": float(rng.uniform(-10, 10)),
87
+ "umap_y": float(rng.uniform(-10, 10)),
88
+ "embedding": _unit_vector(rng, EMBEDDING_DIM),
89
+ }
90
+ )
91
+ return cards
92
+
93
+
94
+ @pytest.fixture
95
+ def synthetic_embed_provenance() -> EmbedProvenance:
96
+ return EmbedProvenance(
97
+ model_id="Qwen/Qwen3-Embedding-0.6B",
98
+ embedding_dim=EMBEDDING_DIM,
99
+ matryoshka_dim=None,
100
+ task_instruction=(
101
+ "Instruct: Represent this One Piece Card Game card so that "
102
+ "mechanically similar cards are close in embedding space.\n"
103
+ "Text: {card_document}"
104
+ ),
105
+ embedded_at="2026-05-14T00:00:00+00:00",
106
+ sentence_transformers_version="5.4.1",
107
+ )
108
+
109
+
110
+ @pytest.fixture
111
+ def synthetic_fetch_provenance() -> FetchProvenance:
112
+ return FetchProvenance(
113
+ source="vegapull",
114
+ source_url="https://en.onepiece-cardgame.com/cardlist/",
115
+ source_attribution="vegapull scraping en.onepiece-cardgame.com",
116
+ source_fetched_at="2026-05-14T00:00:00+00:00",
117
+ language="en",
118
+ n_cards=N_CARDS,
119
+ pack_ids_included=["OP01"],
120
+ latest_pack_id="OP01",
121
+ vegapull_version="1.2.2",
122
+ )
123
+
124
+
125
+ @pytest.fixture
126
+ def synthetic_repo(
127
+ tmp_path: Path,
128
+ synthetic_cards: list[dict[str, Any]],
129
+ synthetic_fetch_provenance: FetchProvenance,
130
+ synthetic_embed_provenance: EmbedProvenance,
131
+ ) -> dict[str, Path]:
132
+ parquet_path = tmp_path / "cards_with_embeddings.parquet"
133
+ pd.DataFrame(synthetic_cards).to_parquet(parquet_path, index=False)
134
+ prov_path = tmp_path / "provenance.json"
135
+ write_provenance(
136
+ prov_path,
137
+ fetch=synthetic_fetch_provenance,
138
+ embed=synthetic_embed_provenance,
139
+ )
140
+ return {"parquet": parquet_path, "provenance": prov_path, "root": tmp_path}
141
+
142
+
143
+ @pytest.fixture
144
+ def patched_hf_download(
145
+ monkeypatch: pytest.MonkeyPatch,
146
+ synthetic_repo: dict[str, Path],
147
+ ):
148
+ """Patch huggingface_hub.hf_hub_download so spaceutil.data.load_corpus
149
+ pulls from the local synthetic_repo instead of the network."""
150
+
151
+ def fake_download(
152
+ repo_id: str,
153
+ filename: str,
154
+ repo_type: str | None = None,
155
+ token: str | None = None,
156
+ **kwargs: Any,
157
+ ) -> str:
158
+ if filename == "cards_with_embeddings.parquet":
159
+ return str(synthetic_repo["parquet"])
160
+ if filename == "provenance.json":
161
+ return str(synthetic_repo["provenance"])
162
+ raise FileNotFoundError(f"Unexpected filename in synthetic repo: {filename}")
163
+
164
+ import huggingface_hub
165
+
166
+ monkeypatch.setattr(huggingface_hub, "hf_hub_download", fake_download)
167
+ try:
168
+ import spaceutil.data as data_mod
169
+
170
+ monkeypatch.setattr(data_mod, "hf_hub_download", fake_download, raising=False)
171
+ except ImportError:
172
+ pass
173
+
174
+ return fake_download
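Two pieces of arithmetic behind the fixture sizing can be checked directly: how many unique cards a 50-card deck needs under a copy cap, and whether the `(i * 5 + 1) % 12` stride really visits every color grouping. The helper names below are hypothetical; only the constants come from the fixture and the deck rules stated in the tests.

```python
import math

DECK_SIZE = 50
N_COLOR_GROUPS = 12  # length of _COLORS_POOL in the fixture


def min_unique_cards(max_copies, deck_size=DECK_SIZE):
    """Smallest number of distinct cards that can fill the deck at the copy cap."""
    return math.ceil(deck_size / max_copies)


def color_index(i):
    """Same index formula as _color_for in the fixture."""
    return (i * 5 + 1) % N_COLOR_GROUPS


# One full period of the color stride; 5 and 12 share no factor,
# so every grouping should appear exactly once.
first_cycle = [color_index(i) for i in range(N_COLOR_GROUPS)]
```

With the default cap of 4 the builder needs at least 13 distinct legal cards, and with `max_copies=2` (as in `test_no_card_exceeds_max_copies_explicit`) it needs 25, which is why the fixture pool has to be well above 50 cards once the color filter is applied.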
tests/test_data.py ADDED
@@ -0,0 +1,112 @@
1
+ """TDD for spaceutil.data.load_corpus."""
2
+
3
+ from __future__ import annotations
4
+
5
+ import logging
6
+
7
+ import numpy as np
8
+ from optcg_cards.provenance import EmbedProvenance
9
+
10
+
11
+ def test_load_corpus_returns_expected_shape(patched_hf_download):
12
+ from spaceutil.data import load_corpus
13
+
14
+ cards, matrix, embed_prov, id_to_idx = load_corpus(token="fake-token")
15
+
16
+ assert isinstance(cards, list)
17
+ assert len(cards) == 200
18
+ assert isinstance(matrix, np.ndarray)
19
+ assert matrix.shape == (200, 1024)
20
+ assert matrix.dtype == np.float32
21
+ assert isinstance(embed_prov, EmbedProvenance)
22
+ assert isinstance(id_to_idx, dict)
23
+ assert len(id_to_idx) == 200
24
+
25
+
26
+ def test_embedding_key_dropped_from_cards(patched_hf_download):
27
+ from spaceutil.data import load_corpus
28
+
29
+ cards, _, _, _ = load_corpus(token="fake-token")
30
+
31
+ for card in cards:
32
+ assert "embedding" not in card, "embedding column must be stripped after stacking"
33
+
34
+
35
+ def test_list_columns_coerced_to_python_lists(patched_hf_download):
36
+ from spaceutil.data import load_corpus
37
+
38
+ cards, _, _, _ = load_corpus(token="fake-token")
39
+
40
+ for card in cards:
41
+ assert isinstance(card["colors"], list), "colors must be list, not ndarray"
42
+ assert not isinstance(card["colors"], np.ndarray)
43
+ if card["family"] is not None:
44
+ assert isinstance(card["family"], list)
45
+ assert not isinstance(card["family"], np.ndarray)
46
+
47
+
48
+ def test_id_to_idx_consistency(patched_hf_download):
49
+ from spaceutil.data import load_corpus
50
+
51
+ cards, matrix, _, id_to_idx = load_corpus(token="fake-token")
52
+
53
+ for card in cards:
54
+ idx = id_to_idx[card["id"]]
55
+ assert cards[idx]["id"] == card["id"]
56
+ assert matrix[idx].shape == (1024,)
57
+
58
+
59
+ def test_provenance_recovered(patched_hf_download):
60
+ from spaceutil.data import load_corpus
61
+
62
+ _, _, embed_prov, _ = load_corpus(token="fake-token")
63
+
64
+ assert embed_prov.model_id == "Qwen/Qwen3-Embedding-0.6B"
65
+ assert embed_prov.embedding_dim == 1024
66
+ assert "Instruct" in embed_prov.task_instruction
67
+ assert "{card_document}" in embed_prov.task_instruction
68
+
69
+
70
+ def test_no_image_url_columns_exposed(patched_hf_download):
71
+ """CLAUDE.md hard rule: no image/url/art columns."""
72
+ from spaceutil.data import load_corpus
73
+
74
+ cards, _, _, _ = load_corpus(token="fake-token")
75
+
76
+ forbidden_substrings = ("image", "art_url", "thumbnail", "img_")
77
+ for card in cards:
78
+ for key in card:
79
+ for sub in forbidden_substrings:
80
+ assert sub not in key.lower(), f"forbidden column {key!r}"
81
+
82
+
83
+ def test_token_never_logged(patched_hf_download, caplog):
84
+ """HF_TOKEN must not appear in captured logs."""
85
+ from spaceutil.data import load_corpus
86
+
87
+ secret = "hf_super_secret_token_12345"
88
+ with caplog.at_level(logging.DEBUG):
89
+ load_corpus(token=secret)
90
+
91
+ for record in caplog.records:
92
+ assert secret not in record.getMessage()
93
+ assert secret not in str(record.args or "")
94
+
95
+
96
+ def test_matrix_is_l2_normalized(patched_hf_download):
97
+ """Synthetic vectors are pre-normalized; load_corpus must preserve that."""
98
+ from spaceutil.data import load_corpus
99
+
100
+ _, matrix, _, _ = load_corpus(token="fake-token")
101
+
102
+ norms = np.linalg.norm(matrix, axis=1)
103
+ np.testing.assert_allclose(norms, 1.0, atol=1e-5)
104
+
105
+
106
+ def test_load_corpus_accepts_none_token(patched_hf_download):
107
+ """After the HF repo is flipped public, token becomes optional."""
108
+ from spaceutil.data import load_corpus
109
+
110
+ cards, matrix, embed_prov, id_to_idx = load_corpus(token=None)
111
+ assert len(cards) == 200
112
+ assert matrix.shape == (200, 1024)
tests/test_deck.py ADDED
@@ -0,0 +1,222 @@
1
+ """TDD for spaceutil.deck.build_deck.
2
+
3
+ Hard invariants the builder must always satisfy:
4
+ - Total quantity is exactly 50.
5
+ - Every card shares at least one color with the leader.
6
+ - No card has more than `max_copies` (default 4).
7
+ - The leader itself is never in the main deck.
8
+ - No other Leader cards are in the main deck.
9
+
10
+ Soft invariants (style behaviour):
11
+ - Aggro decks have lower average cost than midrange.
12
+ - Control decks have higher average cost than midrange.
13
+ """
14
+
15
+ from __future__ import annotations
16
+
17
+ import numpy as np
18
+ import pytest
19
+
20
+
21
+ def _matrix(cards):
22
+ return np.stack(
23
+ [np.asarray(c["embedding"], dtype=np.float32) for c in cards], axis=0
24
+ )
25
+
26
+
27
+ def _strip_emb(cards):
28
+ return [{k: v for k, v in c.items() if k != "embedding"} for c in cards]
29
+
30
+
31
+ def _first_leader_idx(cards):
32
+ return next(i for i, c in enumerate(cards) if c["card_type"] == "Leader")
33
+
34
+
35
+ class TestDeckSize:
36
+ def test_total_quantity_is_50(self, synthetic_cards):
37
+ from spaceutil.deck import build_deck
38
+
39
+ idx = _first_leader_idx(synthetic_cards)
40
+ matrix = _matrix(synthetic_cards)
41
+ cards = _strip_emb(synthetic_cards)
42
+ deck = build_deck(idx, cards, matrix, style="midrange")
43
+ assert deck.total_quantity == 50
44
+
45
+ def test_total_quantity_is_50_for_each_style(self, synthetic_cards):
46
+ from spaceutil.deck import build_deck
47
+
48
+ idx = _first_leader_idx(synthetic_cards)
49
+ matrix = _matrix(synthetic_cards)
50
+ cards = _strip_emb(synthetic_cards)
51
+ for style in ("aggro", "midrange", "control"):
52
+ deck = build_deck(idx, cards, matrix, style=style)
53
+ assert deck.total_quantity == 50, f"{style} produced {deck.total_quantity}"
54
+
55
+
56
+ class TestColorLegality:
57
+ def test_all_cards_share_a_color_with_leader(self, synthetic_cards):
58
+ from spaceutil.deck import build_deck
59
+
60
+ idx = _first_leader_idx(synthetic_cards)
61
+ leader_colors = set(synthetic_cards[idx]["colors"])
62
+ matrix = _matrix(synthetic_cards)
63
+ cards = _strip_emb(synthetic_cards)
64
+ deck = build_deck(idx, cards, matrix, style="midrange")
65
+
66
+ for dc in deck.cards:
67
+ assert set(dc.colors) & leader_colors, (
68
+ f"{dc.card_id} has {dc.colors}, leader has {leader_colors}"
69
+ )
70
+
71
+
72
+ class TestCopyLimit:
73
+ def test_no_card_exceeds_max_copies_default(self, synthetic_cards):
74
+ from spaceutil.deck import build_deck
75
+
76
+ idx = _first_leader_idx(synthetic_cards)
77
+ matrix = _matrix(synthetic_cards)
78
+ cards = _strip_emb(synthetic_cards)
79
+ deck = build_deck(idx, cards, matrix, style="midrange")
80
+
81
+ for dc in deck.cards:
82
+ assert 1 <= dc.quantity <= 4
83
+
84
+ def test_no_card_exceeds_max_copies_explicit(self, synthetic_cards):
85
+ from spaceutil.deck import build_deck
86
+
87
+ idx = _first_leader_idx(synthetic_cards)
88
+ matrix = _matrix(synthetic_cards)
89
+ cards = _strip_emb(synthetic_cards)
90
+ deck = build_deck(idx, cards, matrix, style="midrange", max_copies=2)
91
+
92
+ assert deck.total_quantity == 50
93
+ for dc in deck.cards:
94
+ assert 1 <= dc.quantity <= 2
95
+
96
+
97
+ class TestLeaderExclusion:
98
+ def test_other_leaders_not_in_deck(self, synthetic_cards):
99
+ from spaceutil.deck import build_deck
100
+
101
+ idx = _first_leader_idx(synthetic_cards)
102
+ matrix = _matrix(synthetic_cards)
103
+ cards = _strip_emb(synthetic_cards)
104
+ deck = build_deck(idx, cards, matrix, style="midrange")
105
+
106
+ for dc in deck.cards:
107
+ assert dc.card_type != "Leader"
108
+
109
+ def test_chosen_leader_not_in_main_deck(self, synthetic_cards):
110
+ from spaceutil.deck import build_deck
111
+
112
+ idx = _first_leader_idx(synthetic_cards)
113
+ leader_id = synthetic_cards[idx]["id"]
114
+ matrix = _matrix(synthetic_cards)
115
+ cards = _strip_emb(synthetic_cards)
116
+ deck = build_deck(idx, cards, matrix, style="midrange")
117
+
118
+ for dc in deck.cards:
119
+ assert dc.card_id != leader_id
120
+
121
+ def test_raises_when_index_is_not_leader(self, synthetic_cards):
122
+ from spaceutil.deck import build_deck
123
+
124
+ non_leader_idx = next(
125
+ i for i, c in enumerate(synthetic_cards) if c["card_type"] != "Leader"
126
+ )
127
+ matrix = _matrix(synthetic_cards)
128
+ cards = _strip_emb(synthetic_cards)
129
+ with pytest.raises(ValueError, match="not a Leader"):
130
+ build_deck(non_leader_idx, cards, matrix)
131
+
132
+
133
+ class TestDeckMetadata:
134
+ def test_deck_carries_leader_reference(self, synthetic_cards):
135
+ from spaceutil.deck import build_deck
136
+
137
+ idx = _first_leader_idx(synthetic_cards)
138
+ leader_id = synthetic_cards[idx]["id"]
139
+ matrix = _matrix(synthetic_cards)
140
+ cards = _strip_emb(synthetic_cards)
141
+ deck = build_deck(idx, cards, matrix, style="midrange")
142
+
143
+ assert deck.leader["id"] == leader_id
144
+ assert deck.style == "midrange"
145
+
146
+ def test_avg_cost_computed(self, synthetic_cards):
147
+ from spaceutil.deck import build_deck
148
+
149
+ idx = _first_leader_idx(synthetic_cards)
150
+ matrix = _matrix(synthetic_cards)
151
+ cards = _strip_emb(synthetic_cards)
152
+ deck = build_deck(idx, cards, matrix, style="midrange")
153
+
154
+ assert deck.avg_cost > 0
155
+ # 1-9 cost spread, midrange should land somewhere reasonable
156
+ assert 1.5 < deck.avg_cost < 8.0
157
+
158
+
159
+ class TestStyleSensitivity:
160
+ def test_aggro_cheaper_than_midrange(self, synthetic_cards):
161
+ from spaceutil.deck import build_deck
162
+
163
+ idx = _first_leader_idx(synthetic_cards)
164
+ matrix = _matrix(synthetic_cards)
165
+ cards = _strip_emb(synthetic_cards)
166
+ aggro = build_deck(idx, cards, matrix, style="aggro")
167
+ midrange = build_deck(idx, cards, matrix, style="midrange")
168
+
169
+ assert aggro.avg_cost < midrange.avg_cost, (
170
+ f"aggro avg {aggro.avg_cost:.2f} not < midrange {midrange.avg_cost:.2f}"
171
+ )
172
+
173
+ def test_control_pricier_than_midrange(self, synthetic_cards):
174
+ from spaceutil.deck import build_deck
175
+
176
+ idx = _first_leader_idx(synthetic_cards)
177
+ matrix = _matrix(synthetic_cards)
178
+ cards = _strip_emb(synthetic_cards)
179
+ midrange = build_deck(idx, cards, matrix, style="midrange")
180
+ control = build_deck(idx, cards, matrix, style="control")
181
+
182
+ assert control.avg_cost > midrange.avg_cost, (
183
+ f"control avg {control.avg_cost:.2f} not > midrange {midrange.avg_cost:.2f}"
184
+ )
185
+
186
+ def test_unknown_style_raises(self, synthetic_cards):
187
+ from spaceutil.deck import build_deck
188
+
189
+ idx = _first_leader_idx(synthetic_cards)
190
+ matrix = _matrix(synthetic_cards)
191
+ cards = _strip_emb(synthetic_cards)
192
+ with pytest.raises(ValueError, match="style"):
193
+ build_deck(idx, cards, matrix, style="bogus")
194
+
195
+
196
+ class TestExport:
197
+ def test_to_text_format(self, synthetic_cards):
198
+ from spaceutil.deck import build_deck, deck_to_text
199
+
200
+ idx = _first_leader_idx(synthetic_cards)
201
+ matrix = _matrix(synthetic_cards)
202
+ cards = _strip_emb(synthetic_cards)
203
+ deck = build_deck(idx, cards, matrix, style="midrange")
204
+ text = deck_to_text(deck)
205
+
206
+ assert deck.leader["id"] in text
207
+ assert deck.leader["name"] in text
208
+ for dc in deck.cards:
209
+ assert f"{dc.quantity}x {dc.card_id}" in text
210
+ # Sanity: at least one section header so it's human-readable
211
+ assert "Leader" in text and "Main deck" in text
212
+
213
+ def test_to_text_total_quantity_in_summary(self, synthetic_cards):
214
+ from spaceutil.deck import build_deck, deck_to_text
215
+
216
+ idx = _first_leader_idx(synthetic_cards)
217
+ matrix = _matrix(synthetic_cards)
218
+ cards = _strip_emb(synthetic_cards)
219
+ deck = build_deck(idx, cards, matrix, style="midrange")
220
+ text = deck_to_text(deck)
221
+
222
+ assert "50" in text
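The hard invariants listed in this test module's docstring can also be expressed as one reusable checker. This is a sketch under assumptions: `check_hard_invariants` is a hypothetical helper operating on plain dicts rather than the real `Deck` and `DeckCard` objects, and the sample decks are invented.

```python
def check_hard_invariants(leader, entries, max_copies=4, deck_size=50):
    """Return a list of violation messages; an empty list means the deck is legal."""
    problems = []
    total = sum(e["quantity"] for e in entries)
    if total != deck_size:
        problems.append(f"total is {total}, expected {deck_size}")
    leader_colors = set(leader["colors"])
    for e in entries:
        if not 1 <= e["quantity"] <= max_copies:
            problems.append(f"{e['card_id']}: quantity {e['quantity']} out of range")
        if not set(e["colors"]) & leader_colors:
            problems.append(f"{e['card_id']}: no color shared with leader")
        if e["card_type"] == "Leader":
            problems.append(f"{e['card_id']}: Leader card in main deck")
    return problems


# Invented toy deck: 12 cards at 4 copies plus 1 card at 2 copies = 50.
leader = {"id": "OP01-003", "colors": ["Black"]}
legal_deck = [
    {"card_id": f"OP01-{i:03d}", "quantity": 4, "colors": ["Purple", "Black"],
     "card_type": "Character"}
    for i in range(100, 112)
]
legal_deck.append(
    {"card_id": "OP01-112", "quantity": 2, "colors": ["Black", "Yellow"],
     "card_type": "Event"}
)

# Same deck with the last card swapped for an off-color one.
illegal_deck = legal_deck[:-1] + [
    {"card_id": "OP01-113", "quantity": 2, "colors": ["Red"], "card_type": "Event"}
]
```

Returning messages instead of raising on the first failure makes it possible to report every violation at once, which tends to be more useful when a builder change breaks several invariants together.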
tests/test_plot.py ADDED
@@ -0,0 +1,78 @@
1
+ """TDD for spaceutil.plot - the deck-builder figures."""
2
+
3
+ from __future__ import annotations
4
+
5
+ import numpy as np
6
+
7
+
8
+ def _matrix(cards):
9
+ return np.stack(
10
+ [np.asarray(c["embedding"], dtype=np.float32) for c in cards], axis=0
11
+ )
12
+
13
+
14
+ def _strip_emb(cards):
15
+ return [{k: v for k, v in c.items() if k != "embedding"} for c in cards]
16
+
17
+
18
+ def _build_a_deck(synthetic_cards):
19
+ from spaceutil.deck import build_deck
20
+
21
+ idx = next(i for i, c in enumerate(synthetic_cards) if c["card_type"] == "Leader")
22
+ return build_deck(idx, _strip_emb(synthetic_cards), _matrix(synthetic_cards))
23
+
24
+
25
+ class TestEmptyState:
26
+ def test_cost_curve_with_no_deck(self):
27
+ import plotly.graph_objects as go
28
+
29
+ from spaceutil.plot import build_cost_curve_figure
30
+
31
+ fig = build_cost_curve_figure(None)
32
+ assert isinstance(fig, go.Figure)
33
+ assert len(fig.data) == 0
34
+ assert any("build a deck" in a.text.lower() for a in fig.layout.annotations)
35
+
36
+ def test_type_breakdown_with_no_deck(self):
37
+ from spaceutil.plot import build_type_breakdown_figure
38
+
39
+ fig = build_type_breakdown_figure(None)
40
+ assert len(fig.data) == 0
41
+
42
+ def test_color_breakdown_with_no_deck(self):
43
+ from spaceutil.plot import build_color_breakdown_figure
44
+
45
+ fig = build_color_breakdown_figure(None)
46
+ assert len(fig.data) == 0
47
+
48
+
49
+ class TestBuiltDeck:
50
+ def test_cost_curve_has_two_bars(self, synthetic_cards):
51
+ from spaceutil.plot import build_cost_curve_figure
52
+
53
+ deck = _build_a_deck(synthetic_cards)
54
+ fig = build_cost_curve_figure(deck)
55
+ # actual + target traces
56
+ assert len(fig.data) == 2
57
+ names = {trace.name for trace in fig.data}
58
+ assert names == {"Actual", "Target"}
59
+
60
+ def test_type_breakdown_returns_one_trace(self, synthetic_cards):
61
+ from spaceutil.plot import build_type_breakdown_figure
62
+
63
+ deck = _build_a_deck(synthetic_cards)
64
+ fig = build_type_breakdown_figure(deck)
65
+ assert len(fig.data) == 1
66
+ # Should include at least one type
67
+ assert sum(fig.data[0].y) > 0
68
+
69
+ def test_color_breakdown_includes_leader_color(self, synthetic_cards):
70
+ from spaceutil.plot import build_color_breakdown_figure
71
+
72
+ deck = _build_a_deck(synthetic_cards)
73
+ fig = build_color_breakdown_figure(deck)
74
+ assert len(fig.data) == 1
75
+ # The chart's x-axis should list exactly the colors in the deck's color distribution
76
+ on_chart_colors = set(fig.data[0].x)
77
+ actual_colors = set(deck.color_distribution.keys())
78
+ assert on_chart_colors == actual_colors