ClashCR Verification Plan

Goal

Validate the opponent card tracker on real MuMu/BlueStacks gameplay recordings before making any accuracy claims.

Dataset Requirements

At least 20 full battles.
Multiple arenas/maps.
Multiple resolutions (e.g., 1280x720, 1920x1080, emulator native).
Single/double/triple elixir periods.
Normal cards, spells, buildings, champions, evolutions, heroes, tower troops.
Negative recordings: lobby/menu screens, quiet battle periods with no opponent plays.

Labeling Format

CSV with columns:

timestamp (float, seconds)
frame_idx (int)
side ('opponent', 'own', 'unknown')
card_key (string, normalized card name)
confidence (float, 1.0 for manual labels)
manual_note (string, optional)
source ('manual')
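
A minimal sketch of how a labels file in this format could be loaded and validated before evaluation. It assumes the CSV has a header row naming the seven columns above and that side values are written without quotes; the Label class and load_labels function are hypothetical names for illustration, not part of clashcr.

```python
import csv
import io
from dataclasses import dataclass

# Column names taken from the labeling format; the header row itself is an assumption.
REQUIRED_COLUMNS = [
    "timestamp", "frame_idx", "side", "card_key",
    "confidence", "manual_note", "source",
]
VALID_SIDES = {"opponent", "own", "unknown"}


@dataclass
class Label:  # hypothetical container, not a clashcr type
    timestamp: float
    frame_idx: int
    side: str
    card_key: str
    confidence: float
    manual_note: str
    source: str


def load_labels(fileobj):
    """Parse label rows from an open text file, failing loudly on bad input."""
    reader = csv.DictReader(fileobj)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    labels = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        if row["side"] not in VALID_SIDES:
            raise ValueError(f"line {line_no}: invalid side {row['side']!r}")
        labels.append(Label(
            timestamp=float(row["timestamp"]),
            frame_idx=int(row["frame_idx"]),
            side=row["side"],
            card_key=row["card_key"].strip(),
            confidence=float(row["confidence"]),
            manual_note=row["manual_note"] or "",  # optional column
            source=row["source"],
        ))
    return labels


if __name__ == "__main__":
    sample = (
        "timestamp,frame_idx,side,card_key,confidence,manual_note,source\n"
        "12.5,100,opponent,hog-rider,1.0,,manual\n"
    )
    print(load_labels(io.StringIO(sample)))
```

Rejecting unknown side values and missing columns up front keeps labeling mistakes from silently lowering recall later.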

Metrics

Precision: correct_predictions / total_predictions
Recall: correct_predictions / total_labels
F1: harmonic mean of precision and recall
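False Positives per Minute: FP / recording_duration_seconds * 60
Missed Events: count of false negatives (labeled events with no matched prediction)
Mean Timing Error: average |pred_timestamp - label_timestamp| over matched pairs
Median Timing Error: median of the same per-pair timing errors
Confusion Matrix: per-card breakdown of predicted vs. labeled card_key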
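
The metrics above could be computed along the following lines. This is a sketch under stated assumptions, not the clashcr evaluator: events are (timestamp, card_key) tuples, a prediction counts as correct only if it pairs one-to-one with a label of the same card_key within window_s seconds, and pairing is greedy by smallest timing error. The plan does not fix the window size, so window_s is left as a required parameter; match_events and score are hypothetical names. The confusion matrix is omitted.

```python
from statistics import mean, median


def match_events(preds, labels, window_s):
    """Greedy one-to-one matching; returns (pred_idx, label_idx, abs_error) triples."""
    candidates = sorted(
        (abs(p[0] - l[0]), pi, li)
        for pi, p in enumerate(preds)
        for li, l in enumerate(labels)
        if p[1] == l[1] and abs(p[0] - l[0]) <= window_s
    )
    used_preds, used_labels, pairs = set(), set(), []
    for err, pi, li in candidates:  # closest candidate pairs are taken first
        if pi in used_preds or li in used_labels:
            continue
        used_preds.add(pi)
        used_labels.add(li)
        pairs.append((pi, li, err))
    return pairs


def score(preds, labels, duration_s, window_s):
    """Compute the plan's metrics; undefined ratios are reported as None."""
    pairs = match_events(preds, labels, window_s)
    tp = len(pairs)
    fp = len(preds) - tp   # predictions with no matching label
    fn = len(labels) - tp  # labels with no matching prediction
    precision = tp / len(preds) if preds else None
    recall = tp / len(labels) if labels else None
    if precision is None or recall is None:
        f1 = None
    elif precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    errors = [err for _, _, err in pairs]
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "fp_per_minute": fp / duration_s * 60 if duration_s > 0 else None,
        "missed_events": fn,
        "mean_timing_error": mean(errors) if errors else None,
        "median_timing_error": median(errors) if errors else None,
    }


if __name__ == "__main__":
    # Illustrative values only, not real results.
    preds = [(10.2, "hog-rider"), (20.0, "fireball"), (40.0, "zap")]
    labels = [(10.0, "hog-rider"), (20.5, "fireball"), (30.0, "giant")]
    print(score(preds, labels, duration_s=120.0, window_s=1.0))
```

On a negative recording with no labels and no predictions, precision, recall, and F1 come back as None rather than 0 or 1, so the false-positives-per-minute figure is the one to read there.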

Acceptance Targets

0 lobby/menu false positives.
0 spurious repeated events when no card is played.
False positives per minute near 0 on negative/quiet recordings.
Every emitted event must include raw visual evidence.
100% accuracy may be claimed only for held-out labeled recordings, and only if every opponent card event is detected within the allowed time window and no false events are emitted.
If true 100% is not achievable from single-screen public data, state that plainly and identify exactly what additional labeled data or visual signal is required.
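
Two of these targets (no events on negative recordings, and evidence attached to every event) lend themselves to an automated check. The sketch below assumes emitted events are available as dictionaries and that raw visual evidence lives under a key named "evidence"; both the event shape and the key name are hypothetical, as is the check_acceptance function.

```python
def check_acceptance(events, is_negative_recording, evidence_field="evidence"):
    """Return a list of human-readable failures; an empty list means the checks pass."""
    failures = []
    # Lobby/menu and quiet recordings must produce no events at all.
    if is_negative_recording and events:
        failures.append(f"{len(events)} event(s) emitted on a negative recording")
    # Every emitted event must carry non-empty raw visual evidence.
    for i, event in enumerate(events):
        if not event.get(evidence_field):
            failures.append(f"event {i} has no raw visual evidence")
    return failures


if __name__ == "__main__":
    # Illustrative event only; the evidence path is made up.
    events = [{"card_key": "zap", "evidence": "frames/000123.png"}]
    print(check_acceptance(events, is_negative_recording=False))
```

Returning the full list of failures, rather than stopping at the first, makes it easier to see every violation in a single run.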

Commands

# Record
clashcr record-battle --config config.yaml --output data/live-recordings/session-001 --seconds 180 --fps 8

# Label manually by editing data/live-recordings/session-001/labels.csv

# Evaluate
clashcr evaluate-recording --config config.yaml --recording data/live-recordings/session-heldout --labels data/live-recordings/session-heldout/labels.csv

Known Limitations (Expected)

Spell detection relies on heuristic color signatures; may miss subtle spells.
Hero/evolution detection requires a YOLO model trained on those units.
The RoyaleAPI static dataset is stale; an official API token is required for the current card list.
Without labeled recordings, no accuracy claims can be made.