ClashCR Verification Plan
Goal
Validate the opponent card tracker on real MuMu/BlueStacks gameplay recordings before claiming any accuracy.
Dataset Requirements
- At least 20 full battles.
- Multiple arenas/maps.
- Multiple resolutions (e.g., 1280x720, 1920x1080, emulator native).
- Single/double/triple elixir periods.
- Normal cards, spells, buildings, champions, evolutions, heroes, tower troops.
- Negative recordings: lobby/menu screens, quiet battle periods with no opponent plays.
Labeling Format
CSV with columns:
- timestamp (float, seconds)
- frame_idx (int)
- side ('opponent', 'own', 'unknown')
- card_key (string, normalized card name)
- confidence (float, 1.0 for manual labels)
- manual_note (string, optional)
- source ('manual')
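The column list above can be checked mechanically before evaluation. The following is a minimal sketch, not the project's actual loader: it assumes labels.csv starts with a header row using exactly the column names listed, and the function name `load_labels`, the sample rows, and the card keys (`hog_rider`, `fireball`) are hypothetical illustrations.

```python
import csv
import io

# Column names taken from the labeling format; the header row itself is an assumption.
REQUIRED_COLUMNS = [
    "timestamp", "frame_idx", "side", "card_key",
    "confidence", "manual_note", "source",
]
VALID_SIDES = {"opponent", "own", "unknown"}


def load_labels(fileobj):
    """Parse and validate label rows from an open CSV file object (hypothetical helper)."""
    reader = csv.DictReader(fileobj)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    rows = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, raw in enumerate(reader, start=2):
        # Short rows yield None for absent fields; normalize to empty strings.
        cell = {c: (raw.get(c) or "").strip() for c in REQUIRED_COLUMNS}
        if cell["side"] not in VALID_SIDES:
            raise ValueError(f"line {line_no}: invalid side {cell['side']!r}")
        if not cell["card_key"]:
            raise ValueError(f"line {line_no}: empty card_key")
        rows.append({
            "timestamp": float(cell["timestamp"]),    # seconds
            "frame_idx": int(cell["frame_idx"]),
            "side": cell["side"],
            "card_key": cell["card_key"],
            "confidence": float(cell["confidence"]),  # 1.0 for manual labels
            "manual_note": cell["manual_note"],       # optional, may be empty
            "source": cell["source"],
        })
    return rows


# Illustrative sample only; the card keys and notes are made up.
SAMPLE_CSV = """timestamp,frame_idx,side,card_key,confidence,manual_note,source
12.5,100,opponent,hog_rider,1.0,played at bridge,manual
31.25,250,own,fireball,1.0,,manual
"""

labels = load_labels(io.StringIO(SAMPLE_CSV))
```

Raising on the first malformed row is a deliberate choice for this sketch: a silently skipped label would inflate precision and recall later.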
Metrics
- Precision: correct_predictions / total_predictions
- Recall: correct_predictions / total_labels
- F1: harmonic mean of precision and recall
- False Positives per Minute: FP / recording_duration_seconds * 60
- Missed Events: count of false negatives (labeled events with no matching prediction)
- Mean Timing Error: average |pred_timestamp - label_timestamp| for matched pairs
- Median Timing Error: median of above
- Confusion Matrix: per-card breakdown
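The metrics above can be sketched as follows. This is an illustration under stated assumptions, not the behavior of `clashcr evaluate-recording`: predictions and labels are reduced to (timestamp, card_key) tuples, a prediction counts as correct when it pairs with an unmatched label of the same card_key within `window_s` seconds, and pairing is greedy in time order. The function names, the 1.0 second window, and the sample events are all hypothetical. The per-card confusion matrix is left out to keep the sketch short.

```python
from statistics import mean, median


def match_events(preds, labels, window_s):
    """Greedily pair predictions with labels, one-to-one.

    preds and labels are lists of (timestamp_seconds, card_key) tuples.
    Returns a list of (pred_index, label_index, abs_timing_error).
    """
    pairs = []
    used_labels = set()
    # Visit predictions in time order so earlier events claim labels first.
    for p_idx in sorted(range(len(preds)), key=lambda i: preds[i][0]):
        p_time, p_card = preds[p_idx]
        best_idx, best_err = None, None
        for l_idx, (l_time, l_card) in enumerate(labels):
            if l_idx in used_labels or l_card != p_card:
                continue
            err = abs(p_time - l_time)
            if err <= window_s and (best_err is None or err < best_err):
                best_idx, best_err = l_idx, err
        if best_idx is not None:
            used_labels.add(best_idx)
            pairs.append((p_idx, best_idx, best_err))
    return pairs


def compute_metrics(preds, labels, duration_s, window_s):
    """Compute the plan's metrics; ratios are None when their denominator is 0."""
    pairs = match_events(preds, labels, window_s)
    tp = len(pairs)
    fp = len(preds) - tp    # predictions with no matching label
    fn = len(labels) - tp   # labels with no matching prediction
    precision = tp / len(preds) if preds else None
    recall = tp / len(labels) if labels else None
    if precision is None or recall is None:
        f1 = None
    elif precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    errors = [err for _, _, err in pairs]
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "fp_per_minute": fp / duration_s * 60,  # duration_s must be > 0
        "missed_events": fn,
        "mean_timing_error": mean(errors) if errors else None,
        "median_timing_error": median(errors) if errors else None,
    }


# Made-up events for illustration: two matches, one false positive, one miss.
preds = [(10.2, "hog_rider"), (30.0, "fireball"), (50.0, "zap")]
labels = [(10.0, "hog_rider"), (30.5, "fireball"), (70.0, "log")]
metrics = compute_metrics(preds, labels, duration_s=120.0, window_s=1.0)
```

Greedy matching is simple but not optimal when several plays of the same card fall inside one window; an assignment-based matcher would be a reasonable alternative if that case matters.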
Acceptance Targets
- 0 lobby/menu false positives.
- 0 repeated spurious events when no card is played.
- False positives per minute near 0 on negative/quiet recordings.
- Every emitted event must include raw visual evidence.
- Claim 100% accuracy only on held-out labeled recordings, and only if every opponent card event is detected within the allowed time window and no false events are emitted.
- If true 100% is not achievable from single-screen public data, state that plainly and identify exactly what additional labeled data or visual signal is required.
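A few of these targets lend themselves to an automated gate. The sketch below is hypothetical: it assumes each emitted event is a dict whose `evidence` key holds a reference to the raw visual evidence (the key name and the sample path are made up), and that false-positive and missed-event counts for a held-out recording have already been computed elsewhere.

```python
def acceptance_failures(events, false_positives, missed_events):
    """Return a list of human-readable failures; an empty list means the gate passes.

    events: list of dicts for emitted events; "evidence" is an assumed key name.
    false_positives, missed_events: counts measured on a held-out labeled recording.
    """
    failures = []
    for i, event in enumerate(events):
        # Every emitted event must include raw visual evidence.
        if not event.get("evidence"):
            failures.append(f"event {i} has no raw visual evidence")
    if false_positives > 0:
        failures.append(f"{false_positives} false event(s) emitted")
    if missed_events > 0:
        failures.append(f"{missed_events} opponent card event(s) missed")
    return failures


# Illustrative events only; the second one lacks evidence and fails the gate.
events = [
    {"card_key": "hog_rider", "evidence": "frames/000100.png"},
    {"card_key": "zap", "evidence": ""},
]
failures = acceptance_failures(events, false_positives=0, missed_events=0)
may_claim_full_accuracy = not failures
```

Returning the full list of failures rather than a single boolean keeps the report specific about what blocks the 100% claim.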
Commands
# Record
clashcr record-battle --config config.yaml --output data/live-recordings/session-001 --seconds 180 --fps 8
# Label manually by editing data/live-recordings/session-001/labels.csv
# Evaluate
clashcr evaluate-recording --config config.yaml --recording data/live-recordings/session-heldout --labels data/live-recordings/session-heldout/labels.csv
Known Limitations (Expected)
- Spell detection relies on heuristic color signatures and may miss subtle spells.
- Hero/evolution detection requires a YOLO model trained on those units.
- The RoyaleAPI static dataset is stale; an official API token is required for the current card list.
- Without labeled recordings, no accuracy claims can be made.