# ClashCR Verification Plan

## Goal

Validate the opponent card tracker on real MuMu/BlueStacks gameplay recordings before claiming any accuracy.

## Dataset Requirements

- At least 20 full battles.
- Multiple arenas/maps.
- Multiple resolutions (e.g., 1280x720, 1920x1080, emulator native).
- Single/double/triple elixir periods.
- Normal cards, spells, buildings, champions, evolutions, heroes, tower troops.
- Negative recordings: lobby/menu screens, quiet battle periods with no opponent plays.

## Labeling Format

CSV with columns:

- timestamp (float, seconds)
- frame_idx (int)
- side ('opponent', 'own', 'unknown')
- card_key (string, normalized card name)
- confidence (float, 1.0 for manual labels)
- manual_note (string, optional)
- source ('manual')

## Metrics

- **Precision**: correct_predictions / total_predictions
- **Recall**: correct_predictions / total_labels
- **F1**: harmonic mean of precision and recall
- **False Positives per Minute**: FP / recording_duration * 60
- **Missed Events**: false negatives
- **Mean Timing Error**: average |pred_timestamp - label_timestamp| for matched pairs
- **Median Timing Error**: median of above
- **Confusion Matrix**: per-card breakdown

## Acceptance Targets

- 0 lobby/menu false positives.
- 0 random repeated spam when no card is played.
- False positives per minute near 0 on negative/quiet recordings.
- Every emitted event must include raw visual evidence.
- 100% accuracy may only be claimed on held-out labeled recordings if every opponent card event is detected within the allowed time window and no false events are emitted.
- If true 100% is not achievable from single-screen public data, state that plainly and identify exactly what additional labeled data or visual signal is required.
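
The metrics and the "allowed time window" rule above can be sketched as a small scoring routine. This is a minimal illustration, not the tracker's actual evaluator: the `Event` type, the `match_events` and `score` helpers, the greedy nearest-label matching, and the 1.0 second default window are all assumptions made for the example, since this plan does not fix a window size or a matching strategy. It also assumes both lists have already been filtered to `side == 'opponent'` and that `recording_duration` is in seconds.

```python
from dataclasses import dataclass
from statistics import mean, median


@dataclass(frozen=True)
class Event:
    """Hypothetical event record: one card play at a point in time."""
    timestamp: float  # seconds from recording start
    card_key: str     # normalized card name


def match_events(preds, labels, window_s=1.0):
    """Greedily pair each prediction with the nearest unused label of the
    same card that lies within window_s seconds (an assumed tolerance)."""
    unused = list(labels)
    pairs = []
    for pred in sorted(preds, key=lambda e: e.timestamp):
        candidates = [
            lab for lab in unused
            if lab.card_key == pred.card_key
            and abs(lab.timestamp - pred.timestamp) <= window_s
        ]
        if candidates:
            best = min(candidates, key=lambda lab: abs(lab.timestamp - pred.timestamp))
            unused.remove(best)  # each label can be matched at most once
            pairs.append((pred, best))
    return pairs


def _ratio(num, den):
    """Return num / den, or None when the ratio is undefined."""
    return num / den if den else None


def score(preds, labels, duration_s, window_s=1.0):
    """Compute the plan's headline metrics for one recording."""
    pairs = match_events(preds, labels, window_s)
    tp = len(pairs)
    fp = len(preds) - tp   # predictions with no matching label
    fn = len(labels) - tp  # missed events
    errors = [abs(p.timestamp - lab.timestamp) for p, lab in pairs]
    return {
        "precision": _ratio(tp, len(preds)),
        "recall": _ratio(tp, len(labels)),
        # 2TP / (2TP + FP + FN) equals the harmonic mean of precision and recall.
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),
        "fp_per_minute": fp / duration_s * 60,
        "missed_events": fn,
        "mean_timing_error": mean(errors) if errors else None,
        "median_timing_error": median(errors) if errors else None,
    }
```

On a negative recording (no labels), `recall` and the timing errors come back as `None` rather than a misleading number, and `fp_per_minute` is the figure to compare against the "near 0" target. A per-card confusion matrix would need a matcher that also pairs events with mismatched card keys, which this sketch deliberately leaves out.
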
## Commands

```bash
# Record
clashcr record-battle --config config.yaml --output data/live-recordings/session-001 --seconds 180 --fps 8

# Label manually by editing data/live-recordings/session-001/labels.csv

# Evaluate
clashcr evaluate-recording --config config.yaml --recording data/live-recordings/session-heldout --labels data/live-recordings/session-heldout/labels.csv
```

## Known Limitations (Expected)

- Spell detection relies on heuristic color signatures; may miss subtle spells.
- Hero/evolution detection requires YOLO model trained on those units.
- RoyaleAPI static dataset is stale; official API token required for current card list.
- Without labeled recordings, no accuracy claims can be made.
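
Because every accuracy claim rests on a hand-edited `labels.csv`, it may be worth checking the file against the Labeling Format columns before running an evaluation. The following is a minimal sketch under stated assumptions: `validate_labels` is a hypothetical helper, not part of the `clashcr` CLI, and the rule that manual rows must carry a confidence of exactly 1.0 is read from the column description above.

```python
import csv
import io

REQUIRED_COLUMNS = [
    "timestamp", "frame_idx", "side", "card_key",
    "confidence", "manual_note", "source",
]
VALID_SIDES = {"opponent", "own", "unknown"}


def validate_labels(csv_text):
    """Return a list of human-readable problems; an empty list means the
    label file passed every check in this sketch."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]

    problems = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        try:
            float(row["timestamp"])
            int(row["frame_idx"])
            confidence = float(row["confidence"])
        except (TypeError, ValueError):
            problems.append(
                f"line {line_no}: timestamp, frame_idx, or confidence is not numeric"
            )
            continue
        if row["side"] not in VALID_SIDES:
            problems.append(f"line {line_no}: unknown side {row['side']!r}")
        if not (row["card_key"] or "").strip():
            problems.append(f"line {line_no}: empty card_key")
        if row["source"] == "manual" and confidence != 1.0:
            problems.append(f"line {line_no}: manual label must have confidence 1.0")
    return problems
```

A typical call would read the file as text, for example `validate_labels(open("data/live-recordings/session-001/labels.csv").read())`, and refuse to evaluate if the returned list is non-empty.
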