
ClashCR Verification Plan

Goal

Validate the opponent card tracker on real MuMu/BlueStacks gameplay recordings before claiming any accuracy.

Dataset Requirements

  • At least 20 full battles.
  • Multiple arenas/maps.
  • Multiple resolutions (e.g., 1280x720, 1920x1080, emulator native).
  • Single/double/triple elixir periods.
  • Normal cards, spells, buildings, champions, evolutions, heroes, tower troops.
  • Negative recordings: lobby/menu screens, quiet battle periods with no opponent plays.

Labeling Format

CSV with columns:

  • timestamp (float, seconds)
  • frame_idx (int)
  • side ('opponent', 'own', 'unknown')
  • card_key (string, normalized card name)
  • confidence (float, 1.0 for manual labels)
  • manual_note (string, optional)
  • source ('manual')
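A label file in this format can be checked before evaluation. The following is a minimal sketch, not the project's own loader: the `load_labels` helper, the sample card keys (`hog_rider`, `fireball`), and the strict header-order check are all assumptions made for illustration.

```python
import csv
import io

# Column order as listed above.
COLUMNS = ["timestamp", "frame_idx", "side", "card_key",
           "confidence", "manual_note", "source"]
VALID_SIDES = {"opponent", "own", "unknown"}


def load_labels(fileobj):
    """Parse and validate label rows; raise ValueError on the first bad row."""
    reader = csv.DictReader(fileobj)
    if reader.fieldnames != COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    labels = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        if row["side"] not in VALID_SIDES:
            raise ValueError(f"line {line_no}: bad side {row['side']!r}")
        if not row["card_key"]:
            raise ValueError(f"line {line_no}: empty card_key")
        labels.append({
            "timestamp": float(row["timestamp"]),
            "frame_idx": int(row["frame_idx"]),
            "side": row["side"],
            "card_key": row["card_key"],
            "confidence": float(row["confidence"]),
            "manual_note": row["manual_note"],  # optional, may be empty
            "source": row["source"],
        })
    return labels


# Hypothetical sample rows (card keys are illustrative, not the real list).
SAMPLE = """timestamp,frame_idx,side,card_key,confidence,manual_note,source
12.5,100,opponent,hog_rider,1.0,,manual
15.25,122,own,fireball,1.0,hit tower,manual
"""
labels = load_labels(io.StringIO(SAMPLE))
```

Failing fast on the first malformed row keeps labeling mistakes from silently skewing the metrics.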

Metrics

  • Precision: correct_predictions / total_predictions
  • Recall: correct_predictions / total_labels
  • F1: harmonic mean of precision and recall
  • False Positives per Minute: FP / recording_duration_seconds * 60
  • Missed Events: labeled events with no matching prediction (false negatives)
  • Mean Timing Error: average |pred_timestamp - label_timestamp| over matched prediction/label pairs
  • Median Timing Error: median of the same absolute errors
  • Confusion Matrix: per-card breakdown
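As a rough illustration of how these metrics fit together, the sketch below matches each prediction to the nearest unmatched label with the same card inside a time window, then derives the counts. The greedy matching rule, the `evaluate` function name, and the 2.0-second default window are assumptions; the plan itself does not fix a matching strategy or a window size. The confusion matrix is omitted because it needs a rule for pairing events across different cards.

```python
from statistics import mean, median


def evaluate(preds, labels, duration_s, window_s=2.0):
    """preds/labels: lists of (timestamp_seconds, card_key) tuples.

    window_s is a hypothetical matching tolerance, not a project setting.
    """
    unmatched = sorted(labels)
    errors = []
    for ts, card in sorted(preds):
        # Nearest still-unmatched label with the same card inside the window.
        candidates = [lab for lab in unmatched
                      if lab[1] == card and abs(lab[0] - ts) <= window_s]
        if candidates:
            best = min(candidates, key=lambda lab: abs(lab[0] - ts))
            unmatched.remove(best)  # each label matches at most once
            errors.append(abs(best[0] - ts))
    tp = len(errors)
    fp = len(preds) - tp
    fn = len(labels) - tp
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(labels) if labels else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "fp_per_min": fp / duration_s * 60,
        "missed": fn,
        "mean_err": mean(errors) if errors else None,
        "median_err": median(errors) if errors else None,
    }


# Made-up events: two correct detections, one late "zap", one spurious "log".
labels = [(10.0, "hog_rider"), (20.0, "fireball"), (30.0, "zap")]
preds = [(10.5, "hog_rider"), (20.25, "fireball"), (45.0, "zap"), (50.0, "log")]
result = evaluate(preds, labels, duration_s=120)
```

With this made-up data, two of four predictions match, so precision is 0.5 and recall is 2/3; the late `zap` counts as both a false positive and a missed event.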

Acceptance Targets

  • 0 lobby/menu false positives.
  • 0 repeated spurious events when no card is played.
  • False positives per minute near 0 on negative/quiet recordings.
  • Every emitted event must include raw visual evidence.
  • 100% accuracy may only be claimed on held-out labeled recordings if every opponent card event is detected within the allowed time window and no false events are emitted.
  • If true 100% is not achievable from single-screen public data, state that plainly and identify exactly what additional labeled data or visual signal is required.
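The 100% condition above can be expressed as a simple gate over held-out results. This is only a sketch: the `may_claim_100_percent` name, the `fp`/`missed` keys, and the `evidence` field on emitted events are hypothetical and may not match the project's actual output.

```python
def may_claim_100_percent(result, events):
    """Return True only if the held-out run meets the 100% condition.

    result: dict with 'fp' and 'missed' counts from held-out recordings.
    events: emitted events as dicts; 'evidence' is an assumed field name
            holding a reference to the raw visual evidence.
    """
    all_have_evidence = all(bool(e.get("evidence")) for e in events)
    return result["fp"] == 0 and result["missed"] == 0 and all_have_evidence


# Hypothetical held-out outcomes.
clean = may_claim_100_percent(
    {"fp": 0, "missed": 0},
    [{"card_key": "hog_rider", "evidence": "frame_0100.png"}],
)
one_miss = may_claim_100_percent(
    {"fp": 0, "missed": 1},
    [{"card_key": "hog_rider", "evidence": "frame_0100.png"}],
)
```

Any single missed event, false event, or event lacking evidence makes the gate fail, which mirrors the all-or-nothing wording of the target.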

Commands

# Record
clashcr record-battle --config config.yaml --output data/live-recordings/session-001 --seconds 180 --fps 8

# Label manually by editing data/live-recordings/session-001/labels.csv

# Evaluate
clashcr evaluate-recording --config config.yaml --recording data/live-recordings/session-heldout --labels data/live-recordings/session-heldout/labels.csv

Known Limitations (Expected)

  • Spell detection relies on heuristic color signatures; may miss subtle spells.
  • Hero/evolution detection requires a YOLO model trained on those units.
  • The RoyaleAPI static dataset is stale; an official API token is required for the current card list.
  • Without labeled recordings, no accuracy claims can be made.