MIMIC: Melee Imitation Model for Input Cloning

Behavior-cloned Super Smash Bros. Melee bots trained on human Slippi replays. Four character-specific models (Fox, Falco, Captain Falcon, Luigi), each a ~20M-parameter transformer that takes a 256-frame window of game state and outputs controller inputs (main stick, c-stick, shoulder, buttons) at 60 Hz.

  • Repo: https://github.com/erickfm/MIMIC
  • Base architecture: HAL's GPTv5Controller (Eric Gu, https://github.com/ericyuegu/hal): 6-layer causal transformer, 512 d_model, 8 heads, 256-frame context, relative position encoding (Shaw et al.)
  • MIMIC-specific changes: 7-class button head (distinct TRIG class for airdodge/wavedash, which HAL's 5-class head cannot represent); v2 shard alignment that fixes a subtle gamestate leak in the training targets (see research notes 2026-04-11c); fix for the digital L press bug that prevented all 7-class BC bots from wavedashing until 2026-04-13.
  • Training data: filtered from erickfm/slippi-public-dataset-v3.7 (~95K Slippi replays).

Per-character checkpoints

| Character | Games  | Val btn F1 | Val main F1 | Val loss | Step   |
|-----------|--------|------------|-------------|----------|--------|
| Fox       | 17,319 | 87.1%      | ~55%        | 0.77     | 55,692 |

Repo layout

MIMIC/
├── README.md                      # this file
├── fox/
│   ├── model.pt                   # raw PyTorch checkpoint
│   ├── config.json                # ModelConfig (copied from ckpt["config"])
│   ├── metadata.json              # provenance (step, val metrics, notes)
│   ├── mimic_norm.json            # normalization stats
│   ├── controller_combos.json     # 7-class button combo spec
│   ├── cat_maps.json
│   ├── stick_clusters.json
│   └── norm_stats.json
├── falco/      (same layout)
├── cptfalcon/  (same layout)
└── luigi/      (same layout)

Each character directory is self-contained: the JSONs are the exact metadata used during training, copied verbatim from the MIMIC data dir, so any inference script can load them without touching the MIMIC repo.
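A minimal loading sketch for one character directory. The file names follow the layout above; the `load_character` helper and the checkpoint's internal keys are illustrative, not the repo's actual API:

```python
import json
from pathlib import Path

import torch

def load_character(char_dir: str):
    """Load a character checkpoint plus its sidecar JSON metadata.

    Hypothetical helper: file names follow the repo layout above,
    but the return structure is an assumption.
    """
    d = Path(char_dir)
    ckpt = torch.load(d / "model.pt", map_location="cpu")
    config = json.loads((d / "config.json").read_text())
    norm_stats = json.loads((d / "norm_stats.json").read_text())
    combos = json.loads((d / "controller_combos.json").read_text())
    return ckpt, config, norm_stats, combos
```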

Usage

Clone the MIMIC repo and pull this model:

git clone https://github.com/erickfm/MIMIC.git
cd MIMIC
bash setup.sh  # installs Dolphin, deps, ISO

# Download all four characters
python3 -c "
from huggingface_hub import snapshot_download
snapshot_download('erickfm/MIMIC', local_dir='./hf_checkpoints')
"

Run a character against a level-9 CPU:

python3 tools/play_vs_cpu.py \
  --checkpoint hf_checkpoints/falco/model.pt \
  --dolphin-path ./emulator/squashfs-root/usr/bin/dolphin-emu \
  --iso-path ./melee.iso \
  --data-dir hf_checkpoints/falco \
  --character FALCO --cpu-character FALCO --cpu-level 9 \
  --stage FINAL_DESTINATION

Or play the bot over Slippi Online Direct Connect:

python3 tools/play_netplay.py \
  --checkpoint hf_checkpoints/falco/model.pt \
  --dolphin-path ./emulator/squashfs-root/usr/bin/dolphin-emu \
  --iso-path ./melee.iso \
  --data-dir hf_checkpoints/falco \
  --character FALCO \
  --connect-code YOUR#123

The MIMIC repo also includes a Discord bot frontend (tools/discord_bot.py) that queues direct-connect matches per user. See docs/discord-bot-setup.md.

Architecture

Slippi Frame ──► HALFlatEncoder (Linear 166→512) ──► 512-d per-frame vector
                                                          │
256-frame window ──► + Relative Position Encoding ────────┘
                         │
                    6× Pre-Norm Causal Transformer Blocks (512-d, 8 heads)
                         │
                    Autoregressive Output Heads (with detach)
                         │
              ┌──────────┼──────────┬───────────┐
           shoulder(3) c_stick(9) main_stick(37) buttons(7)
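A shape-level sketch of the forward pass under the dimensions above. This is not the repo's implementation: for brevity it uses a standard `nn.TransformerEncoder` (no relative position encoding) and four independent heads instead of the autoregressive head chain with detach:

```python
import torch
import torch.nn as nn

class MimicSketch(nn.Module):
    # Dimensions from the model card: 166-d frames, 512 d_model, 8 heads, 6 layers.
    def __init__(self, d_in=166, d_model=512, n_heads=8, n_layers=6):
        super().__init__()
        self.encoder = nn.Linear(d_in, d_model)  # per-frame projection
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model,
            batch_first=True, norm_first=True,  # pre-norm blocks
        )
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        # Output head class counts from the diagram above.
        self.heads = nn.ModuleDict({
            "shoulder": nn.Linear(d_model, 3),
            "c_stick": nn.Linear(d_model, 9),
            "main_stick": nn.Linear(d_model, 37),
            "buttons": nn.Linear(d_model, 7),
        })

    def forward(self, frames):  # frames: (batch, 256, 166)
        h = self.encoder(frames)
        # Causal mask so frame t only attends to frames <= t.
        mask = nn.Transformer.generate_square_subsequent_mask(frames.size(1))
        h = self.blocks(h, mask=mask, is_causal=True)
        return {name: head(h) for name, head in self.heads.items()}

out = MimicSketch()(torch.randn(2, 256, 166))
print(out["buttons"].shape)  # torch.Size([2, 256, 7])
```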

7-class button head

| Class | Meaning              |
|-------|----------------------|
| 0     | A                    |
| 1     | B                    |
| 2     | Z                    |
| 3     | JUMP (X or Y)        |
| 4     | TRIG (digital L or R)|
| 5     | A_TRIG (shield grab) |
| 6     | NONE                 |

HAL's original 5-class head (A, B, Jump, Z, None) has no TRIG class and structurally cannot execute an airdodge, which means HAL-lineage bots cannot wavedash. MIMIC's 7-class encoding, together with a fix for decode_and_press (which silently dropped the digital L press until 2026-04-13), is what enables the wavedashing you'll see in the replays.
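A hypothetical decoder mapping the 7 classes to button sets. The class-to-button table is from the card; the function name and button labels are illustrative, not libmelee's or the repo's actual constants:

```python
# Hypothetical mapping from the 7-class button head to physical buttons.
# "L" stands in for a digital trigger press (L or R); names are illustrative.
BUTTON_CLASSES = {
    0: {"A"},
    1: {"B"},
    2: {"Z"},
    3: {"X"},        # JUMP: X or Y; pick one deterministically
    4: {"L"},        # TRIG: digital trigger -> airdodge / wavedash
    5: {"A", "L"},   # A_TRIG: shield grab
    6: set(),        # NONE
}

def decode_buttons(cls_idx: int) -> set:
    """Return the set of buttons to press for a predicted class index."""
    return BUTTON_CLASSES[cls_idx]
```

The pre-2026-04-13 bug described above amounted to dropping "L" from classes 4 and 5 at press time, which is why earlier 7-class BC bots could never wavedash despite predicting TRIG correctly.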

Input features

9 numeric features per player (ego + opponent = 18 total): percent, stock, facing, invulnerable, jumps_left, on_ground, shield_strength, position_x, position_y

Plus categorical embeddings: stage(4d), 2× character(12d), 2× action(32d). Plus controller state from the previous frame as a 56-dim one-hot (37 stick + 9 c-stick + 7 button + 3 shoulder).

Total input per frame: 166 dimensions β†’ projected to 512.
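The 166-dim total checks out against the feature list above:

```python
# Per-frame feature dimensions, as listed above.
numeric   = 9 * 2           # 9 numeric features for ego + opponent = 18
stage     = 4               # stage embedding
character = 12 * 2          # one 12-d character embedding per player
action    = 32 * 2          # one 32-d action-state embedding per player
prev_ctrl = 37 + 9 + 7 + 3  # previous-frame controller one-hots = 56

total = numeric + stage + character + action + prev_ctrl
print(total)  # 166
```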

Training

  • Optimizer: AdamW, LR 3e-4, weight decay 0.01, no warmup
  • LR schedule: CosineAnnealingLR, eta_min 1e-6
  • Gradient clip: 1.0
  • Dropout: 0.2
  • Sequence length: 256 frames (~4.3 seconds)
  • Mixed precision: BF16 AMP with FP32 upcast for relpos attention (prevents BF16 overflow in the manual Q@K^T + Srel computation)
  • Batch size: 512 (typically single-GPU on an RTX 5090)
  • Steps: ~32K for well-represented characters, early-stopped for Luigi
  • Reaction delay: 0 (v2 shards have target[i] = buttons[i+1], so the default rd=0 matches inference; do NOT use --reaction-delay 1 or --controller-offset with v2 shards)
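The hyperparameters above translate to roughly this training-step scaffolding. This is a sketch only: the stand-in model is a single Linear layer, the loss is a placeholder, and the FP32 upcast inside the relpos attention is not shown:

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR

model = torch.nn.Linear(166, 512)  # stand-in for the real model
opt = AdamW(model.parameters(), lr=3e-4, weight_decay=0.01)  # no warmup
sched = CosineAnnealingLR(opt, T_max=32_000, eta_min=1e-6)

def train_step(batch, targets):
    opt.zero_grad(set_to_none=True)
    # BF16 autocast; the repo upcasts the relpos attention math to FP32
    # inside this region to avoid BF16 overflow in Q@K^T + Srel.
    with torch.autocast("cpu", dtype=torch.bfloat16):
        loss = torch.nn.functional.mse_loss(model(batch), targets)
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)  # gradient clip 1.0
    opt.step()
    sched.step()
    return loss.item()

loss = train_step(torch.randn(8, 166), torch.randn(8, 512))
```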

Known limitations

  1. Character-locked: each model only plays the character it was trained on. No matchup generalization. Training a multi-character model with a character embedding is a natural next step but not done yet.
  2. Fox model is legacy: the Fox checkpoint is from an earlier run that predates the --self-inputs fix. Its val metrics are much lower than the others and it plays slightly worse.
  3. Small-dataset overfitting: Luigi only has 1951 training games after filtering. The _best.pt checkpoint is early-stopped at step 5242 to avoid the val-loss climb. Plays surprisingly well for the data volume.
  4. Edge guarding and recovery weaknesses: the bot doesn't consistently go for off-stage edge guards or execute high-skill recovery mixups.
  5. No matchmaking / Ranked: the Discord bot only joins explicit Direct Connect lobbies. Do NOT adapt it for Slippi Online Unranked or Ranked: the libmelee README explicitly forbids bots on those ladders, and Slippi has not yet opened a "bot account" opt-in system.

Acknowledgments

License

MIT; see the MIMIC repo's LICENSE file.
