CloneCharter

CloneCharter is an encoder-decoder Transformer that takes an audio file and generates a playable Clone Hero chart in the .chart format. Given a song, it transcribes guitar, bass, or drum notes at any of the four supported difficulty levels (Easy, Medium, Hard, Expert).


Model Architecture

                        ┌───────────────────────────────────┐
  Audio (MP3/OGG/WAV)   │         ENCODER                   │
       │                │                                   │
       ▼                │  AudioCNNFrontEnd                 │
  Demucs stem           │  (Conv2D × 3, stride-based)       │
  separation            │  [B, 512 mels, T] → [B, T/16, d]  │
       │                │           +                       │
       ▼                │  ConditioningEncoder              │
  Log-mel               │  7 prefix tokens from metadata    │
  spectrogram           │  (BPM, TS, instrument, difficulty,│
  [512 mels × T]        │   resolution, offset, MERT emb)   │
       │                │           +                       │
  MERT embedding ──────►│  8-layer TransformerEncoder       │
  [768-d]               │  (pre-norm, bidirectional)        │
                        └──────────────┬────────────────────┘
                                       │ enc_out [B, 512, 768]
                        ┌──────────────▼────────────────────┐
                        │         DECODER                   │
                        │                                   │
                        │  12-layer autoregressive          │
                        │  TransformerDecoder               │
                        │  (causal self-attn + cross-attn)  │
                        │           +                       │
                        │  Output projection (weight-tied   │
                        │  with token embedding)            │
                        └──────────────┬────────────────────┘
                                       │
                                       ▼
                               Token sequence
                        (Beat / Pitch / Duration tokens)
                                       │
                                       ▼
                                notes.chart file

Key hyperparameters

| Parameter          | Value        |
|--------------------|--------------|
| d_model            | 768          |
| Encoder layers     | 8            |
| Decoder layers     | 12           |
| Attention heads    | 12           |
| FFN dim            | 3 072        |
| Vocabulary size    | 693          |
| Max encoder length | 512 tokens   |
| Max decoder length | 2 048 tokens |
| Mixed precision    | bf16         |
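
The hyperparameters above can be gathered into a single configuration object, which also makes derived quantities easy to check. The sketch below is illustrative only: `CloneCharterConfig` and its field names are assumptions made for this example and are not taken from the CloneCharter code; only the numeric values come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloneCharterConfig:
    """Hypothetical container for the hyperparameters listed in the table."""

    d_model: int = 768
    encoder_layers: int = 8
    decoder_layers: int = 12
    attention_heads: int = 12
    ffn_dim: int = 3072
    vocab_size: int = 693
    max_encoder_len: int = 512
    max_decoder_len: int = 2048
    precision: str = "bf16"

    @property
    def head_dim(self) -> int:
        # Multi-head attention splits d_model evenly across the heads.
        if self.d_model % self.attention_heads != 0:
            raise ValueError("d_model must be divisible by attention_heads")
        return self.d_model // self.attention_heads


config = CloneCharterConfig()
print(config.head_dim)                   # per-head width: 768 // 12
print(config.ffn_dim // config.d_model)  # FFN expansion ratio
```

With these values each attention head is 64-dimensional and the feed-forward layer is four times as wide as d_model, which is the conventional Transformer ratio.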

Tokenization

The tokenizer (CloneHeroTokenizer) uses a hierarchical beat-based vocabulary:

  • Special tokens: <BOS>, <EOS>, <UNK>, <PAD>
  • Instrument tokens: <Guitar>, <Bass>, <Drums>
  • Difficulty tokens: <Expert>, <Hard>, <Medium>, <Easy>
  • Temporal position: <Minute_N>, <Beat_N>, <Beatshift_N> (sub-beat 1/32 grid)
  • Pitch: <Pitch_N> (guitar/bass, 5-fret buttons 0-4) or <DrumsPitch_N>
  • Duration: <Beatshift_N>, reused here to give the sustain length in 1/32-beat units

Each note is encoded as a 6-token block:

<Beatshift> <NoteType> <Pitch> <Minute> <Beat> <DurationBeatshift>
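
As a rough illustration of the block layout above, the following sketch quantizes a note onto the 1/32-beat grid and emits the six tokens in the stated order. It is not the actual CloneHeroTokenizer: the function names, the choice to pass the note-type token in as a ready-made string (its spelling is not documented here), and the assumption that the beat position is measured from the same origin that <Beat_N> uses are all hypothetical.

```python
GRID = 32  # sub-beat resolution: 1/32 of a beat, as stated above


def quantize_beats(beats: float, grid: int = GRID) -> tuple[int, int]:
    """Snap a position in beats to the grid; return (whole beats, sub-beat steps)."""
    steps = round(beats * grid)
    return divmod(steps, grid)


def note_block(minute: int, beat_position: float, note_type: str,
               pitch: int, sustain_beats: float) -> list[str]:
    """Build one 6-token note block (hypothetical helper, guitar/bass pitches)."""
    beat, shift = quantize_beats(beat_position)
    duration = round(sustain_beats * GRID)  # sustain in 1/32-beat units
    return [
        f"<Beatshift_{shift}>",     # sub-beat offset of the note onset
        note_type,                  # caller-supplied note-type token
        f"<Pitch_{pitch}>",         # fret button 0-4
        f"<Minute_{minute}>",
        f"<Beat_{beat}>",
        f"<Beatshift_{duration}>",  # duration, same token family
    ]


block = note_block(minute=0, beat_position=2.5, note_type="<NoteType>",
                   pitch=3, sustain_beats=0.25)
print(" ".join(block))
```

Rounding before the `divmod` means a position that snaps up to the next whole beat (for example 3.99 beats) carries over into the beat index instead of producing an out-of-range sub-beat step.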

Audio Processing

  1. Stem separation: Demucs v4 isolates the guitar, bass, and drum tracks from the full mix.
  2. Log-mel spectrogram: 512 mel bands, FFT size 4096, hop length 1024 at 44 100 Hz.
    The CNN frontend reduces the number of time steps by a factor of 16.
  3. MERT embedding: a global MERT-v1-95M embedding captures harmonic and rhythmic context.
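
The spectrogram settings above determine how much audio fits into one encoder pass. The arithmetic can be sketched as follows; the helper names are made up for this example, the frame count ignores edge padding, and whether the 7 conditioning tokens count against the 512-token encoder limit is an assumption, so both cases are computed.

```python
SAMPLE_RATE = 44_100   # Hz
HOP_LENGTH = 1_024     # samples between consecutive spectrogram frames
CNN_DOWNSAMPLE = 16    # time reduction applied by the CNN frontend
MAX_ENCODER_LEN = 512  # maximum encoder positions
PREFIX_TOKENS = 7      # conditioning tokens from the metadata


def mel_frames(duration_s: float) -> int:
    """Approximate number of spectrogram frames (edge padding ignored)."""
    return int(duration_s * SAMPLE_RATE) // HOP_LENGTH


def encoder_steps(duration_s: float) -> int:
    """Approximate number of time steps left after the CNN frontend."""
    return mel_frames(duration_s) // CNN_DOWNSAMPLE


def max_audio_seconds(include_prefix: bool = True) -> float:
    """Audio duration covered by a full encoder window."""
    positions = MAX_ENCODER_LEN - (PREFIX_TOKENS if include_prefix else 0)
    return positions * CNN_DOWNSAMPLE * HOP_LENGTH / SAMPLE_RATE


print(mel_frames(60), encoder_steps(60))
print(round(max_audio_seconds(True), 1), round(max_audio_seconds(False), 1))
```

Under these assumptions the frontend emits roughly 2.7 encoder steps per second of audio, so a single 512-position window spans a little over three minutes.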

Intended Use

  • Automatic Clone Hero chart generation from any audio file.
  • Supported instruments: Lead Guitar, Rhythm Guitar, Bass Guitar, Drums.
  • Supported difficulties: Expert, Hard, Medium, Easy.

Limitations

  • Performance degrades on heavily distorted or layered mixes.
  • BPM estimation may be inaccurate for tracks with variable tempo.
  • Trained only on songs with a 4/4 time signature.

Citation

@misc{clonecharter2026,
  author = {thejorseman},
  title  = {CloneCharter: Automatic Clone Hero Chart Generation},
  year   = {2026},
  url    = {https://huggingface.co/thejorseman/CloneCharter}
}