TheArtist Music Transformer — F2 (Pop 5K Mix)

A jazz-adapted chord model trained with a 5,000-sequence pop rehearsal buffer. It is a calibration point that the paper finds is dominated by F3 on every axis.

One of six checkpoints released alongside the paper "Empirical Study of Pop and Jazz Mix Ratios for Genre-Adaptive Chord Generation" (Lee, 2026). See the collection overview at PearlLeeStudio/TheArtist-MusicTransformer-pop-baseline.

Model summary

Field                            Value
Architecture                     Music Transformer with relative positional attention
Parameters                       25,661,440
Vocabulary size                  351 tokens
Max sequence length              256
d_model / heads / FFN / layers   512 / 8 / 2048 / 8
Fine-tune resumed from           Phase 0 pop baseline
Best epoch                       4
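
As a sanity check on the table above, the hyperparameters can be turned into a rough weight count. This is a minimal sketch that assumes a standard Transformer layer layout (four d_model × d_model attention projections and a two-matrix feed-forward block per layer) plus one token embedding table; the card does not spell out the exact layout. The estimate covers about 98.8% of the reported total. The remainder is presumably biases, layer norms, relative-position parameters, and any output head, though that breakdown is an assumption.

```python
# Back-of-envelope weight count from the hyperparameters in the model summary.
# Assumes a standard Transformer layer layout; the real model may differ.
d_model, n_heads, d_ff, n_layers, vocab_size = 512, 8, 2048, 8, 351

head_dim = d_model // n_heads          # dimensions per attention head
attn_weights = 4 * d_model * d_model   # Q, K, V and output projection matrices
ffn_weights = 2 * d_model * d_ff       # the two feed-forward matrices
per_layer = attn_weights + ffn_weights
embedding = vocab_size * d_model       # token embedding table

estimate = n_layers * per_layer + embedding
reported = 25_661_440                  # "Parameters" row in the table

print(f"head_dim = {head_dim}")
print(f"estimate = {estimate:,}")
print(f"reported = {reported:,}")
print(f"coverage = {estimate / reported:.1%}")
```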

Training data

All 1,513 jazz training sequences plus 5,000 pop rehearsal sequences (seed 42). Pop:jazz ≈ 3.3:1.
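
The mix arithmetic above can be reproduced directly, and a seeded rehearsal buffer can be sketched in a few lines. The sequence counts and the seed come from this card; `build_mix`, its arguments, and the use of Python's `random` module are hypothetical, since the card does not describe how the 5,000 pop sequences were actually drawn.

```python
import random

N_JAZZ = 1_513        # jazz training sequences (from this card)
N_POP_BUFFER = 5_000  # pop rehearsal sequences (from this card)
SEED = 42             # seed quoted in this card

ratio = N_POP_BUFFER / N_JAZZ
pop_share = N_POP_BUFFER / (N_POP_BUFFER + N_JAZZ)
print(f"pop:jazz = {ratio:.1f}:1, pop share = {pop_share:.1%}")

def build_mix(jazz_seqs, pop_pool, buffer_size=N_POP_BUFFER, seed=SEED):
    """Hypothetical helper: all jazz sequences plus a seeded random pop subset."""
    rng = random.Random(seed)
    rehearsal = rng.sample(pop_pool, buffer_size)  # sampling without replacement
    mixed = list(jazz_seqs) + rehearsal
    rng.shuffle(mixed)                             # interleave the two genres
    return mixed

# Toy stand-ins for the real datasets, only to show the shapes involved.
jazz = [f"jazz_{i}" for i in range(N_JAZZ)]
pop_pool = [f"pop_{i}" for i in range(20_000)]
mixed = build_mix(jazz, pop_pool)
print(len(mixed))
```

Fixing the seed makes the selected pop subset reproducible across runs, provided the pop pool is presented in the same order each time.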

Evaluation (held-out per-genre test sets)

Metric                    Pop test   Jazz test
Top-1 accuracy            84.07%     79.90%
Top-5 accuracy            97.04%     92.14%
Perplexity                1.75       2.33
Δ vs. Phase 0 baseline    −0.17      +7.04
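
For readers unfamiliar with the metrics in the table, the sketch below shows the usual definitions of top-k accuracy and perplexity for next-token prediction. It is written in plain Python over lists of per-position scores. The evaluation details used in the paper (padding masks, per-token versus per-sequence averaging) are not stated in this card, so this illustrates the standard formulas and is not the author's evaluation script. Under this definition, a perplexity of 1.75 corresponds to a mean negative log-likelihood of about 0.56 nats per token.

```python
import math

def top_k_accuracy(logits, targets, k):
    """Fraction of positions whose target is among the k highest-scoring tokens."""
    hits = 0
    for scores, target in zip(logits, targets):
        top_k = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
        hits += target in top_k
    return hits / len(targets)

def perplexity(logits, targets):
    """exp of the mean negative log-likelihood of targets under softmax(logits)."""
    total_nll = 0.0
    for scores, target in zip(logits, targets):
        m = max(scores)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total_nll += log_z - scores[target]
    return math.exp(total_nll / len(targets))

# Toy example: three positions over a three-token vocabulary.
toy_logits = [[2.0, 1.0, 0.0], [0.0, 3.0, 1.0], [1.0, 0.5, 2.0]]
toy_targets = [0, 1, 0]
print(top_k_accuracy(toy_logits, toy_targets, k=1))
print(top_k_accuracy(toy_logits, toy_targets, k=2))
print(perplexity(toy_logits, toy_targets))
```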

F2 is dominated by F3 on every axis. It is released so that the saturation curve described in the paper (§6.1, §7.3) can be reproduced, but it is not the recommended choice for any operating point. Prefer F3 for the balanced setting, F1 for pop-leaning, or F4 for jazz-leaning.

Intended use

Reference checkpoint for replication and saturation-curve analysis. Not recommended as a default for chord-composition workflows.

Usage

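import torch
from huggingface_hub import hf_hub_download

# `model` and `tokenizer` appear to be the project's own modules rather than
# installable packages, so they must be importable from the working directory.
from model import MusicTransformer
from tokenizer import ChordTokenizer

# Download the checkpoint from the Hugging Face Hub (cached after the first call).
ckpt_path = hf_hub_download(
    repo_id="PearlLeeStudio/TheArtist-MusicTransformer-ft-pop67",
    filename="best.pt",
)
tokenizer = ChordTokenizer()

# weights_only=False lets torch.load unpickle arbitrary Python objects,
# so only load checkpoints from a source you trust.
ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)

# Hyperparameters match the model summary; dropout is disabled for inference.
model = MusicTransformer(
    vocab_size=tokenizer.vocab_size,
    d_model=512, n_heads=8, d_ff=2048, n_layers=8,
    max_seq_len=256, dropout=0.0, pad_id=tokenizer.pad_id,
)
model.load_state_dict(ckpt["model_state_dict"])
model.eval()  # inference mode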
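
Once the model is loaded, generating a chord sequence amounts to repeatedly scoring the next token and appending a sample. The card does not show the model's forward signature or the tokenizer's encode/decode methods, so the sketch below is deliberately model-agnostic: `next_token_scores` is a hypothetical callable that takes the token ids so far and returns one score per vocabulary entry, and would have to be written around the real model. `sample_top_k` and `generate` are likewise illustrative names. Only the 256-token limit comes from the model summary.

```python
import math
import random

MAX_SEQ_LEN = 256  # maximum sequence length from the model summary

def sample_top_k(scores, k, rng):
    """Sample a token id from the k highest-scoring entries of `scores`."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    m = max(scores[i] for i in top)
    weights = [math.exp(scores[i] - m) for i in top]  # softmax over the top k
    return rng.choices(top, weights=weights, k=1)[0]

def generate(next_token_scores, prompt_ids, n_new, k=5, seed=0):
    """Extend `prompt_ids` one token at a time, never exceeding MAX_SEQ_LEN."""
    rng = random.Random(seed)
    ids = list(prompt_ids)
    for _ in range(n_new):
        if len(ids) >= MAX_SEQ_LEN:
            break
        ids.append(sample_top_k(next_token_scores(ids), k, rng))
    return ids

# Stand-in scorer that always favours token 2; a real one would call the model.
def dummy_scores(ids):
    return [0.0, 1.0, 5.0, 0.5]

print(generate(dummy_scores, [0], n_new=3, k=1))
```

With `k=1` the loop reduces to greedy decoding; larger `k` trades determinism for variety.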

Training-data licenses

Dataset                       License
Chordonomicon                 Public (user-generated)
McGill Billboard              CC0
Jazz Harmony Treebank         Public
JazzStandards (iReal Pro)     Community redistribution
Weimar Jazz Database          ODbL
JAAH                          Research-use public

Citation

Preprint: arXiv:2605.04998.

@misc{lee2026chordmix,
  title         = {Empirical Study of Pop and Jazz Mix Ratios for Genre-Adaptive Chord Generation},
  author        = {Lee, Jinju},
  year          = {2026},
  eprint        = {2605.04998},
  archivePrefix = {arXiv}
}