metadata
library_name: symupe
license: cc-by-nc-sa-4.0
base_model:
- SyMuPe/Aria-MIDI-MLM
datasets:
- SyMuPe/PianoCoRe
tags:
- music
- piano
- midi
- classification
- quality-assessment
- transformer
SyMuPe: MIDI Quality Classifier
MIDI-Quality-Classifier is a model trained to automatically assess the quality of symbolic piano performances. It classifies MIDI files into four distinct categories: score (inexpressive/rendered), high quality, low quality, and corrupted.
The model was introduced in the paper "PianoCoRe: Combined and Refined Piano MIDI Dataset".
- SyMuPe Repo: https://github.com/ilya16/SyMuPe
- Project Repo: https://github.com/ilya16/PianoCoRe
- Dataset: https://huggingface.co/datasets/SyMuPe/PianoCoRe
Architecture
- Type: Transformer Encoder
- Backbone: 12-layer Transformer (80M parameters) pre-trained on the deduplicated subset of the Aria-MIDI dataset with a Multi-Mask Language Modeling (mMLM) objective.
- Classification Module: A single Transformer layer and a classification head.
- Objective: Sequence Classification (4 classes).
- Inputs (score-agnostic): Pitch, Velocity, TimeShift, Duration, absolute TimePosition
- Classes:
  - Score (S): Rendered or synthesized scores with constant tempo/dynamics.
  - High Quality (HQ): Clean expressive performances (recorded or high-fidelity transcriptions).
  - Low Quality (LQ): Transcriptions with noticeable noise or minor errors.
  - Corrupted (C): Broken files or severely failed transcriptions.
- Training: Trained for 20,000 iterations on a subset created from the PianoCoRe dataset, as described in the paper.
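To make the four-class objective concrete, the sketch below turns a vector of four class logits into probabilities and a label using a softmax followed by an argmax. The class order in `CLASS_CODES`, the helper names, and the example logits are assumptions made for illustration; the released model's actual label order is exposed as `classifier.labels`.

```python
import math

# Assumed order, for illustration only; check classifier.labels for the real one.
CLASS_CODES = ["S", "HQ", "LQ", "C"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits, labels=CLASS_CODES):
    """Return (label, probabilities) for one vector of class logits."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per class")
    probabilities = softmax(logits)
    best = max(range(len(labels)), key=probabilities.__getitem__)
    return labels[best], probabilities


label, probs = predict([0.2, 2.5, 1.0, -1.3])  # made-up logits
print(label, [round(p, 3) for p in probs])
```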
Quick Start
Before using this model, ensure you have the symupe library installed (pip install -U symupe).
import torch
from symupe import AutoClassifier
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Build Classifier by loading the model and tokenizer directly from the Hub
classifier = AutoClassifier.from_pretrained(
    "SyMuPe/MIDI-Quality-Classifier", device=device
)
# model, tokenizer, labels = classifier.model, classifier.tokenizer, classifier.labels
# Classify a MIDI file
result = classifier("performance.mid")
# result is MusicClassificationResult(...) containing:
# - midi, seq, probabilities, prediction, label, all_logits, all_probabilities, all_predictions,
# sequences and window_indices
print(f"Predicted Label: {result.label}")
print(f"Probabilities: {result.probabilities}")
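A common use of the classifier is sorting a collection of files by predicted quality. The sketch below groups paths by `result.label` and relies only on the call pattern shown above (`classifier(path)` returning an object with a `label` attribute). The helper name `group_by_label`, the `"<error>"` bucket, and the `midi/` folder are hypothetical and not part of the symupe library.

```python
from collections import defaultdict
from pathlib import Path


def group_by_label(paths, classify):
    """Group MIDI paths by predicted label.

    `classify` is any callable that takes a path string and returns an
    object with a `.label` attribute, e.g. the `classifier` built above.
    """
    groups = defaultdict(list)
    for path in paths:
        try:
            label = classify(str(path)).label
        except Exception:
            # Hypothetical bucket for files that could not be processed.
            label = "<error>"
        groups[label].append(Path(path))
    return dict(groups)


# Usage (assumes `classifier` from the Quick Start and a local folder "midi/"):
# groups = group_by_label(sorted(Path("midi").glob("*.mid")), classifier)
# for label, files in groups.items():
#     print(label, len(files))
```

Passing the classifier in as a callable keeps the helper independent of the model, so it can be exercised with a stub before running on a large collection.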
License
The model weights are distributed under the CC-BY-NC-SA 4.0 license.
Citation
If you use this model or the associated dataset in your research, please cite:
@inproceedings{borovik2025symupe,
title = {{SyMuPe: Affective and Controllable Symbolic Music Performance}},
author = {Borovik, Ilya and Gavrilev, Dmitrii and Viro, Vladimir},
year = {2025},
booktitle = {Proceedings of the 33rd ACM International Conference on Multimedia},
pages = {10699--10708},
doi = {10.1145/3746027.3755871}
}
@article{borovik2026pianocore,
title = {{PianoCoRe: Combined and Refined Piano MIDI Dataset}},
author = {Borovik, Ilya},
year = {2026},
journal = {Transactions of the International Society for Music Information Retrieval},
volume = {9},
number = {1},
pages = {144--163},
doi = {10.5334/tismir.333}
}