MuQ-Eval: An Open-Source Per-Sample Quality Metric for AI Music Generation Evaluation
Paper • 2603.22677 • Published
Multi-head neural evaluator for music generation quality, built on frozen MuQ representations with learned attention pooling and per-dimension MLP prediction heads.
Backbone: OpenMuQ/MuQ-large-msd-iter (tuning mode: frozen). Loss: MSE. Evaluated with 5-fold cross-validation on MusicEval (2,748 clips, 31 TTM systems).
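To make the architecture description concrete, here is a minimal sketch of learned attention pooling with per-dimension MLP heads on top of frozen frame-level features. All class names, head names, and sizes here are illustrative assumptions, not the repository's actual API.

```python
# Hypothetical sketch: frozen backbone features -> learned-query attention
# pooling -> one small MLP head per quality dimension. Names and dimensions
# are assumptions for illustration only.
import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    """Pool frame-level features [B, T, D] into a clip embedding [B, D]."""
    def __init__(self, dim: int):
        super().__init__()
        self.query = nn.Parameter(torch.randn(dim))  # learned pooling query
        self.key = nn.Linear(dim, dim)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        scores = self.key(feats) @ self.query          # [B, T] attention logits
        weights = scores.softmax(dim=1)                # normalize over time
        return (weights.unsqueeze(-1) * feats).sum(dim=1)

class QualityHeads(nn.Module):
    """One MLP head per quality dimension on the pooled embedding."""
    def __init__(self, dim: int, dims=("overall", "musicality")):
        super().__init__()
        self.pool = AttentionPool(dim)
        self.heads = nn.ModuleDict({
            name: nn.Sequential(nn.Linear(dim, dim // 2), nn.ReLU(),
                                nn.Linear(dim // 2, 1))
            for name in dims
        })

    def forward(self, feats: torch.Tensor) -> dict:
        pooled = self.pool(feats)
        return {name: head(pooled).squeeze(-1) for name, head in self.heads.items()}

heads = QualityHeads(dim=1024)
out = heads(torch.randn(2, 100, 1024))  # 2 clips, 100 frames of backbone features
```

Because the backbone stays frozen, only the pooling query and the small heads are trained, which keeps the evaluator cheap to fit per fold.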
```python
import torch
import torchaudio
from omegaconf import OmegaConf
from huggingface_hub import hf_hub_download

# Download the config and checkpoint from the Hub
config_path = hf_hub_download("zhudi2825/MuQ-Eval-A1", "config.yaml")
model_path = hf_hub_download("zhudi2825/MuQ-Eval-A1", "best_model.pt")

# Load the config and build the model (requires the repository's src/ package)
cfg = OmegaConf.load(config_path)
from src.model import MusicQualityModel
model = MusicQualityModel(cfg)
ckpt = torch.load(model_path, map_location="cpu", weights_only=False)
model.load_state_dict(ckpt["model_state"])
model.eval()

# Run inference: 24 kHz mono input, truncated to 10 s (240,000 samples)
waveform, sr = torchaudio.load("audio.wav")
if sr != 24000:
    waveform = torchaudio.transforms.Resample(sr, 24000)(waveform)
waveform = waveform.mean(0)                # downmix to mono
waveform = waveform[:240000].unsqueeze(0)  # [1, samples]

with torch.no_grad():
    preds = model(waveform)
scores = model._last_expected_scores
for name, score in scores.items():
    print(f"{name}: {score.item():.2f}")
```
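For clips longer than the 10-second window used above, one simple option is to score fixed-size chunks and average per dimension. This is a hedged sketch, not part of the repository: `score_clip` is a hypothetical stand-in for the model call shown earlier.

```python
# Hypothetical helper: score a long mono waveform in 10 s chunks and average
# the per-dimension scores. `score_clip` is an assumed callable wrapping the
# model inference shown above; it is not part of the repository's API.
import torch

CHUNK = 240_000  # 10 s at 24 kHz, matching the truncation above

def score_long_audio(waveform: torch.Tensor, score_clip) -> dict:
    """waveform: mono [samples]; score_clip: fn([1, samples]) -> {name: tensor}."""
    totals, n = {}, 0
    for start in range(0, waveform.numel(), CHUNK):
        chunk = waveform[start:start + CHUNK]
        if chunk.numel() < CHUNK // 2:  # skip a very short trailing chunk
            break
        preds = score_clip(chunk.unsqueeze(0))
        for name, s in preds.items():
            totals[name] = totals.get(name, 0.0) + float(s)
        n += 1
    return {name: total / max(n, 1) for name, total in totals.items()}
```

Averaging chunk scores is only an approximation of a clip-level rating; whether it matches listener judgments on long-form audio is untested here.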
If you use this model, please cite:
```bibtex
@article{zhu2026muqeval,
  title={Frozen Music Representations Suffice for Per-Sample Quality Prediction of Generated Music},
  author={Zhu, Di and Li, Zixuan},
  journal={arXiv preprint arXiv:2603.22677},
  year={2026}
}
```