MuQ-Eval: An Open-Source Per-Sample Quality Metric for AI Music Generation Evaluation
Paper • 2603.22677 • Published
Multi-head neural evaluator for music generation quality, built on frozen MuQ representations with learned attention pooling and per-dimension MLP prediction heads.
Backbone: OpenMuQ/MuQ-large-msd-iter (tuning mode: frozen). Loss: MSE. Evaluated with 5-fold cross-validation on MusicEval (2,748 clips, 31 TTM systems).
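To make the architecture description concrete, here is a minimal sketch of learned attention pooling with per-dimension MLP heads on top of frozen frame-level features. All class names, head names, and sizes here are illustrative assumptions, not the repository's actual API.

```python
# Hypothetical sketch: frozen backbone features -> learned-query attention
# pooling -> one small MLP head per quality dimension. Names and dimensions
# are assumptions for illustration only.
import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    """Pool frame-level features [B, T, D] into a clip embedding [B, D]."""
    def __init__(self, dim: int):
        super().__init__()
        self.query = nn.Parameter(torch.randn(dim))  # learned pooling query
        self.key = nn.Linear(dim, dim)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        scores = self.key(feats) @ self.query          # [B, T] attention logits
        weights = scores.softmax(dim=1)                # normalize over time
        return (weights.unsqueeze(-1) * feats).sum(dim=1)

class QualityHeads(nn.Module):
    """One MLP head per quality dimension on the pooled embedding."""
    def __init__(self, dim: int, dims=("overall", "musicality")):
        super().__init__()
        self.pool = AttentionPool(dim)
        self.heads = nn.ModuleDict({
            name: nn.Sequential(nn.Linear(dim, dim // 2), nn.ReLU(),
                                nn.Linear(dim // 2, 1))
            for name in dims
        })

    def forward(self, feats: torch.Tensor) -> dict:
        pooled = self.pool(feats)
        return {name: head(pooled).squeeze(-1) for name, head in self.heads.items()}

heads = QualityHeads(dim=1024)
out = heads(torch.randn(2, 100, 1024))  # 2 clips, 100 frames of backbone features
```

Because the backbone stays frozen, only the pooling query and the small heads are trained, which keeps the evaluator cheap to fit per fold.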
```python
import torch
import torchaudio
from omegaconf import OmegaConf
from huggingface_hub import hf_hub_download

# Download the config and checkpoint from the Hub
config_path = hf_hub_download("zhudi2825/MuQ-Eval-A1", "config.yaml")
model_path = hf_hub_download("zhudi2825/MuQ-Eval-A1", "best_model.pt")

# Load the config and build the model (requires the repository's src/ package)
cfg = OmegaConf.load(config_path)
from src.model import MusicQualityModel
model = MusicQualityModel(cfg)
ckpt = torch.load(model_path, map_location="cpu", weights_only=False)
model.load_state_dict(ckpt["model_state"])
model.eval()

# Run inference: 24 kHz mono input, truncated to 10 s (240,000 samples)
waveform, sr = torchaudio.load("audio.wav")
if sr != 24000:
    waveform = torchaudio.transforms.Resample(sr, 24000)(waveform)
waveform = waveform.mean(0)                # downmix to mono
waveform = waveform[:240000].unsqueeze(0)  # [1, samples]

with torch.no_grad():
    preds = model(waveform)
scores = model._last_expected_scores
for name, score in scores.items():
    print(f"{name}: {score.item():.2f}")
```
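For clips longer than the 10-second window used above, one simple option is to score fixed-size chunks and average per dimension. This is a hedged sketch, not part of the repository: `score_clip` is a hypothetical stand-in for the model call shown earlier.

```python
# Hypothetical helper: score a long mono waveform in 10 s chunks and average
# the per-dimension scores. `score_clip` is an assumed callable wrapping the
# model inference shown above; it is not part of the repository's API.
import torch

CHUNK = 240_000  # 10 s at 24 kHz, matching the truncation above

def score_long_audio(waveform: torch.Tensor, score_clip) -> dict:
    """waveform: mono [samples]; score_clip: fn([1, samples]) -> {name: tensor}."""
    totals, n = {}, 0
    for start in range(0, waveform.numel(), CHUNK):
        chunk = waveform[start:start + CHUNK]
        if chunk.numel() < CHUNK // 2:  # skip a very short trailing chunk
            break
        preds = score_clip(chunk.unsqueeze(0))
        for name, s in preds.items():
            totals[name] = totals.get(name, 0.0) + float(s)
        n += 1
    return {name: total / max(n, 1) for name, total in totals.items()}
```

Averaging chunk scores is only an approximation of a clip-level rating; whether it matches listener judgments on long-form audio is untested here.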
If you use this model, please cite:
```bibtex
@article{zhu2026muqeval,
  title={Frozen Music Representations Suffice for Per-Sample Quality Prediction of Generated Music},
  author={Zhu, Di and Li, Zixuan},
  journal={arXiv preprint arXiv:2603.22677},
  year={2026}
}
```