MusicQualityModel: A1_frozen_mlp

Multi-head neural evaluator for music generation quality, built on frozen MuQ representations with learned attention pooling and per-dimension MLP prediction heads.

Model Details

  • Encoder: OpenMuQ/MuQ-large-msd-iter (tuning mode: frozen)
  • Pooling: Attention-weighted mean pooling
  • Heads: MI (musical impression), TA (textual alignment)
  • Loss: MSE
  • Input: Audio waveform at 24000 Hz, max 10.0 s (240,000 samples)
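
The pooling step listed above can be illustrated with a minimal sketch. It is written in plain Python so the arithmetic is visible, and it is not the model's actual implementation: the function name `attention_pool` and the single learned scoring vector `weights` are assumptions made for illustration. The model card only states that pooling is an attention-weighted mean over the frozen encoder's frame embeddings.

```python
import math

def attention_pool(frames, weights):
    """Pool a [T][D] sequence of frame embeddings into one [D] vector.

    `weights` is a hypothetical learned scoring vector of length D.
    """
    # One scalar relevance score per frame (dot product with the scoring vector).
    scores = [sum(f * w for f, w in zip(frame, weights)) for frame in frames]
    # Numerically stable softmax over the time axis.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]
    # Attention-weighted mean of the frames, one output value per dimension.
    dim = len(frames[0])
    pooled = [sum(a * frame[d] for a, frame in zip(attn, frames)) for d in range(dim)]
    return pooled, attn

# Toy input: 3 frames of dimension 2. Zero weights give uniform attention,
# so the result reduces to the ordinary mean over frames.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, attn = attention_pool(frames, [0.0, 0.0])
```

In a setup like this, each prediction head (MI, TA) would take the pooled vector as input to its own MLP. Whether the heads share one pooling layer or each has its own is not stated in this card.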

Performance

Evaluated with 5-fold cross-validation on MusicEval (2,748 clips from 31 text-to-music (TTM) systems).
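
As a rough illustration of the evaluation protocol, the sketch below assigns clips to five folds so that each TTM system is spread evenly across folds. This is one reading of "stratified by TTM model"; the actual split procedure is not described in this card. The function `stratified_folds`, the `clip_systems` mapping, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_folds(clip_systems, n_folds=5, seed=0):
    """Map each clip id to a fold so every system is spread over all folds.

    `clip_systems` is a hypothetical dict {clip_id: system_name}.
    """
    by_system = defaultdict(list)
    for clip_id, system in clip_systems.items():
        by_system[system].append(clip_id)

    rng = random.Random(seed)
    fold_of = {}
    for system in sorted(by_system):
        clips = sorted(by_system[system])
        rng.shuffle(clips)
        # Deal clips round-robin so each fold gets a near-equal share of this system.
        for i, clip_id in enumerate(clips):
            fold_of[clip_id] = i % n_folds
    return fold_of

# Toy data: 3 systems with 10 clips each (not the real MusicEval split).
toy = {f"{s}_{i}": s for s in ("sysA", "sysB", "sysC") for i in range(10)}
folds = stratified_folds(toy)
```

For fold `k`, the clips with `folds[clip_id] == k` form the held-out set and the remaining clips form the training set.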

Usage

import torch
import torchaudio
from omegaconf import OmegaConf
from huggingface_hub import hf_hub_download

# Download files
config_path = hf_hub_download("zhudi2825/MuQ-Eval-A1", "config.yaml")
model_path = hf_hub_download("zhudi2825/MuQ-Eval-A1", "best_model.pt")

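# Load config and build model
# (MusicQualityModel is imported from the project's source tree,
# so this assumes the `src` package is on the Python path)
cfg = OmegaConf.load(config_path)
from src.model import MusicQualityModel
model = MusicQualityModel(cfg)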

# weights_only=False unpickles arbitrary Python objects; only load checkpoints you trust
ckpt = torch.load(model_path, map_location="cpu", weights_only=False)
model.load_state_dict(ckpt["model_state"])
model.eval()

# Run inference
waveform, sr = torchaudio.load("audio.wav")
if sr != 24000:
    waveform = torchaudio.transforms.Resample(sr, 24000)(waveform)
waveform = waveform.mean(0)  # downmix channels to mono
waveform = waveform[:240000].unsqueeze(0)  # keep at most 10 s -> [1, samples]

with torch.no_grad():
    preds = model(waveform)
    scores = model._last_expected_scores
    for name, score in scores.items():
        print(f"{name}: {score.item():.2f}")

Training

  • Dataset: MusicEval (BAAI/MusicEval), 5-fold stratified CV by TTM model
  • Epochs: 30
  • Batch size: 16
  • Optimizer: AdamW (lr=0.001, wd=0.01)
  • Scheduler: cosine with 500 warmup steps
  • Precision: bf16
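
To make the scheduler setting concrete, here is a minimal sketch of a learning-rate curve with 500 warmup steps followed by cosine decay, using the base rate of 0.001 listed above. Several details are assumptions not stated in this card: that warmup is linear from zero, that the cosine decays to zero, and that the schedule advances once per optimizer step. The function name `lr_at` and the value of `TOTAL_STEPS` are hypothetical.

```python
import math

def lr_at(step, total_steps, base_lr=1e-3, warmup_steps=500):
    """Learning rate at `step` for linear warmup followed by cosine decay to zero."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / warmup_steps
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

TOTAL_STEPS = 4000  # hypothetical; the real value depends on fold size, batch size, and epochs
for step in (0, 250, 500, 2250, 4000):
    print(step, f"{lr_at(step, TOTAL_STEPS):.6f}")
```

The rate peaks at the end of warmup and falls to half the base rate at the midpoint of the decay phase.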

Citation

If you use this model, please cite:

@article{zhu2026muqeval,
  title={Frozen Music Representations Suffice for Per-Sample Quality Prediction of Generated Music},
  author={Zhu, Di and Li, Zixuan},
  journal={arXiv preprint arXiv:2603.22677},
  year={2026}
}