PredANNpp-Pretrain-Entropy-ctx16-ep10000-seed42

Model description

This repository contains a PredANN++ PyTorch Lightning checkpoint for EEG-based music representation learning and/or song identification.

  • Canonical repository: Shogo-Noguchi/PredANNpp-Pretrain-Entropy-ctx16-ep10000-seed42
  • Checkpoint file: predannpp_pretrain_entropy_ctx16_ep10000_seed42.ckpt
  • Stage: pretrain-only
  • Target / teacher representation: Entropy (ctx16)
  • Architecture: EncoderDecoder multitask
  • Random seed: 42
  • SHA256: 7779e76f065f75b8049861c794de77db9343f452e469398b42f153a17ca89d12

This is a multitask pretraining checkpoint, not an encoder-only finetuned Song ID classifier. It keeps the encoder/decoder components needed for masked teacher-token prediction, plus the auxiliary Song ID pathway used during multitask training.
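One way to check which components a checkpoint file carries is to group its parameter names by top-level prefix. The sketch below assumes the file is a standard PyTorch Lightning checkpoint, that is, a dict with a `state_dict` entry. The helper names are illustrative and are not part of the PredANN++ code; the actual prefixes depend on the PredANN++ module definitions.

```python
from collections import Counter


def count_top_level_prefixes(keys):
    """Count parameter names by their first dotted component."""
    return Counter(key.split(".", 1)[0] for key in keys)


def summarize_checkpoint(path):
    """Load a Lightning checkpoint on CPU and summarize its state_dict keys.

    Assumes the standard Lightning layout with a "state_dict" entry.
    weights_only=False unpickles arbitrary objects, so use it only on
    files you trust (for example after verifying the SHA256 above).
    """
    import torch  # imported lazily so the helper above works without PyTorch

    ckpt = torch.load(path, map_location="cpu", weights_only=False)
    return count_top_level_prefixes(ckpt["state_dict"].keys())


if __name__ == "__main__":
    # Hypothetical key names, for illustration only.
    demo_keys = ["encoder.layer0.weight", "encoder.layer0.bias", "decoder.out.weight"]
    print(count_top_level_prefixes(demo_keys))
```

Calling `summarize_checkpoint("predannpp_pretrain_entropy_ctx16_ep10000_seed42.ckpt")` on a downloaded copy should list the prefixes for the encoder, decoder, and Song ID pathway, whatever they are named in the actual module.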

The repository name intentionally omits NMEDT because this checkpoint is positioned as a general-purpose representation pretraining checkpoint. The card still discloses NMED-T as the training data for provenance.

Capabilities

This checkpoint supports masked prediction of MusicGen Entropy token sequences, together with the auxiliary Song ID classification used during multitask training. For direct 3-second EEG to Song ID inference, use an EncoderOnly finetuned checkpoint instead.

Input and output

  • Input EEG: 128 channels, 125 Hz, 3-second segments, following the PredANN++ / NMED-T preprocessing pipeline.
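  • Output: depends on stage. This pretraining checkpoint exposes the multitask pretraining outputs (masked teacher-token predictions and the auxiliary Song ID pathway). By contrast, the full-scratch checkpoint in the PredANN++ collection outputs 10-class Song ID logits.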
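The input specification implies 125 Hz × 3 s = 375 samples per segment, so each segment has shape (128, 375). As a minimal sketch, the helper below computes window boundaries for a continuous recording. Non-overlapping windows are an assumption made here for illustration; the function name and the `hop_seconds` parameter are hypothetical, and the PredANN++ preprocessing pipeline remains the reference for the actual segmentation.

```python
def segment_bounds(n_samples, sfreq=125, seconds=3.0, hop_seconds=None):
    """Return (start, stop) sample indices of fixed-length windows.

    With the defaults, each window spans 125 Hz * 3 s = 375 samples.
    hop_seconds=None means non-overlapping windows (an assumption).
    """
    win = int(round(sfreq * seconds))
    hop = win if hop_seconds is None else int(round(sfreq * hop_seconds))
    if win <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    # Trailing samples that do not fill a whole window are dropped.
    return [(start, start + win) for start in range(0, n_samples - win + 1, hop)]


if __name__ == "__main__":
    # An 8-second recording at 125 Hz yields two full 3-second windows.
    print(segment_bounds(n_samples=1000))
```

Each `(start, stop)` pair can then be used to slice a `(128, n_samples)` array along the time axis.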

Training data

  • Dataset: NMED-T (Naturalistic Music EEG Dataset – Tempo), 10 songs, 20 subjects, trial=1, as used in the PredANN++ experiments.
  • Teacher / target source: MusicGen Entropy token sequences.

Training procedure

Multitask pretraining for 10000 epochs with 50% masking, seed 42. No downstream finetuning checkpoint is included in this repository.
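To make the 50% masking ratio concrete, the sketch below draws a boolean mask over token positions with exactly half of them masked. The card states only the ratio and the seed, so uniform random selection of positions is an assumption here, and `make_mask` is a hypothetical helper. The seed argument only illustrates determinism; it will not reproduce the masks drawn during training.

```python
import random


def make_mask(seq_len, mask_ratio=0.5, seed=42):
    """Return a list of booleans, True where a token position is masked.

    Positions are chosen uniformly at random without replacement
    (an assumption; the training code may use a different scheme).
    """
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be in [0, 1]")
    n_masked = int(round(seq_len * mask_ratio))
    rng = random.Random(seed)  # local generator, leaves global state untouched
    masked = set(rng.sample(range(seq_len), n_masked))
    return [i in masked for i in range(seq_len)]


if __name__ == "__main__":
    mask = make_mask(seq_len=16)
    print(sum(mask), "of", len(mask), "positions masked")
```

During masked prediction, the model would be asked to predict the teacher tokens only at the positions where the mask is True.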

Intended use

Continuing pretraining, finetuning for NMED-T Song ID or other downstream EEG tasks, and reproducing PredANN++ pretraining ablations.

Not intended use

  • Medical diagnosis, clinical decision making, or biometric identification.
  • Commercial use without checking the PredANN++ code license, NMED-T terms, and upstream model/feature licenses.
  • Immediate Song ID inference as a final classifier without loading the correct multitask module or performing downstream finetuning.

License and upstream dependencies

The teacher features are derived from MusicGen / AudioCraft. The checkpoint is released under CC-BY-NC-4.0, for compatibility with the existing PredANN++ Hugging Face collection.

Reproducibility notes

  • The original source path at release time was: /data/Backup_AkamaUbuntu/home/sony_csl/workspace/noguchi/work/mind-model/Surprisal_Model/codes_3s/best_checkpoints/newMF_ctx16/multitask/EntropyMultitask_newMF/SongAcc/last.ckpt.
  • metadata.json stores the standardized release metadata.
  • SHA256SUMS stores the checkpoint checksum.
  • Use the PredANN++ GitHub repository for model definitions and evaluation scripts.
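A downloaded checkpoint can be checked against the published SHA256 before loading it. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and the expected digest is the one listed in this card.

```python
import hashlib

# Digest published in this model card.
EXPECTED_SHA256 = "7779e76f065f75b8049861c794de77db9343f452e469398b42f153a17ca89d12"


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checkpoint(path, expected=EXPECTED_SHA256):
    """Return True if the file's digest matches the expected value."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        print("OK" if verify_checkpoint(sys.argv[1]) else "MISMATCH")
```

Reading in chunks keeps memory use flat regardless of checkpoint size.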

Links

Citation

If you use this checkpoint, cite the PredANN++ paper and the NMED-T dataset. For the MusicGen-derived teacher features, also cite the relevant upstream model or toolkit (MusicGen / AudioCraft).

@misc{noguchi2026expectationacousticneuralnetwork,
  title={Expectation and Acoustic Neural Network Representations Enhance Music Identification from Brain Activity},
  author={Shogo Noguchi and Taketo Akama and Tai Nakamura and Shun Minamikawa and Natalia Polouliakh},
  year={2026},
  eprint={2603.03190},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2603.03190}
}