Pulse88-40M-Alpha-Preview Architectural Variant E is a high-efficiency, 40.8-million-parameter causal piano continuation model trained on 86k pieces from the Godzilla MIDI Dataset. It uses a hybrid architecture that combines Gated Delta Networks (GDN) with sparse Grouped-Query Attention (GQA) anchors to achieve long-context musical coherence.
Key Specifications
- Architecture: Hybrid Gated Delta Network + Sparse GQA
- Parameters: 40,784,528 (~40.8M)
- Vocabulary: 171-token event vocabulary (delta onset, pitch, duration, velocity)
- Context Window: 2048 tokens (512 seed / 1536 continuation)
- Training Data: Godzilla MIDI Dataset Piano Subset (86k piano pieces)
Variant E 40M Architecture Summary
Variant E is a decoder-only autoregressive piano MIDI model built on a custom 171-token event vocabulary with event quads (delta, pitch, duration, velocity) and event size 4. The 40M profile uses d_model 640, 13 layers, and a 2048-token context window (512 seed plus 1536 continuation). Each layer is pre-norm residual and stacks two Gated Delta Net blocks, with sparse grouped-query attention anchors inserted every 2 layers and always in the final layer. In the 13-layer profile this gives attention anchors at layers 2, 4, 6, 8, 10, 12, and 13. Token embedding and output head are tied, dropout is 0.1, and output logits use 1/sqrt(d_model) scaling.
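The anchor-placement rule described above (an attention anchor every 2 layers, plus always the final layer) can be sketched as a small helper; the function name is illustrative, not taken from the training code:

```python
def attention_anchor_layers(n_layers: int, every: int = 2) -> list[int]:
    """Return the 1-indexed layers that receive a sparse GQA anchor:
    every `every` layers, with the final layer always included."""
    anchors = set(range(every, n_layers + 1, every))
    anchors.add(n_layers)  # the last layer is always an anchor
    return sorted(anchors)
```

For the 13-layer profile this reproduces the anchor set quoted above: layers 2, 4, 6, 8, 10, 12, and 13.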
For the 40M shape, GDN runs with inner_dim 320 and 4 heads; attention runs with 8 query heads and grouped KV sharing (group size 4, effective KV heads 2). The training notebook enforces strict real GDN kernels (flash-linear-attention required) and blocks fallback when strict mode is enabled. Training is configured for pretokenized NPZ manifests of up to 100k pieces, with AdamW (learning rate 2e-4, cosine decay, dynamic warmup resolution, weight decay 0.01), label smoothing 0.1, and max grad norm 1.0. The run is set up for dual-T4 DDP on Kaggle (one process per GPU), with a checkpoint resume flow to support training across two sessions.
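The shape numbers above can be collected into a single config sketch, which also shows how the effective KV head count follows from the query heads and group size. Field names here are illustrative and assumed, not the training notebook's own:

```python
from dataclasses import dataclass

# Hedged sketch of the 40M profile, collecting the hyperparameters quoted
# above. This is not the actual config class from the training code.


@dataclass
class Pulse88Config:
    vocab_size: int = 171
    d_model: int = 640
    n_layers: int = 13
    context_window: int = 2048   # 512 seed + 1536 continuation
    gdn_inner_dim: int = 320
    gdn_heads: int = 4
    attn_q_heads: int = 8
    attn_group_size: int = 4     # query heads sharing one KV head

    @property
    def attn_kv_heads(self) -> int:
        # Grouped-query attention: KV heads = query heads / group size.
        return self.attn_q_heads // self.attn_group_size
```

With the defaults above, `Pulse88Config().attn_kv_heads` yields the 2 effective KV heads stated in the summary.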
Training
See training_logs.txt for exact loss numbers.
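The learning-rate schedule used in training (cosine decay with warmup, as noted in the architecture summary) can be sketched as follows. The notebook's "dynamic warmup resolution" logic is not reproduced here; a fixed warmup step count stands in for it:

```python
import math


def lr_at(step: int, max_steps: int, warmup_steps: int,
          base_lr: float = 2e-4) -> float:
    """Linear warmup to base_lr, then cosine decay to zero.
    A simplified stand-in for the run's actual schedule."""
    if step < warmup_steps:
        # Linear ramp over the warmup window.
        return base_lr * (step + 1) / warmup_steps
    # Cosine decay over the remaining steps.
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```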

Dataset
The model was trained on the Godzilla MIDI Dataset, specifically 86,000 pieces from the piano subset. This dataset, created by Project Los Angeles (Aleksandr Lev), provides a massive and diverse corpus of MIDI data that allows the model to learn complex harmonic structures and temporal continuity.
Demo
Bluebird Continuation
Single Note Continuation
In this generation the model was given only a single note (C4).
For optimal results, a longer seed is recommended.
---
Other Generations
God Rest Ye Merry Gentlemen Continuation
Continuation of a simple motif
Sabrina by John Williams
Wii Channel Continuation
Audio rendered with Advanced MIDI Renderer
Intended Use
- Research and experimentation in symbolic piano continuation
- Evaluation of GDN + sparse attention design choices
Limitations
- The model is limited to piano-only MIDI data and does not generalize to multi-instrument compositions.
- Performance degrades with very short or highly irregular input seeds.
- The model may produce repetitive or unstable outputs over long continuations.
- As an alpha preview, the model has not been extensively optimized for musical quality or stylistic control.
Project Future and Purpose
The purpose of this project is to research novel architectures in the symbolic-music machine learning space. This release is a small-scale preview of what is to come; I plan to keep developing the architecture to stay on the bleeding edge.
Warranty
This model is intended for research purposes only. It is provided “as is,” without any warranties, express or implied. The authors make no guarantees regarding its performance, reliability, or fitness for a particular purpose. Use at your own risk.
Citation & Credits
If you use this model, please credit the original data source:
@misc{GodzillaMIDIDataset2025,
  title = {Godzilla MIDI Dataset: Enormous, comprehensive, normalized and searchable MIDI dataset for MIR and symbolic music AI purposes},
  author = {Alex Lev},
  publisher = {Project Los Angeles / Tegridy Code},
  year = {2025},
  url = {https://huggingface.co/datasets/projectlosangeles/Godzilla-MIDI-Dataset}
}
@inproceedings{lev2026tegridytools,
  title = {tegridy-tools: Symbolic Music NLP Artificial Intelligence Toolkit},
  author = {Aleksandr Lev},
  booktitle = {GitHub},
  year = {2026},
}
