Refractor CDM

Refractor CDM (Compact Disc Module) is a lightweight MLP calibration head that classifies full-mix audio recordings into one of nine "rainbow colors", a chromatic taxonomy used in The Rainbow Table, an AI-assisted album series.

The CDM is a companion to the base Refractor ONNX model (a multimodal fusion network trained on short catalog segments). The base model works well for MIDI and short audio clips but predicts poorly on full-mix audio because CLAP embeddings are optimized for short segments. The CDM corrects this by training directly on chunked full-mix audio.

Model Details

Property        Value
Architecture    2-layer MLP (256 → 128 → 9)
Parameters      361,993
Input           CLAP audio (512-dim) + DeBERTa concept (768-dim) = 1280-dim
Output          Softmax probabilities over 9 colors (color_probs, shape [batch, 9])
Format          ONNX (refractor_cdm.onnx, 1.4 MB)
Training data   3,450 chunks from 78 full-mix songs across all 9 colors
Loss            CrossEntropyLoss with label smoothing (0.1) + inverse-frequency class weights
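
The stated parameter count is consistent with a fully connected stack of 1280 → 256 → 128 → 9. A quick check, assuming plain dense layers with bias terms and no normalization layers (this layout is inferred from the table, not confirmed by the source):

```python
# Layer widths: 1280-dim input, hidden sizes 256 and 128, 9 output classes.
# Assumption: every layer is a plain dense layer with a bias vector.
layer_sizes = [1280, 256, 128, 9]


def dense_param_count(sizes):
    """Sum weights (fan_in * fan_out) and biases (fan_out) over consecutive layers."""
    return sum(fan_in * fan_out + fan_out
               for fan_in, fan_out in zip(sizes, sizes[1:]))


total = dense_param_count(layer_sizes)
print(total)  # 361993

# At 4 bytes per float32 parameter this is about 1.45 MB,
# in line with the 1.4 MB ONNX file listed above.
size_mb = total * 4 / 1e6
```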

Color Classes

Index  Color    CHROMATIC_TARGETS (temporal / spatial / ontological)
  0    Red      Past-heavy / Thing-heavy / Known-heavy
  1    Orange   Present-heavy / Thing-heavy / Known-heavy
  2    Yellow   Present-heavy / Place-heavy / Known-heavy
  3    Green    Present-heavy / Place-heavy / Known-heavy  <- same targets as Yellow
  4    Blue     Future-heavy / Place-heavy / Forgotten-heavy
  5    Indigo   Future-heavy / Future-heavy / Forgotten-heavy
  6    Violet   Future-heavy / Future-heavy / Imagined-heavy
  7    White    Uniform across all axes
  8    Black    Present-heavy / Thing-heavy / Imagined-heavy

Validation Results

Evaluated on 78 labeled songs from staged_raw_material using 30s/5s-stride chunked scoring with confidence-weighted aggregation.

Color     Correct  Total  Accuracy
Red          11      12     91.7%
Orange        4       4    100.0%
Yellow       10      10    100.0%
Green         0       8      0.0%  ⚠️
Blue         11      11    100.0%
Indigo       10      11     90.9%
Violet       11      12     91.7%
White         0      10      0.0%  ⚠️
Overall      57      78     73.1%

Green (0%): every Green song was predicted as Yellow. This is pipeline-safe because Green and Yellow share identical CHROMATIC_TARGETS distributions, so downstream chromatic match and drift scores are unaffected.

White (0%): every White song was predicted as Yellow or Blue. White's uniform [0.33, 0.34, 0.33] targets differ meaningfully from both, so this is a known open issue. White albums are intentionally diverse musically, which makes them acoustically diffuse in CLAP's feature space.
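
The headline figure can be recomputed from the validation table. The sketch below also reports two derived numbers that are not part of the original results: accuracy when a Green-as-Yellow prediction is counted as pipeline-equivalent, and accuracy over the six colors other than Green and White. Both are arithmetic on the table, not additional measurements.

```python
# (correct, total) per color, copied from the validation table.
results = {
    "Red": (11, 12), "Orange": (4, 4), "Yellow": (10, 10), "Green": (0, 8),
    "Blue": (11, 11), "Indigo": (10, 11), "Violet": (11, 12), "White": (0, 10),
}


def accuracy(pairs):
    """Percentage correct over an iterable of (correct, total) pairs."""
    pairs = list(pairs)  # materialize so the iterable can be traversed twice
    correct = sum(c for c, _ in pairs)
    total = sum(t for _, t in pairs)
    return round(100 * correct / total, 1)


overall = accuracy(results.values())

# Count every Green song as correct, since all were predicted as Yellow
# and the two colors share the same targets.
green_equivalent = accuracy(
    (t, t) if name == "Green" else (c, t) for name, (c, t) in results.items()
)

# Restrict to the colors the model classifies reliably.
excluding_green_white = accuracy(
    pair for name, pair in results.items() if name not in ("Green", "White")
)

print(overall, green_equivalent, excluding_green_white)  # 73.1 83.3 95.0
```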

Usage

The CDM is used via the Refractor wrapper. It auto-loads when refractor_cdm.onnx is present alongside refractor.onnx.

from training.refractor import Refractor

scorer = Refractor()  # CDM auto-detected

# `waveform` holds the audio samples to score, loaded beforehand at the rate passed as `sr`.
result = scorer.score(
    audio_emb=scorer.prepare_audio(waveform, sr=48000),
    concept_emb=scorer.prepare_concept("A song about forgetting the future"),
)
# result: {"temporal": {...}, "spatial": {...}, "ontological": {...}, "confidence": 0.93}

For full-mix WAV files, use chunk_audio + aggregate_chunk_scores from score_mix.py to score in overlapping windows and pool results.
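
As a rough illustration of windowed scoring and pooling, here is a minimal pure-Python sketch. The helper names `chunk_starts` and `pool_chunk_probs` are hypothetical and are not the signatures of `chunk_audio` or `aggregate_chunk_scores` in score_mix.py. The 30 s window and 5 s stride come from the validation setup; treating a chunk's largest class probability as its confidence is an assumption.

```python
COLORS = ["Red", "Orange", "Yellow", "Green", "Blue",
          "Indigo", "Violet", "White", "Black"]


def chunk_starts(n_samples, sr, window_s=30.0, stride_s=5.0):
    """Start offsets (in samples) of overlapping windows; short clips yield one chunk."""
    window, stride = int(window_s * sr), int(stride_s * sr)
    if n_samples <= window:
        return [0]
    return list(range(0, n_samples - window + 1, stride))


def pool_chunk_probs(chunk_probs):
    """Confidence-weighted mean of per-chunk probability vectors.

    Assumption: a chunk's confidence is its largest class probability.
    """
    weights = [max(p) for p in chunk_probs]
    total_w = sum(weights)
    n_classes = len(chunk_probs[0])
    return [sum(w * p[i] for w, p in zip(weights, chunk_probs)) / total_w
            for i in range(n_classes)]


# A 60 s track at 48 kHz gives windows starting every 5 s from 0 s to 30 s.
starts = chunk_starts(60 * 48000, sr=48000)

# Toy per-chunk outputs, made up for illustration:
# one confident Blue chunk and one uninformative chunk.
confident_blue = [0.02, 0.02, 0.02, 0.02, 0.80, 0.04, 0.04, 0.02, 0.02]
vague = [1 / 9] * 9
pooled = pool_chunk_probs([confident_blue, vague])
top_color = COLORS[pooled.index(max(pooled))]
```

Weighting by confidence lets decisive chunks dominate the pooled result, so a few ambiguous passages in a long track do not wash out the prediction.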

Training

# Phase 1: extract CLAP + concept embeddings from staged_raw_material/
python training/extract_cdm_embeddings.py

# Phase 2: train on Modal (A10G GPU)
modal run training/modal_train_refractor_cdm.py

# Validate
python training/validate_mix_scoring.py
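
The loss listed under Model Details combines two standard techniques. The following is a minimal pure-Python sketch of both for a single example, assuming the common N / (K * n_c) form of inverse-frequency weighting and the usual label-smoothing target. The class counts are toy values, and the actual training script may compute or normalize the weights differently.

```python
import math


def inverse_frequency_weights(counts):
    """w_c = N / (K * n_c): rarer classes get proportionally larger weights."""
    n_total, n_classes = sum(counts), len(counts)
    return [n_total / (n_classes * n) for n in counts]


def smoothed_target(label, n_classes, eps=0.1):
    """(1 - eps) on the true class plus eps / K spread over every class."""
    return [(1.0 - eps) * (i == label) + eps / n_classes
            for i in range(n_classes)]


def weighted_smoothed_ce(probs, label, weights, eps=0.1):
    """Per-example loss -sum_c w_c * q_c * log(p_c), before any batch reduction."""
    target = smoothed_target(label, len(probs), eps)
    return -sum(w * q * math.log(p) for w, q, p in zip(weights, target, probs))


counts = [10, 20, 70]  # toy chunk counts per class, not the real distribution
weights = inverse_frequency_weights(counts)
loss = weighted_smoothed_ce([0.7, 0.2, 0.1], label=0, weights=weights)
```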

Limitations

  • CLAP embeddings have a maximum internal window of ~10s; chunked scoring is essential for full-length tracks
  • Green and White classification is unreliable (see validation results above)
  • Training data is drawn from a single artist's catalog, so generalization to other music is untested
  • The concept embedding path requires a DeBERTa-v3-base inference pass (~600 MB model)

Citation

Part of The Rainbow Table generative music pipeline. See brotherclone/white and earthlyframes/white-training-data.
