# Music Descriptor Module 3 v2
Dual-stream cross-attention model over:

- M1 DeBERTa-small `scene_vector` (256-d) from `wrathofgod/scene-perception-m1-unfreeze-deberta-small`
- M2 small-BERT narrative `context_vector` (256-d) from `wrathofgod/narrative-context-m2`
## Architecture improvements over v1
| Feature | v1 | v2 |
|---|---|---|
| Fusion | cat+Linear(512→256) | CrossAttentionFusion (4-head, bidirectional) |
| Head depth | 2-layer | 3-layer residual |
| Orch threshold | fixed 0.5 | learned per-instrument (14 params) |
| Aux supervision | none | M2 tension/arousal/valence (weight=0.15) |
| Label smoothing | no | ε=0.1 on all CLS heads |
| LR schedule | CosineAnnealing | Warmup(3ep) + CosineAnnealing |
| SWA | no | last 5 epochs |
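The bidirectional 4-head fusion in the table can be sketched as follows. This is a minimal illustration, not the released code: the module name matches the table, but the single-token sequence treatment, the averaging of the two attended streams, and the final LayerNorm are assumptions.

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Bidirectional 4-head cross-attention over two 256-d stream vectors.

    Sketch under assumptions: each stream attends to the other as a
    length-1 sequence; the two attended outputs are averaged and normalized.
    """

    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.scene_to_ctx = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ctx_to_scene = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, scene: torch.Tensor, ctx: torch.Tensor) -> torch.Tensor:
        # Treat each 256-d vector as a length-1 sequence: (B, 1, 256).
        s, c = scene.unsqueeze(1), ctx.unsqueeze(1)
        s_att, _ = self.scene_to_ctx(s, c, c)  # scene queries context
        c_att, _ = self.ctx_to_scene(c, s, s)  # context queries scene
        # Average the two directions and normalize back to (B, 256).
        return self.norm(0.5 * (s_att + c_att).squeeze(1))
```

Compared with v1's `cat+Linear(512→256)`, this keeps the fused representation at 256-d without a projection, letting each stream reweight the other before the residual head stack.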
## 8 Music Descriptor Heads
| # | Head | Type | Output |
|---|---|---|---|
| 1 | tempo_bpm | regression | 45–170 BPM |
| 2 | musical_valence | regression | -1.0 to 1.0 |
| 3 | tonality | 3-class | atonal, major, minor |
| 4 | harmonic_style | 7-class | atonal…whole_tone |
| 5 | dynamic_shape_m4 | 8-class | crescendo…terraced |
| 6 | rhythm_style | 6-class | drive…sparse |
| 7 | texture | 5-class | ambient…solo |
| 8 | orchestration | 14-label | ambient_pad…woodwinds |
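The learned per-instrument thresholds for the 14-label orchestration head (the "14 params" in the v2 column above) might look like the sketch below. The class name, the sigmoid-then-compare decision rule, and the 0.5 initialization (matching v1's fixed threshold) are assumptions for illustration.

```python
import torch
import torch.nn as nn

class OrchestrationHead(nn.Module):
    """14-label multi-label head with a learned decision threshold per label.

    Sketch under assumptions: a linear projection produces per-instrument
    logits; each of the 14 labels has its own learnable threshold applied
    to the sigmoid probabilities at inference time.
    """

    def __init__(self, dim: int = 256, n_labels: int = 14):
        super().__init__()
        self.proj = nn.Linear(dim, n_labels)
        # One threshold per instrument label, initialized at v1's fixed 0.5.
        self.thresholds = nn.Parameter(torch.full((n_labels,), 0.5))

    def forward(self, fused: torch.Tensor) -> torch.Tensor:
        probs = torch.sigmoid(self.proj(fused))
        # Binary mask over the 14 instrument labels: (B, 14).
        return (probs >= self.thresholds).float()
```

During training the thresholds receive gradients only if the comparison is replaced by a differentiable surrogate (e.g. `sigmoid((probs - thresholds) / t)`); the hard comparison shown here is the inference path.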