roformer-models / README.md

AEmotionStudio


metadata

license: mit
tags:
  - audio
  - music
  - source-separation
  - stem-separation
  - roformer
  - safetensors
  - maestraea
pipeline_tag: audio-to-audio

RoFormer Stem Separation Models (Safetensors)

BS-RoFormer & MelBand RoFormer — State-of-the-art music source separation

Pretrained weights converted to safetensors format for use with Mæstræa AI Workstation.

Models

BS-RoFormer (Band-Split RoPE Transformer)

| Variant | SDR (dB) | Task | Path |
| --- | --- | --- | --- |
| Vocals (viperx) | 12.97 | Vocal/instrumental separation | `bs_roformer/vocals_viperx/` |
| Multi-stem | 9.65 | 4-stem (bass/drums/vocals/other) | `bs_roformer/multistem/` |

MelBand RoFormer (Mel-Band RoPE Transformer)

| Variant | SDR (dB) | Task | Path |
| --- | --- | --- | --- |
| Vocals (KimberleyJensen) | 10.98 | Best vocal isolation | `mel_band_roformer/vocals_kj/` |
| Vocals (viperx) | 11.43 | Vocal/instrumental separation | `mel_band_roformer/vocals_viperx/` |
| Dereverb (anvuew) | 19.17 | Remove reverb from audio | `mel_band_roformer/dereverb/` |
| Denoise (aufr33) | 27.99 | Remove noise from audio | `mel_band_roformer/denoise/` |
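The SDR columns are signal-to-distortion ratios in dB. This README does not say which SDR variant or test set produced the figures, so the snippet below is only a minimal sketch of the plain energy-ratio form of the metric; the function name `sdr_db` and the sample values are illustrative.

```python
import math

def sdr_db(reference, estimate, eps=1e-12):
    """Signal-to-distortion ratio in dB for two equal-length sample sequences."""
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    # eps keeps the ratio finite for a silent reference or a perfect estimate.
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))

# Illustrative values: the error carries 1/100 of the reference energy,
# which corresponds to roughly 20 dB.
reference = [1.0, -1.0, 1.0, -1.0]
estimate = [0.9, -0.9, 0.9, -0.9]
print(round(sdr_db(reference, estimate), 2))
```

Higher is better: every additional 10 dB means the residual error energy is ten times smaller relative to the target stem.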

Architecture

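Both model families are RoPE Transformer architectures implemented in lucidrains/BS-RoFormer. They differ in how the spectrogram is split into frequency bands:

BS-RoFormer: Splits the spectrogram into non-overlapping subbands of predefined widths (narrower at low frequencies, wider at high frequencies)
MelBand RoFormer: Splits the spectrogram into overlapping bands spaced on the mel scale (perceptually motivated)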
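The difference between the two band layouts can be sketched in a few lines. This is an illustration only, not the code used by the models: the band widths, band count, and sample rate are made-up values, the function names are hypothetical, and the mel layout uses the common HTK-style mel formula with each band spanning two edge intervals.

```python
import math

def hz_to_mel(hz):
    # HTK-style mel scale.
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def contiguous_bands(widths):
    """Band-split layout: (start, stop) bin ranges laid end to end, no overlap."""
    bands, start = [], 0
    for width in widths:
        bands.append((start, start + width))
        start += width
    return bands

def mel_bands(num_bands, sample_rate=44100):
    """Mel layout: (low_hz, high_hz) ranges with edges evenly spaced in mel."""
    max_mel = hz_to_mel(sample_rate / 2)
    edges = [mel_to_hz(max_mel * i / (num_bands + 1)) for i in range(num_bands + 2)]
    # Band i spans edge i to edge i + 2, so it overlaps both neighbours.
    return [(edges[i], edges[i + 2]) for i in range(num_bands)]

bs = contiguous_bands([2, 2, 4, 8, 16])  # hypothetical widths covering 32 bins
mel = mel_bands(8)                       # hypothetical band count

print(bs)
print([(round(lo), round(hi)) for lo, hi in mel])
```

In the first layout every frequency bin belongs to exactly one band; in the second, bands grow wider toward high frequencies and adjacent bands share part of their range.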

Both significantly outperform HTDemucs on vocal separation tasks.

Usage

Each model directory contains:

model.safetensors — Model weights
config.yaml — Architecture configuration (required for model instantiation)

Requires the bs-roformer Python package: `pip install bs-roformer`
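A minimal loading sketch follows; it is not the loader used by Mæstræa AI Workstation. Several things here are assumptions: the repository id is inferred from the repository and author names, `config.yaml` is assumed to be plain YAML with a top-level `model` mapping whose keys match the constructor's keyword arguments, and `model_class_name` and `load_model` are hypothetical helper names. It also needs `torch`, `safetensors`, `pyyaml`, and `huggingface_hub` installed.

```python
from pathlib import Path

# Assumption: repository id inferred from the repository and author names.
REPO_ID = "AEmotionStudio/roformer-models"

def model_class_name(rel_path):
    """Map a model directory from the tables to a class name in bs_roformer."""
    family = Path(rel_path).parts[0]
    classes = {"bs_roformer": "BSRoformer", "mel_band_roformer": "MelBandRoformer"}
    if family not in classes:
        raise ValueError(f"unknown model family: {family}")
    return classes[family]

def load_model(rel_path):
    """Download one model directory and return the model in eval mode."""
    # Third-party imports are local so the helper above works without them.
    import bs_roformer
    import yaml
    from huggingface_hub import snapshot_download
    from safetensors.torch import load_file

    rel_path = rel_path.strip("/")
    # Fetch only the files under the requested model directory.
    root = Path(snapshot_download(repo_id=REPO_ID, allow_patterns=[f"{rel_path}/*"]))
    model_dir = root / rel_path

    # Assumption: plain YAML; configs with Python-specific tags may need
    # a different loader.
    with open(model_dir / "config.yaml") as f:
        config = yaml.safe_load(f)

    model_cls = getattr(bs_roformer, model_class_name(rel_path))
    # Assumption: config["model"] holds exactly the constructor's keyword arguments.
    model = model_cls(**config["model"])
    model.load_state_dict(load_file(str(model_dir / "model.safetensors")))
    return model.eval()
```

For example, `load_model("mel_band_roformer/vocals_kj/")` would build a `MelBandRoformer` under these assumptions. If the constructor rejects a key from `config.yaml`, the configuration was probably written for a different version of the architecture code and needs adapting before the weights will load.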

Credits

Architecture: lucidrains/BS-RoFormer
Training framework: ZFTurbo/Music-Source-Separation-Training
BS-RoFormer vocals: viperx via TRvlvr
MelBand vocals: KimberleyJensen, viperx
MelBand dereverb: anvuew
MelBand denoise: aufr33
Conversion & Mirror by: AEmotionStudio

License

MIT — same as all upstream model releases.