metadata
license: mit
tags:
  - audio
  - music
  - source-separation
  - stem-separation
  - roformer
  - safetensors
  - maestraea
pipeline_tag: audio-to-audio

RoFormer Stem Separation Models (Safetensors)

BS-RoFormer & MelBand RoFormer — State-of-the-art music source separation

Pretrained weights converted to safetensors format for use with Mæstræa AI Workstation.

Models

BS-RoFormer (Band-Split RoPE Transformer)

| Variant | SDR (dB) | Task | Path |
| --- | --- | --- | --- |
| Vocals (viperx) | 12.97 | Vocal/instrumental separation | bs_roformer/vocals_viperx/ |
| Multi-stem | 9.65 | 4-stem (bass/drums/vocals/other) | bs_roformer/multistem/ |

MelBand RoFormer (Mel-Band RoPE Transformer)

| Variant | SDR (dB) | Task | Path |
| --- | --- | --- | --- |
| Vocals (KimberleyJensen) | 10.98 | Best vocal isolation | mel_band_roformer/vocals_kj/ |
| Vocals (viperx) | 11.43 | Vocal/instrumental separation | mel_band_roformer/vocals_viperx/ |
| Dereverb (anvuew) | 19.17 | Remove reverb from audio | mel_band_roformer/dereverb/ |
| Denoise (aufr33) | 27.99 | Remove noise from audio | mel_band_roformer/denoise/ |
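
The Path column gives each model's directory relative to the repository root. As a minimal sketch, a single directory could be fetched with `snapshot_download` from the `huggingface_hub` package and an `allow_patterns` filter. The helper names `patterns_for` and `fetch_model` are hypothetical, and `repo_id` is left as a caller-supplied argument because this README does not state it.

```python
from pathlib import Path

# Model directories, copied from the Path column of the tables above.
MODEL_DIRS = (
    "bs_roformer/vocals_viperx",
    "bs_roformer/multistem",
    "mel_band_roformer/vocals_kj",
    "mel_band_roformer/vocals_viperx",
    "mel_band_roformer/dereverb",
    "mel_band_roformer/denoise",
)


def patterns_for(model_dir):
    """Return the glob patterns that select one model directory."""
    model_dir = model_dir.strip("/")
    if model_dir not in MODEL_DIRS:
        raise ValueError(f"unknown model directory: {model_dir!r}")
    return [f"{model_dir}/*"]


def fetch_model(repo_id, model_dir):
    """Download one model directory and return its local path."""
    # Imported lazily so the helpers above work without huggingface_hub.
    from huggingface_hub import snapshot_download

    root = snapshot_download(repo_id=repo_id, allow_patterns=patterns_for(model_dir))
    return Path(root) / model_dir.strip("/")


# Usage (repo_id is a placeholder, not a confirmed identifier):
# local_dir = fetch_model("<owner>/roformer-models", "bs_roformer/vocals_viperx/")
```

Filtering by pattern avoids downloading all six checkpoints when only one task is needed.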

Architecture

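Both model families are Transformers with rotary position embeddings (RoPE) applied to a band-split spectrogram, as implemented in lucidrains/BS-RoFormer. They differ in how the bands are defined: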

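  • BS-RoFormer: Splits the spectrogram into non-overlapping subbands of fixed, predefined widths (narrower at low frequencies, wider at high frequencies)
  • MelBand RoFormer: Splits the spectrogram into overlapping bands spaced on the mel scale, which follows perceived pitch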
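
To make the two splitting schemes concrete, the sketch below computes frequency-bin ranges for each. It is an illustration only, not the band computation used by the bs-roformer package: the function names, the bin counts, and the choice of the HTK mel formula are all assumptions made for the example.

```python
import math
from itertools import accumulate


def contiguous_bands(widths):
    """Back-to-back, non-overlapping (start, stop) bin ranges with the given widths."""
    edges = [0, *accumulate(widths)]
    return list(zip(edges[:-1], edges[1:]))


def hz_to_mel(hz):
    # HTK-style mel formula (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_bands(n_bins, n_bands, sample_rate):
    """Overlapping (start, stop) bin ranges whose edges are evenly spaced in mel."""
    nyquist = sample_rate / 2.0
    step = hz_to_mel(nyquist) / (n_bands + 1)
    # n_bands + 2 points, evenly spaced in mel, mapped to frequency-bin indices.
    points = [
        round(mel_to_hz(i * step) / nyquist * (n_bins - 1))
        for i in range(n_bands + 2)
    ]
    # Band i covers points i..i+2 (the support of a triangular mel filter),
    # so it shares bins with band i+1.
    return [(points[i], points[i + 2] + 1) for i in range(n_bands)]


if __name__ == "__main__":
    print(contiguous_bands([2, 2, 4]))  # each band starts where the previous one ends
    print(mel_bands(n_bins=1025, n_bands=8, sample_rate=44100))
```

In the first scheme every bin belongs to exactly one band. In the second, neighbouring bands share bins, and the bands widen as frequency increases.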

Both significantly outperform HTDemucs on vocal separation tasks.

Usage

Each model directory contains:

  • model.safetensors — Model weights
  • config.yaml — Architecture configuration (required for model instantiation)
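
A minimal loading sketch for one model directory follows. It assumes PyYAML, safetensors and PyTorch are installed, and that the constructor arguments sit under a top-level `model` key in config.yaml; that key, the helper names `model_files` and `load_roformer`, and the `model_cls` parameter are hypothetical, so check them against the actual config before relying on this.

```python
from pathlib import Path


def model_files(model_dir):
    """Return (weights_path, config_path), checking that both files exist."""
    model_dir = Path(model_dir)
    weights = model_dir / "model.safetensors"
    config = model_dir / "config.yaml"
    missing = [p.name for p in (weights, config) if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"{model_dir} is missing: {', '.join(missing)}")
    return weights, config


def load_roformer(model_dir, model_cls):
    """Build model_cls from config.yaml and load the safetensors weights."""
    # Imported lazily so model_files() works without these packages.
    import yaml
    from safetensors.torch import load_file

    weights_path, config_path = model_files(model_dir)
    with open(config_path, encoding="utf-8") as f:
        config = yaml.safe_load(f)

    # ASSUMPTION: constructor kwargs live under a "model" key, and every key
    # is accepted by model_cls. Adjust if the config is laid out differently.
    model = model_cls(**config["model"])
    model.load_state_dict(load_file(str(weights_path)))
    return model.eval()


# Usage (class chosen to match the model family of the directory):
# from bs_roformer import BSRoformer
# model = load_roformer("bs_roformer/vocals_viperx", BSRoformer)
```

If `load_state_dict` reports missing or unexpected keys, the config and the installed package version probably disagree about the architecture.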

Requires the bs-roformer Python package: pip install bs-roformer

Credits

License

MIT — same as all upstream model releases.