---
license: mit
tags:
  - audio
  - music
  - source-separation
  - stem-separation
  - roformer
  - safetensors
  - maestraea
pipeline_tag: audio-to-audio
---

# RoFormer Stem Separation Models (Safetensors)

**BS-RoFormer & MelBand RoFormer: state-of-the-art music source separation**

> Pretrained weights converted to safetensors format for use with [Mæstræa AI Workstation](https://github.com/AEmotionStudio/Maestraea).

## Models

### BS-RoFormer (Band-Split RoPE Transformer)

| Variant | SDR (dB) | Task | Path |
|---------|----------|------|------|
| Vocals (viperx) | 12.97 | Vocal/instrumental separation | `bs_roformer/vocals_viperx/` |
| Multi-stem | 9.65 | 4-stem (bass/drums/vocals/other) | `bs_roformer/multistem/` |

### MelBand RoFormer (Mel-Band RoPE Transformer)

| Variant | SDR (dB) | Task | Path |
|---------|----------|------|------|
| Vocals (KimberleyJensen) | 10.98 | Best vocal isolation | `mel_band_roformer/vocals_kj/` |
| Vocals (viperx) | 11.43 | Vocal/instrumental separation | `mel_band_roformer/vocals_viperx/` |
| Dereverb (anvuew) | 19.17 | Remove reverb from audio | `mel_band_roformer/dereverb/` |
| Denoise (aufr33) | 27.99 | Remove noise from audio | `mel_band_roformer/denoise/` |

## Architecture

Both model families are RoPE (rotary position embedding) Transformers that operate on band-split spectrograms, as implemented in [lucidrains/BS-RoFormer](https://github.com/lucidrains/BS-RoFormer):

- **BS-RoFormer**: Splits the spectrogram into non-overlapping frequency subbands with hand-specified widths
- **MelBand RoFormer**: Splits the spectrogram into overlapping bands spaced on the mel scale (perceptually motivated)

Both significantly outperform HTDemucs on vocal separation tasks.
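The two band-splitting schemes listed above differ mainly in whether neighbouring bands overlap and in how band width grows with frequency. The following is a minimal, dependency-free sketch of that contrast, not the band tables used by these checkpoints. It uses equal-width bands as the simplest non-overlapping case and the supports of triangular mel filters for the overlapping case. The function names, the bin count (1025), the band count (8), and the sample rate (44100 Hz) are illustrative assumptions and are not taken from any `config.yaml` in this repository.

```python
import math


def equal_width_bands(num_bins: int, num_bands: int) -> list[tuple[int, int]]:
    """Non-overlapping bands as half-open bin ranges [start, stop)."""
    edges = [round(i * num_bins / num_bands) for i in range(num_bands + 1)]
    return [(edges[i], edges[i + 1]) for i in range(num_bands)]


def hz_to_mel(hz: float) -> float:
    # Common HTK-style mel formula.
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel: float) -> float:
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_bands(num_bins: int, num_bands: int, sample_rate: int) -> list[tuple[int, int]]:
    """Overlapping bands: band i covers the support of the i-th triangular
    mel filter, i.e. the span between mel points i and i + 2."""
    nyquist = sample_rate / 2.0
    max_mel = hz_to_mel(nyquist)
    # num_bands + 2 points equally spaced on the mel scale, mapped to bin indices.
    points = [
        mel_to_hz(max_mel * i / (num_bands + 1)) / nyquist * (num_bins - 1)
        for i in range(num_bands + 2)
    ]
    bands = []
    for i in range(num_bands):
        start = math.floor(points[i])
        stop = min(num_bins, math.ceil(points[i + 2]) + 1)
        bands.append((start, stop))
    return bands


# Illustrative values only (assumed, not read from a model config).
flat = equal_width_bands(num_bins=1025, num_bands=8)
mel = mel_bands(num_bins=1025, num_bands=8, sample_rate=44100)
print("equal-width:", flat)
print("mel-scale:  ", mel)
```

In the mel-scale output, low-frequency bands are narrow and high-frequency bands are wide, and each band shares bins with its neighbour. The equal-width bands tile the spectrum exactly once.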
## Usage

Each model directory contains:

- `model.safetensors`: Model weights
- `config.yaml`: Architecture configuration (required for model instantiation)

Requires the `bs-roformer` Python package: `pip install bs-roformer`

## Credits

- **Architecture**: [lucidrains/BS-RoFormer](https://github.com/lucidrains/BS-RoFormer)
- **Training framework**: [ZFTurbo/Music-Source-Separation-Training](https://github.com/ZFTurbo/Music-Source-Separation-Training)
- **BS-RoFormer vocals**: [viperx](https://github.com/playdasegunda) via [TRvlvr](https://github.com/TRvlvr/model_repo)
- **MelBand vocals**: [KimberleyJensen](https://github.com/KimberleyJensen), [viperx](https://github.com/playdasegunda)
- **MelBand dereverb**: [anvuew](https://github.com/anvuew)
- **MelBand denoise**: [aufr33](https://github.com/aufr33)
- **Conversion & Mirror by**: [AEmotionStudio](https://huggingface.co/AEmotionStudio)

## License

MIT, same as all upstream model releases.
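
## Example: inspecting a checkpoint

As a companion to the Usage section, it can help to check which tensors a downloaded `model.safetensors` contains before instantiating a model from its `config.yaml`. The sketch below reads only the file header, using the layout documented for the safetensors format: an 8-byte little-endian length followed by a JSON header. It needs nothing beyond the Python standard library. The helper names `read_safetensors_header` and `summarize_tensors` are hypothetical and are not part of `bs-roformer` or `safetensors`.

```python
import json
import struct


def read_safetensors_header(path: str) -> dict:
    """Return the JSON header of a safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: header size as an unsigned little-endian 64-bit integer.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize_tensors(header: dict) -> dict:
    """Map tensor name -> (dtype, shape), skipping the optional __metadata__ entry."""
    return {
        name: (info["dtype"], tuple(info["shape"]))
        for name, info in header.items()
        if name != "__metadata__"
    }


# Example call, assuming the repository has been downloaded locally:
# header = read_safetensors_header("mel_band_roformer/vocals_kj/model.safetensors")
# for name, (dtype, shape) in summarize_tensors(header).items():
#     print(name, dtype, shape)
```

To load the actual weights into a PyTorch model, the usual route is `safetensors.torch.load_file` followed by `load_state_dict` on a model built from the matching `config.yaml`. The exact constructor arguments depend on that config and are not shown here.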