Hybrid Transformers for Music Source Separation
Paper: arXiv:2211.08553
4/6-Stem Source Separation — Vocals, Drums, Bass, Other (+Guitar, Piano)
Original Source by Facebook Research · MIT License
Converted from the original `.th` checkpoint format to safetensors for faster loading and safer deserialization. For use with Mæstræa AI Workstation.
| File | Stems | Size | Description |
|---|---|---|---|
| htdemucs.safetensors | 4 (drums, bass, other, vocals) | 84 MB | Base model |
| htdemucs_ft.safetensors | 4 (drums, bass, other, vocals) | 84 MB | Fine-tuned (best quality) ⭐ |
| htdemucs_6s.safetensors | 6 (drums, bass, other, vocals, guitar, piano) | 55 MB | 6-stem variant |
Each model has a matching *_config.json with architecture parameters (sources, sample rate, channels).
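As a rough illustration of how a loader might consume one of these config files, the sketch below reads the JSON and fails early if an expected field is absent. The key names (`sources`, `samplerate`, `audio_channels`) and the helper `load_config` are assumptions made for this example; the keys in the shipped `*_config.json` files may differ.

```python
import json
from pathlib import Path

# Hypothetical key names; check the shipped *_config.json for the real ones.
REQUIRED_KEYS = ("sources", "samplerate", "audio_channels")


def load_config(path):
    """Read a model config and fail early if an expected key is missing."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"config {path} is missing keys: {missing}")
    return config
```

Checking the config before loading weights keeps a mismatched or truncated download from surfacing later as an opaque tensor-shape error.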
HTDemucs (Hybrid Transformer Demucs) separates mixed audio into individual stems: drums, bass, other, and vocals, plus guitar and piano in the 6-stem variant.
These safetensors were converted from:
| Model | Original URL |
|---|---|
| htdemucs | https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/955717e8-8726e21a.th |
| htdemucs_ft | https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/04573f0d-f3cf25b2.th |
| htdemucs_6s | https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/5c90dfd2-34c22ccb.th |
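The original file names appear to follow the `<signature>-<checksum>.th` pattern, where the suffix is the leading hex digits of the file's SHA-256 digest. Assuming that convention holds for these files, a downloaded checkpoint can be checked before conversion. The helper names below are illustrative and not part of any library.

```python
import hashlib
from pathlib import Path


def expected_prefix(filename):
    """Return the checksum part of '<signature>-<checksum>.th'."""
    stem = Path(filename).stem  # e.g. '955717e8-8726e21a'
    _signature, _, checksum = stem.partition("-")
    return checksum


def verify_checkpoint(path):
    """Compare the file's SHA-256 prefix with the one in its name."""
    sha = hashlib.sha256()
    with open(path, "rb") as handle:
        # Hash in 1 MiB chunks so large checkpoints are not read at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            sha.update(chunk)
    prefix = expected_prefix(Path(path).name)
    return bool(prefix) and sha.hexdigest().startswith(prefix)
```

A file whose name carries no checksum suffix is reported as unverified rather than silently accepted.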
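These models are downloaded automatically by the Mæstræa AI Workstation backend. The same models can also be loaded by name with the demucs library:

```python
from demucs.pretrained import get_model

# Loads the fine-tuned 4-stem model by name (demucs fetches its own copy of the weights)
model = get_model("htdemucs_ft")
```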
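To go one step beyond loading, the following sketch runs a model on an in-memory waveform with `demucs.apply.apply_model`. It assumes `demucs` and `torch` are installed and that the input already matches the model's sample rate and channel count; the helper names `label_stems` and `separate` are illustrative.

```python
def label_stems(sources, separated):
    """Pair each stem name with its separated waveform."""
    return dict(zip(sources, separated))


def separate(mix, model_name="htdemucs_ft", device="cpu"):
    """Split `mix`, a float tensor shaped (channels, samples), into stems.

    Assumes `mix` already matches model.samplerate and model.audio_channels.
    """
    # Imported lazily so label_stems stays usable without these packages.
    import torch
    from demucs.apply import apply_model
    from demucs.pretrained import get_model

    model = get_model(model_name)  # downloads weights on first use
    model.eval()
    with torch.no_grad():
        # apply_model takes (batch, channels, samples) and returns
        # (batch, stems, channels, samples).
        out = apply_model(model, mix[None], device=device)
    return label_stems(model.sources, out[0])
```

Calling `separate(torch.zeros(2, 44100 * 5))`, for instance, would pass five seconds of stereo silence (assuming a 44.1 kHz model) and return a dict keyed by stem name. Real audio would normally be loaded and resampled to `model.samplerate` first.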
License: MIT, same as the original Demucs release.