Add README
README.md
ADDED
@@ -0,0 +1,77 @@
---
license: mit
tags:
- audio
- audio-separation
- stem-separation
- demucs
- htdemucs
- safetensors
- maestraea
pipeline_tag: audio-to-audio
---

# HTDemucs Models (Safetensors)

**4/6-Stem Source Separation: Vocals, Drums, Bass, Other (+Guitar, Piano)**

[Original Source](https://github.com/facebookresearch/demucs) by [Facebook Research](https://github.com/facebookresearch) · MIT License

> Converted from the original `.th` checkpoint format to safetensors for faster loading and safer deserialization. For use with [Mæstræa AI Workstation](https://github.com/AEmotionStudio/Maestraea).

## Available Models

| File | Stems | Size | Description |
|------|-------|------|-------------|
| `htdemucs.safetensors` | 4 (drums, bass, other, vocals) | 84 MB | Base model |
| `htdemucs_ft.safetensors` | 4 (drums, bass, other, vocals) | 84 MB | **Fine-tuned**, best quality ⭐ |
| `htdemucs_6s.safetensors` | 6 (drums, bass, other, vocals, guitar, piano) | 55 MB | 6-stem variant |

Each model has a matching `*_config.json` with architecture parameters (sources, sample rate, channels).
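
As a rough illustration of how a weights file and its config pair up, the sketch below derives the config path from a weights path and summarizes the stem list. The key names `sources`, `samplerate`, and `audio_channels` are assumptions based on the parameters listed above, not a confirmed schema, and the helper names are hypothetical; check the actual JSON before relying on them.

```python
import json
from pathlib import Path


def config_path_for(weights_path):
    """Return the sibling `<name>_config.json` path for a `.safetensors` file."""
    weights = Path(weights_path)
    return weights.with_name(weights.stem + "_config.json")


def load_config(weights_path):
    """Read the JSON config that sits next to the given weights file."""
    with open(config_path_for(weights_path), encoding="utf-8") as f:
        return json.load(f)


def describe(config):
    """Summarize a config dict. The key names used here are assumed, not confirmed."""
    sources = config.get("sources", [])
    rate = config.get("samplerate")
    channels = config.get("audio_channels")
    return f"{len(sources)} stems ({', '.join(sources)}) at {rate} Hz, {channels} channels"
```

Calling `describe(load_config("htdemucs_6s.safetensors"))` would then give a one-line summary, provided the config really uses those keys.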

## What HTDemucs Does

HTDemucs (Hybrid Transformer Demucs) separates mixed audio into individual stems:

- **Vocals**: singing, spoken word
- **Drums**: percussion, kick, snare, hi-hat
- **Bass**: bass guitar, synth bass
- **Other**: everything else (keys, synths, FX)
- **Guitar**: 6-stem model only
- **Piano**: 6-stem model only

### Key Features

- Real-time capable on GPU
- Adjustable segment size for VRAM control
- Best separation quality of the three models with `htdemucs_ft`
- ~4–6 GB VRAM
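
To make the segment-size point concrete, here is a minimal sketch of how a long track can be cut into overlapping chunks so that only one chunk needs to be processed at a time. It shows the general overlap-and-stride idea only and is not the `demucs` implementation; `plan_segments` and its parameters are hypothetical names chosen for illustration.

```python
def plan_segments(num_samples, samplerate, segment_seconds, overlap=0.25):
    """Return (start, end) sample ranges that cover a track in overlapping chunks."""
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    segment_length = int(samplerate * segment_seconds)
    if segment_length <= 0:
        raise ValueError("segment_seconds is too small for this sample rate")
    # Consecutive chunks start `stride` samples apart, so they share
    # roughly `overlap * segment_length` samples.
    stride = max(1, int((1 - overlap) * segment_length))
    return [
        (start, min(start + segment_length, num_samples))
        for start in range(0, num_samples, stride)
    ]
```

Shorter segments mean each chunk holds fewer samples, which lowers peak memory at the cost of more chunks to process and blend.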

## Original Checkpoint URLs

These safetensors files were converted from the following checkpoints:

| Model | Original URL |
|-------|-------------|
| htdemucs | `https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/955717e8-8726e21a.th` |
| htdemucs_ft | `https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/04573f0d-f3cf25b2.th` |
| htdemucs_6s | `https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/5c90dfd2-34c22ccb.th` |
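
If you download one of the original checkpoints to reproduce the conversion, it may be worth checking the file first. The sketch below assumes each filename follows a `<signature>-<checksum>.th` pattern and that the suffix is the leading hex digits of the file's SHA-256 digest. Both points are assumptions here, not something this README states, and the function names are hypothetical.

```python
import hashlib
from pathlib import PurePosixPath
from urllib.parse import urlparse


def split_checkpoint_name(url):
    """Split '<signature>-<checksum>.th' from a checkpoint URL into its two parts."""
    stem = PurePosixPath(urlparse(url).path).stem
    signature, _, checksum = stem.partition("-")
    return signature, checksum


def matches_checksum(path, checksum, chunk_size=1 << 20):
    """Check whether the file's SHA-256 hex digest starts with `checksum` (assumed scheme)."""
    sha = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in chunks so large checkpoints are never fully loaded into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest()[: len(checksum)] == checksum
```

A mismatch under this assumed scheme would suggest either a corrupted download or that the naming convention differs from what the sketch expects.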

## Usage with Mæstræa

The Mæstræa AI Workstation backend downloads these models automatically. The same model names can also be loaded through the `demucs` library, which fetches the original checkpoints rather than these safetensors files:

```python
from demucs.pretrained import get_model
# Load the fine-tuned 4-stem model by name.
model = get_model("htdemucs_ft")
```

## License

MIT, same as the original Demucs release.

## Credits

- **Model**: [Facebook Research / Meta AI](https://github.com/facebookresearch/demucs)
- **Paper**: [Hybrid Transformers for Music Source Separation](https://arxiv.org/abs/2211.08553) (Rouard et al., 2023)
- **Conversion & Mirror by**: [AEmotionStudio](https://huggingface.co/AEmotionStudio)