AEmotionStudio committed
Commit e6fd658 · verified · 1 parent: 7ccb25d

Add README

  1. README.md +77 -0
README.md ADDED
@@ -0,0 +1,77 @@
+ ---
+ license: mit
+ tags:
+ - audio
+ - audio-separation
+ - stem-separation
+ - demucs
+ - htdemucs
+ - safetensors
+ - maestraea
+ pipeline_tag: audio-to-audio
+ ---
+
+ # HTDemucs Models (Safetensors)
+
+ **4/6-Stem Source Separation: Vocals, Drums, Bass, Other (+Guitar, Piano)**
+
+ [Original Source](https://github.com/facebookresearch/demucs) by [Facebook Research](https://github.com/facebookresearch) · MIT License
+
+ > Converted from the original pickle-based `.th` checkpoint format to safetensors, which loads faster and does not execute arbitrary code during deserialization. For use with [Mæstræa AI Workstation](https://github.com/AEmotionStudio/Maestraea).
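
To illustrate why safetensors deserialization is considered safer, here is a minimal sketch that inspects a file using only the Python standard library. It relies on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header). The helper names `read_safetensors_header` and `tensor_shapes` are hypothetical and are not part of any library.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file without loading tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: header length as an unsigned little-endian 64-bit integer.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # Next header_len bytes: UTF-8 JSON describing every stored tensor.
        return json.loads(f.read(header_len).decode("utf-8"))


def tensor_shapes(header):
    """Map tensor name -> shape, skipping the optional __metadata__ entry."""
    return {
        name: info["shape"]
        for name, info in header.items()
        if name != "__metadata__"
    }
```

Calling `tensor_shapes(read_safetensors_header("htdemucs.safetensors"))` on a downloaded file should list the stored tensors and their shapes. Nothing in this path unpickles objects, so no code embedded in the file can run.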
+
+ ## Available Models
+
+ | File | Stems | Size | Description |
+ |------|-------|------|-------------|
+ | `htdemucs.safetensors` | 4 (drums, bass, other, vocals) | 84 MB | Base model |
+ | `htdemucs_ft.safetensors` | 4 (drums, bass, other, vocals) | 84 MB | **Fine-tuned**, best quality ⭐ |
+ | `htdemucs_6s.safetensors` | 6 (drums, bass, other, vocals, guitar, piano) | 55 MB | 6-stem variant |
+
+ Each model ships with a matching `*_config.json` file that records its architecture parameters (sources, sample rate, channels).
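
As a rough sketch of how a weights file can be paired with its config, the helpers below derive the config path from the weights path and parse it. They assume the `*_config.json` pattern means `htdemucs_ft.safetensors` sits next to `htdemucs_ft_config.json`. The function names are hypothetical, and the JSON key names are not documented here, so inspect the parsed dictionary before relying on any particular key.

```python
import json
from pathlib import Path


def config_path_for(weights_path):
    """Return the assumed config path that sits next to a weights file."""
    p = Path(weights_path)
    # Assumption: "<name>.safetensors" pairs with "<name>_config.json".
    return p.with_name(p.stem + "_config.json")


def load_config(weights_path):
    """Parse the JSON config that accompanies a weights file."""
    with open(config_path_for(weights_path), encoding="utf-8") as f:
        return json.load(f)
```

For example, `load_config("models/htdemucs_6s.safetensors")` would read `models/htdemucs_6s_config.json` under this naming assumption.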
+
+ ## What HTDemucs Does
+
+ HTDemucs (Hybrid Transformer Demucs) separates mixed audio into individual stems:
+
+ - **Vocals**: singing, spoken word
+ - **Drums**: percussion, kick, snare, hi-hat
+ - **Bass**: bass guitar, synth bass
+ - **Other**: everything else (keys, synths, FX)
+ - **Guitar** (6-stem model only)
+ - **Piano** (6-stem model only)
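
The stems above can be attached to a model's output by name. The sketch below assumes the output is a sequence with one entry per stem, in the order given in the Available Models table. The `STEMS` table and `label_stems` helper are hypothetical; in practice the authoritative order is the source list stored with each model.

```python
# Stem order as listed in this README's model table (an assumption; prefer
# the source list stored in the model's config when it is available).
STEMS = {
    "htdemucs": ["drums", "bass", "other", "vocals"],
    "htdemucs_ft": ["drums", "bass", "other", "vocals"],
    "htdemucs_6s": ["drums", "bass", "other", "vocals", "guitar", "piano"],
}


def label_stems(model_name, separated):
    """Pair each separated track with its stem name."""
    names = STEMS[model_name]
    if len(separated) != len(names):
        raise ValueError(
            f"{model_name} expects {len(names)} stems, got {len(separated)}"
        )
    return dict(zip(names, separated))
```

With a 6-stem result, `label_stems("htdemucs_6s", tracks)["piano"]` returns the last track. The length check catches the common mistake of pairing a 4-stem output with the 6-stem name list.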
+
+ ### Key Features
+
+ - Real-time capable on GPU
+ - Adjustable segment size to control VRAM use
+ - Best-in-class separation quality (`htdemucs_ft`)
+ - Needs roughly 4–6 GB of VRAM
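
To show why segment size affects VRAM, here is a simplified sketch of splitting a track into overlapping chunks so that peak memory depends on the chunk length rather than the track length. It is an illustration, not the Demucs implementation. The `chunk_bounds` helper, the 44100 Hz default, and the 0.25 overlap are assumptions chosen for the example.

```python
def chunk_bounds(total_samples, segment_seconds, sample_rate=44100, overlap=0.25):
    """Return (start, end) sample indices of overlapping chunks covering a track."""
    segment = int(segment_seconds * sample_rate)
    # Consecutive chunks start one stride apart, so they share `overlap` of a segment.
    stride = int((1 - overlap) * segment)
    if segment <= 0 or stride <= 0:
        raise ValueError("segment must be positive and overlap must be below 1")
    return [
        (start, min(start + segment, total_samples))
        for start in range(0, total_samples, stride)
    ]
```

A smaller `segment_seconds` produces more, shorter chunks, which lowers peak memory at the cost of more model calls. The overlapping regions would normally be blended when the chunks are stitched back together.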
+
+ ## Original Checkpoint URLs
+
+ These safetensors were converted from:
+
+ | Model | Original URL |
+ |-------|-------------|
+ | htdemucs | `https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/955717e8-8726e21a.th` |
+ | htdemucs_ft | `https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/04573f0d-f3cf25b2.th` |
+ | htdemucs_6s | `https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/5c90dfd2-34c22ccb.th` |
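
If you download one of the original checkpoints to reproduce the conversion, a quick integrity check may be useful. The sketch below assumes the part of the file name after the hyphen (for example `8726e21a`) is a prefix of the file's SHA-256 digest, which matches the naming convention PyTorch uses for hashed checkpoint downloads. Treat that as an assumption to verify; `matches_name_checksum` is a hypothetical helper.

```python
import hashlib
from pathlib import Path


def matches_name_checksum(path):
    """Check a '<signature>-<checksum>.th' file against its own name."""
    p = Path(path)
    _signature, _, expected = p.stem.partition("-")
    if not expected:
        raise ValueError(f"no checksum suffix in file name: {p.name}")
    sha = hashlib.sha256()
    with open(p, "rb") as f:
        # Hash in 1 MiB blocks so large checkpoints are not read into memory at once.
        for block in iter(lambda: f.read(1 << 20), b""):
            sha.update(block)
    # Assumption: the suffix is a truncated SHA-256 hex digest.
    return sha.hexdigest()[: len(expected)] == expected
```

A `False` result suggests a truncated or corrupted download, or that the naming assumption does not hold for that file.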
+
+ ## Usage with Mæstræa
+
+ The Mæstræa AI Workstation backend downloads these models automatically. The same models can also be loaded with the `demucs` library, which fetches the original checkpoints rather than these safetensors files:
+
+ ```python
+ # Requires the `demucs` package; weights are downloaded on first use.
+ from demucs.pretrained import get_model
+ model = get_model("htdemucs_ft")
+ ```
+
+ ## License
+
+ MIT, same as the original Demucs release.
+
+ ## Credits
+
+ - **Model**: [Facebook Research / Meta AI](https://github.com/facebookresearch/demucs)
+ - **Paper**: [Hybrid Transformers for Music Source Separation](https://arxiv.org/abs/2211.08553) (Rouard et al., 2023)
+ - **Conversion & Mirror by**: [AEmotionStudio](https://huggingface.co/AEmotionStudio)