AEmotionStudio committed on
Commit
815570d
·
verified ·
1 Parent(s): 251871f

Add README for Mæstræa mirror

Files changed (1)
  1. README.md +106 -0
README.md ADDED
@@ -0,0 +1,106 @@
+ ---
+ license: other
+ license_name: stability-ai-community
+ license_link: LICENSE.md
+ tags:
+ - audio
+ - text-to-audio
+ - sound-effects
+ - ambient
+ - diffusion
+ - stable-audio
+ - safetensors
+ - maestraea
+ pipeline_tag: text-to-audio
+ base_model: stabilityai/stable-audio-open-1.0
+ ---
+
+ # Stable Audio Open 1.0 (Mæstræa Mirror)
+
+ **Text-to-Audio SFX & Ambient Textures: Up to 47 s Stereo @ 44.1 kHz**
+
+ [Original Model](https://huggingface.co/stabilityai/stable-audio-open-1.0) by [Stability AI](https://stability.ai/) · Stability AI Community License
+
+ > This is an **ungated mirror** of the Stable Audio Open 1.0 model weights for use with [Mæstræa AI Workstation](https://github.com/AEmotionStudio/Maestraea). Only safetensors-format weights are included; the legacy `.ckpt` files have been removed. All credit goes to the original authors.
+
+ ## What's in This Repo
+
+ | Path | Description | Size |
+ |------|-------------|------|
+ | `model.safetensors` | Main model checkpoint | ~3 GB |
+ | `transformer/diffusion_pytorch_model.safetensors` | DiT transformer | ~1.5 GB |
+ | `text_encoder/model.safetensors` | T5 text encoder | ~1.2 GB |
+ | `vae/diffusion_pytorch_model.safetensors` | VAE decoder | ~150 MB |
+ | `projection_model/diffusion_pytorch_model.safetensors` | Projection model | ~50 MB |
+ | `tokenizer/` | T5 tokenizer files | < 10 MB |
+ | `model_config.json` | Model architecture config | < 1 KB |
+ | `model_index.json` | Diffusers pipeline index | < 1 KB |
+ | `scheduler/` | Scheduler config | < 1 KB |
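
As a quick sanity check after downloading, the sketch below lists which of the paths from the table are absent from a local copy of the repo. The helper name `missing_entries` and the directory passed to it are illustrative assumptions, not part of Mæstræa or any library.

```python
from pathlib import Path

# Paths copied from the table above; entries ending in "/" are directories.
EXPECTED = [
    "model.safetensors",
    "transformer/diffusion_pytorch_model.safetensors",
    "text_encoder/model.safetensors",
    "vae/diffusion_pytorch_model.safetensors",
    "projection_model/diffusion_pytorch_model.safetensors",
    "tokenizer/",
    "model_config.json",
    "model_index.json",
    "scheduler/",
]


def missing_entries(root):
    """Return the expected paths that do not exist under `root` (hypothetical helper)."""
    root = Path(root)
    return [entry for entry in EXPECTED if not (root / entry).exists()]


# Example: check the current directory; an empty list means nothing is missing.
print(missing_entries("."))
```

An empty list only confirms that the files are present; it does not verify their sizes or integrity.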
+
+ ## What Stable Audio Open Does
+
+ Stable Audio Open generates stereo audio at 44.1 kHz from text prompts. It excels at:
+
+ - **Sound effects**: Foley, impacts, transitions
+ - **Ambient textures**: Rain, wind, crowds, environments
+ - **Musical textures**: Pads, drones, atmospheric sounds
+ - **Audio scenes**: Complex layered soundscapes
+
+ Each generation produces up to 47 seconds of stereo audio.
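
To put the duration limit in concrete terms, here is a minimal sketch that converts a requested length into sample frames and into the size of the clip if it were stored as 16-bit PCM. It uses only the figures stated above (44.1 kHz, stereo, 47 seconds); the function names are illustrative and the 16-bit storage format is an assumption about how you save the output.

```python
SAMPLE_RATE_HZ = 44_100  # output sample rate stated above
CHANNELS = 2             # stereo
MAX_SECONDS = 47.0       # maximum duration stated above


def frames_for(seconds, sample_rate=SAMPLE_RATE_HZ):
    """Number of sample frames (one frame = one sample per channel)."""
    if not 0 < seconds <= MAX_SECONDS:
        raise ValueError(f"duration must be in (0, {MAX_SECONDS}] seconds")
    return round(seconds * sample_rate)


def pcm16_bytes(seconds, channels=CHANNELS):
    """Raw size of the clip when stored as 16-bit PCM (2 bytes per sample)."""
    return frames_for(seconds) * channels * 2


print(frames_for(47.0))   # frames in a full-length clip
print(pcm16_bytes(47.0))  # raw bytes for the same clip as 16-bit stereo PCM
```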
+
+ ### What It's NOT Good At
+
+ - Full songs with vocals
+ - High-fidelity musical instruments (use Foundation-1 for that)
+ - Speech synthesis
+
+ ### VRAM Requirements
+
+ - **Minimum**: ~4 GB (FP16)
+ - **Recommended**: ~7 GB (FP16, longer durations)
+
+ ## Usage with Mæstræa
+
+ The Mæstræa AI Workstation backend downloads these weights automatically.
+
+ ### Direct Usage (diffusers)
+
+ ```python
+ from diffusers import StableAudioPipeline
+ import torch
+
+ # Load the mirrored weights in half precision and move the pipeline to the GPU.
+ pipe = StableAudioPipeline.from_pretrained(
+     "AEmotionStudio/stable-audio-open-models",
+     torch_dtype=torch.float16,
+ ).to("cuda")
+
+ # Generate 10 seconds of audio; `.audios[0]` selects the first clip in the batch.
+ audio = pipe(
+     prompt="Thunderstorm with heavy rain and distant rolling thunder",
+     negative_prompt="low quality, distorted",
+     audio_end_in_s=10.0,
+     num_inference_steps=100,
+ ).audios[0]
+ ```
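
The snippet above stops at an in-memory result. One way to save it, sketched here with only the Python standard library, is to write the samples as a 16-bit PCM WAV file. The helper `write_wav_pcm16` is hypothetical, and it assumes the audio has already been converted to channel-major lists of floats in the range [-1.0, 1.0]; a short synthetic tone stands in for model output.

```python
import math
import struct
import wave


def write_wav_pcm16(path, channels, sample_rate=44_100):
    """Write channel-major float samples in [-1.0, 1.0] as a 16-bit PCM WAV."""
    n_frames = len(channels[0])
    if any(len(ch) != n_frames for ch in channels):
        raise ValueError("all channels must have the same number of samples")

    def to_int16(x):
        # Clip to the valid range, then scale to a signed 16-bit integer.
        return int(max(-1.0, min(1.0, x)) * 32767)

    # WAV stores frames interleaved: L0, R0, L1, R1, ...
    interleaved = [to_int16(ch[i]) for i in range(n_frames) for ch in channels]
    with wave.open(path, "wb") as wav:
        wav.setnchannels(len(channels))
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(interleaved)}h", *interleaved))


# Stand-in data: 0.1 s of a 440 Hz tone on both channels.
tone = [0.5 * math.sin(2 * math.pi * 440 * t / 44_100) for t in range(4_410)]
write_wav_pcm16("demo.wav", [tone, tone])
```

If `audio` from the pipeline is a tensor of shape `(channels, samples)`, then `audio.float().cpu().tolist()` should produce the nested list this helper expects. For full-length clips, an array-based writer such as the `soundfile` package is likely to be faster than this pure-Python loop.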
+
+ ### Using stable-audio-tools
+
+ ```python
+ from stable_audio_tools import get_pretrained_model
+ model, model_config = get_pretrained_model("AEmotionStudio/stable-audio-open-models")
+ ```
+
+ ## License
+
+ **Stability AI Community License**: see [LICENSE.md](LICENSE.md) for full terms.
+
+ Key points:
+ - Free for research and non-commercial use
+ - Commercial use is allowed with annual revenue under $1M; above that, a separate license from Stability AI is required
+ - Model outputs cannot be used to train competing models
+
+ ## Credits
+
+ - **Model**: [Stability AI](https://stability.ai/)
+ - **Paper**: [Stable Audio Open](https://stability.ai/research/stable-audio-open)
+ - **Training Data**: Freesound and Free Music Archive (see attribution CSVs)
+ - **Mirror by**: [AEmotionStudio](https://huggingface.co/AEmotionStudio)