---
title: Stable Audio 3 Lab
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 6.3.0
app_file: app.py
python_version: "3.10"
suggested_hardware: zero-a10g
pinned: false
license: mit
hf_oauth: true
hf_oauth_scopes:
- gated-repos
---
# Stable Audio 3 Lab
Gradio Space for testing Stability AI's Stable Audio 3 collections:
- Standard collection: `stabilityai/stable-audio-3-small-music`, `stabilityai/stable-audio-3-small-sfx`, `stabilityai/stable-audio-3-medium`
- Extra collection generation checkpoints: `small-music-base`, `small-sfx-base`, `medium-base`
- Extra collection autoencoders: `SAME-S`, `SAME-L`
The optimized repo (`stabilityai/stable-audio-3-optimized`) currently ships MLX, ONNX, and TensorRT assets rather than a generic `model_config.json` + `model.safetensors` checkpoint. This Space lists it in Coverage but does not run it through the PyTorch `stable_audio_3` path.
## Access
This Space requires Hugging Face authentication. Users can either sign in with
Hugging Face OAuth or paste a Hugging Face access token into the password field.
A pasted token is used only for the request it accompanies and is never included
in the run metadata.
The post-trained Stable Audio 3 checkpoints are gated on Hugging Face, so each
user must:
1. Sign in with Hugging Face, or paste a read token from their own Hugging Face account.
2. Accept the terms on each gated model page from that same account.
Base checkpoints are not gated, but they are intended mainly for fine-tuning and may not sound as polished.
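
The two authentication routes in this section reduce to a single token lookup per request. The following is a minimal sketch, not this Space's actual code: `resolve_token` and `public_metadata` are hypothetical helpers, and the choice to prefer a pasted token over the OAuth token is an assumption made for illustration.

```python
from typing import Optional


def resolve_token(oauth_token: Optional[str], pasted_token: Optional[str]) -> str:
    """Pick the Hugging Face token to use for one request.

    Hypothetical helper: prefers an explicitly pasted token, then falls back
    to the OAuth token. The precedence is an assumption, not documented behavior.
    """
    for candidate in (pasted_token, oauth_token):
        if candidate and candidate.strip():
            return candidate.strip()
    raise PermissionError(
        "Sign in with Hugging Face or paste a read token to use gated models."
    )


def public_metadata(model_id: str, seed: int) -> dict:
    """Build run metadata that deliberately carries no token field."""
    return {"model_id": model_id, "seed": seed}
```

In a Gradio handler, the OAuth value would typically come from a parameter annotated as `gr.OAuthToken | None`, whose `.token` attribute holds the access token when the user is signed in.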
## Hardware
- ZeroGPU is enabled through the `spaces.GPU` decorator on generation and autoencoder actions.
- Small models can run on CPU, but GPU is still preferred.
- Medium and Medium Base are GPU-first.
- `SAME-L` is GPU-first; `SAME-S` can be used for CPU autoencoder round trips.
The Space is configured with `suggested_hardware: zero-a10g`.
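
For readers unfamiliar with ZeroGPU, the sketch below shows how a `spaces.GPU`-decorated action is usually written so that the same file also runs on machines without the `spaces` package. The fallback decorator and the `generate` function body are illustrative assumptions, not code taken from this Space's `app.py`.

```python
try:
    import spaces  # provided on Hugging Face ZeroGPU Spaces

    gpu = spaces.GPU
except ImportError:

    def gpu(fn=None, **_kwargs):
        """No-op stand-in so the module still imports without `spaces`."""
        if fn is None:
            # Called with arguments, e.g. @gpu(duration=60): return a decorator.
            return lambda f: f
        # Called bare, e.g. @gpu: return the function unchanged.
        return fn


@gpu
def generate(prompt: str, seconds: float = 10.0) -> dict:
    # Placeholder body: a real action would load the selected model and
    # synthesize audio here. This sketch only echoes the request.
    return {"prompt": prompt, "seconds": seconds}
```

On ZeroGPU, a GPU is attached only while the decorated function is running, which is why both generation and autoencoder actions need the decorator.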
## Runtime note
The upstream `stable-audio-3` Python package is vendored in this Space from
Stability AI's public MIT-licensed repository because its package metadata pins
Torch 2.7.1. ZeroGPU currently provides Torch 2.8.0, so installing the upstream
package through normal dependency resolution would downgrade Torch and break the
ZeroGPU runtime.
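
The conflict described above can be made explicit with a small startup check. This is a sketch under stated assumptions: `pin_conflicts` and `installed_version` are hypothetical helpers, and comparing only the public part of the version string (ignoring local suffixes such as `+cu126`) is a simplification.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


def pin_conflicts(pinned: str, installed: str) -> bool:
    """Return True when an exact pin differs from the installed version.

    Local version suffixes (the part after '+') are ignored on both sides.
    """
    def public(version: str) -> str:
        return version.split("+", 1)[0]

    return public(pinned) != public(installed)
```

With the versions named in this note, `pin_conflicts("2.7.1", "2.8.0")` is true, which is the situation that vendoring avoids: the vendored copy is imported directly, so its Torch pin never reaches the dependency resolver.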
## Optimization notes
- Repeated runs with the same selected model reuse the loaded model inside the
ZeroGPU worker when the worker stays warm. Run metadata includes `cache_hit`
and `load_elapsed_s` so this is visible.
- Successful gated-repo access checks are cached briefly inside the worker per
token digest and repo ID to avoid a Hugging Face `HEAD` request on every
generation.
- The `stable-audio-3-optimized` repo currently provides MLX, ONNX, and
TensorRT assets. This Space keeps the portable PyTorch path because the
TensorRT engines are prebuilt for `sm_90`, while the current ZeroGPU host is
a Blackwell GPU, and MLX is Apple-only.
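
The first two notes describe two in-worker caches. The sketch below illustrates one way they could work, under several assumptions: all names (`get_model`, `has_access`, `ACCESS_TTL_S`) are hypothetical, only one model is kept resident at a time, the TTL value is invented (the note above only says "briefly"), and the actual Hugging Face access check is passed in as a callable rather than implemented here.

```python
import hashlib
import time
from typing import Any, Callable, Dict, Optional, Tuple

_MODEL_CACHE: Dict[str, Any] = {}
_ACCESS_CACHE: Dict[Tuple[str, str], float] = {}
ACCESS_TTL_S = 300.0  # assumed value, not taken from this Space


def get_model(model_id: str, loader: Callable[[str], Any]) -> Tuple[Any, dict]:
    """Return the model plus the cache fields reported in run metadata."""
    start = time.perf_counter()
    cache_hit = model_id in _MODEL_CACHE
    if not cache_hit:
        # Assumption: keep a single model resident, dropping the previous one.
        _MODEL_CACHE.clear()
        _MODEL_CACHE[model_id] = loader(model_id)
    meta = {
        "cache_hit": cache_hit,
        "load_elapsed_s": round(time.perf_counter() - start, 3),
    }
    return _MODEL_CACHE[model_id], meta


def has_access(
    token: str,
    repo_id: str,
    check: Callable[[str, str], bool],
    now: Optional[float] = None,
) -> bool:
    """Cache successful access checks per (token digest, repo ID)."""
    now = time.monotonic() if now is None else now
    # Store a digest so the raw token is never used as a cache key.
    key = (hashlib.sha256(token.encode("utf-8")).hexdigest(), repo_id)
    expires = _ACCESS_CACHE.get(key)
    if expires is not None and now < expires:
        return True
    allowed = check(token, repo_id)
    if allowed:
        # Only successes are cached; failures are rechecked every time.
        _ACCESS_CACHE[key] = now + ACCESS_TTL_S
    return allowed
```

Caching only successful checks means a user who has just accepted a model's terms is not locked out by a stale negative result.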