---
title: Stable Audio 3 Lab
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 6.3.0
app_file: app.py
python_version: "3.10"
suggested_hardware: zero-a10g
pinned: false
license: mit
hf_oauth: true
hf_oauth_scopes:
  - gated-repos
---

# Stable Audio 3 Lab

Gradio Space for testing Stability AI's Stable Audio 3 collections:

- Standard collection: `stabilityai/stable-audio-3-small-music`, `stabilityai/stable-audio-3-small-sfx`, `stabilityai/stable-audio-3-medium`
- Extra collection generation checkpoints: `small-music-base`, `small-sfx-base`, `medium-base`
- Extra collection autoencoders: `SAME-S`, `SAME-L`

The optimized repo (`stabilityai/stable-audio-3-optimized`) currently ships MLX and TensorRT assets rather than a generic `model_config.json` + `model.safetensors` checkpoint. This Space lists it in Coverage but does not run it through the PyTorch `stable_audio_3` path.

## Access

This Space requires Hugging Face authentication. Users can either sign in with
Hugging Face OAuth or paste a Hugging Face access token into the password field.
A pasted token is used only for that request and is not returned in run
metadata.

The post-trained Stable Audio 3 checkpoints are gated on Hugging Face, so each
user must:

1. Sign in with Hugging Face, or supply a read token from their own Hugging Face account.
2. Accept the terms on each gated model page from that same account.

Base checkpoints are not gated, but they are intended mainly for fine-tuning and may not sound as polished as the post-trained checkpoints.

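The access flow described above can be sketched roughly as follows. This is a minimal illustration, not the Space's actual code: the function names, the preference for the OAuth token over a pasted one, and the injected `probe` callable are all assumptions. In a real app the probe might wrap a gated-access check such as `HfApi.auth_check` from recent `huggingface_hub` releases; it is injected here so the sketch runs offline.

```python
from typing import Callable, Optional


def resolve_token(oauth_token: Optional[str], pasted_token: Optional[str]) -> str:
    """Pick the credential for this request.

    Assumption: an OAuth session token wins over a pasted access token.
    """
    for candidate in (oauth_token, pasted_token):
        if candidate and candidate.strip():
            return candidate.strip()
    raise PermissionError("Sign in with Hugging Face or paste an access token.")


def check_gated_access(
    repo_id: str,
    token: str,
    probe: Callable[[str, str], None],
) -> bool:
    """Return True when the (hypothetical) probe accepts this repo and token.

    The probe is expected to raise when the account has not accepted the
    model terms or the token is invalid; any exception counts as "no access".
    """
    try:
        probe(repo_id, token)
    except Exception:
        return False
    return True
```

Keeping the token as a plain argument, rather than storing it on a shared object, matches the stated goal that a pasted token is scoped to a single request.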
## Hardware

- ZeroGPU is enabled through the `spaces.GPU` decorator on generation and autoencoder actions.
- Small models can run on CPU, but GPU is still preferred.
- Medium and Medium Base are GPU-first.
- `SAME-L` is GPU-first; `SAME-S` can be used for CPU autoencoder round trips.

The Space is configured with `suggested_hardware: zero-a10g`.

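A hedged sketch of how the hardware rules above might look in code. The `spaces.GPU` decorator is the real ZeroGPU entry point; everything else (the `gpu` fallback, `GPU_FIRST`, `pick_device`, `generate`, and the choice to raise instead of running a GPU-first model on CPU) is hypothetical and stricter than what this README states.

```python
try:
    import spaces  # provided on Hugging Face Spaces with ZeroGPU

    gpu = spaces.GPU
except ImportError:
    def gpu(func=None, **_kwargs):
        """No-op stand-in so the same code runs outside ZeroGPU."""
        if func is None:
            return lambda f: f
        return func

# Assumed keys for the models this README calls GPU-first.
GPU_FIRST = {"medium", "medium-base", "SAME-L"}


def pick_device(model_key: str, cuda_available: bool) -> str:
    """Prefer CUDA; allow CPU only for models not marked GPU-first."""
    if cuda_available:
        return "cuda"
    if model_key in GPU_FIRST:
        # Assumption: refuse rather than run very slowly on CPU.
        raise RuntimeError(f"{model_key} needs a GPU in this sketch")
    return "cpu"


@gpu
def generate(prompt: str, model_key: str = "small-music",
             cuda_available: bool = False) -> dict:
    """Placeholder generation action; returns only the routing decision."""
    device = pick_device(model_key, cuda_available)
    return {"prompt": prompt, "model": model_key, "device": device}
```

The import fallback lets the same module be exercised locally, where the `spaces` package may not be installed.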
## Runtime note

The upstream `stable-audio-3` Python package is vendored in this Space from
Stability AI's public MIT-licensed repository because its package metadata pins
Torch 2.7.1. ZeroGPU currently provides Torch 2.8.0, so installing the upstream
package through normal dependency resolution would downgrade Torch and break the
ZeroGPU runtime.

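The conflict described above can be illustrated with a small check. This is a sketch under assumptions: it treats the upstream pin as an exact `torch==2.7.1` requirement (the README only says the metadata "pins" Torch), and the helper names are made up for illustration. It handles plain numeric versions and local build tags only.

```python
def parse_version(text: str) -> tuple:
    """Parse '2.8.0' or '2.8.0+cu128' into (2, 8, 0), dropping the local tag."""
    public = text.split("+", 1)[0]
    return tuple(int(part) for part in public.split("."))


def exact_pin_conflicts(requirement: str, provided: str) -> bool:
    """Return True when an exact `name==X.Y.Z` pin differs from `provided`."""
    _name, _sep, pinned = requirement.partition("==")
    if not pinned:
        return False  # not an exact pin; ranges are out of scope here
    return parse_version(pinned.strip()) != parse_version(provided)


# The situation this README describes: upstream pin vs. ZeroGPU runtime.
print(exact_pin_conflicts("torch==2.7.1", "2.8.0"))
```

When the check reports a conflict, vendoring the source tree sidesteps the resolver entirely, which is the approach this Space takes.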
## Optimization notes

- Repeated runs with the same selected model reuse the loaded model inside the
  ZeroGPU worker when the worker stays warm. Run metadata includes `cache_hit`
  and `load_elapsed_s` so this is visible.
- Successful gated-repo access checks are cached briefly inside the worker per
  token digest and repo ID to avoid a Hugging Face `HEAD` request on every
  generation.
- The `stable-audio-3-optimized` repo currently provides MLX, ONNX, and
  TensorRT assets. This Space keeps the portable PyTorch path because the
  TensorRT engines are prebuilt for `sm_90`, while the current ZeroGPU host is
  a Blackwell GPU, and MLX is Apple-only.