metadata
title: Stable Audio 3 Lab
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 6.3.0
app_file: app.py
python_version: '3.10'
suggested_hardware: zero-a10g
pinned: false
license: mit
hf_oauth: true
hf_oauth_scopes:
  - gated-repos

Stable Audio 3 Lab

Gradio Space for testing Stability AI's Stable Audio 3 collections:

  • Standard collection: stabilityai/stable-audio-3-small-music, stabilityai/stable-audio-3-small-sfx, stabilityai/stable-audio-3-medium
  • Extra collection generation checkpoints: small-music-base, small-sfx-base, medium-base
  • Extra collection autoencoders: SAME-S, SAME-L

The optimized repo (stabilityai/stable-audio-3-optimized) currently ships MLX and TensorRT assets rather than a generic model_config.json + model.safetensors checkpoint. This Space lists it in Coverage, but does not run it through the PyTorch stable_audio_3 path.
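As a rough illustration of the distinction above, the check below treats a repo as loadable through the PyTorch path only when its file listing contains both model_config.json and model.safetensors. The helper name and both sample listings are hypothetical. In practice the listing could come from a call such as huggingface_hub's list_repo_files. This is a sketch, not the Space's actual logic.

```python
REQUIRED_FILES = {"model_config.json", "model.safetensors"}


def has_pytorch_checkpoint(repo_files):
    """Return True if both generic checkpoint files appear in the listing."""
    return REQUIRED_FILES.issubset(set(repo_files))


# Made-up listings for illustration only; the real repos may be laid out differently.
standard_like = ["README.md", "model_config.json", "model.safetensors"]
optimized_like = ["README.md", "mlx/weights.npz", "tensorrt/engine.plan"]

print(has_pytorch_checkpoint(standard_like))   # True
print(has_pytorch_checkpoint(optimized_like))  # False
```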

Access

This Space requires Hugging Face authentication. Users can either sign in with Hugging Face OAuth or paste a Hugging Face access token into the password field. The pasted token is used only for that request path and is not returned in run metadata.

The post-trained Stable Audio 3 checkpoints are gated on Hugging Face, so each user must:

  1. Sign in with Hugging Face, or supply a read token from their own Hugging Face account.
  2. Accept the terms on each gated model page from that same account.

Base checkpoints are not gated, but they are intended mainly for fine-tuning and may not sound as polished.
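The token handling described in this section can be sketched as follows. The function names, the preference for the OAuth token over a pasted one, and the auth_source field are assumptions made for illustration; the only behavior taken from this README is that a token is needed and that it never appears in run metadata.

```python
from typing import Optional


def resolve_token(oauth_token: Optional[str], pasted_token: Optional[str]) -> str:
    """Pick the token for one request: OAuth session first, pasted token second."""
    for candidate in (oauth_token, pasted_token):
        if candidate and candidate.strip():
            return candidate.strip()
    raise PermissionError("Sign in with Hugging Face or paste an access token.")


def build_run_metadata(repo_id: str, auth_source: str) -> dict:
    """Metadata returned to the user; records how auth happened, never the token."""
    return {"repo_id": repo_id, "auth_source": auth_source}


# Placeholder token value, not a real credential.
token = resolve_token(None, "  hf_example_token  ")
meta = build_run_metadata("stabilityai/stable-audio-3-medium", "pasted_token")
assert token not in str(meta)
```

Keeping the token out of the metadata dictionary by construction means no later code path can echo it back by accident.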

Hardware

  • ZeroGPU is enabled through the spaces.GPU decorator on generation and autoencoder actions.
  • Small models can run on CPU, but GPU is still preferred.
  • Medium and Medium Base are GPU-first.
  • SAME-L is GPU-first; SAME-S can be used for CPU autoencoder round trips.

The Space is configured with suggested_hardware: zero-a10g.
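A minimal sketch of how the hardware rules above might look in code. The spaces.GPU decorator is the real ZeroGPU entry point, but the short model keys, the pick_device helper, and the choice to raise an error for GPU-first models when no GPU is present are assumptions for this example. The fallback decorator only exists so the snippet also runs outside a Space.

```python
try:
    import spaces  # available inside Hugging Face Spaces
    gpu = spaces.GPU
except ImportError:
    def gpu(fn):
        """No-op stand-in so the sketch also runs outside a Space."""
        return fn

# Hypothetical short keys; anything not listed here is treated as GPU-first.
CPU_CAPABLE = {"small-music", "small-sfx", "SAME-S"}


def pick_device(model_key: str, cuda_available: bool) -> str:
    """Prefer the GPU; allow CPU only for models listed as CPU-capable."""
    if cuda_available:
        return "cuda"
    if model_key in CPU_CAPABLE:
        return "cpu"
    raise RuntimeError(f"{model_key} is GPU-first and no GPU is available")


@gpu
def generate(model_key: str, prompt: str, cuda_available: bool = False) -> dict:
    """Stand-in for a generation action; returns where it would have run."""
    device = pick_device(model_key, cuda_available)
    return {"model": model_key, "device": device, "prompt": prompt}
```

Refusing to run a GPU-first model on CPU is a design choice of this sketch; a real app could instead fall back with a warning.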

Runtime note

The upstream stable-audio-3 Python package is vendored in this Space from Stability AI's public MIT-licensed repository because its package metadata pins Torch 2.7.1. ZeroGPU currently provides Torch 2.8.0, so installing the upstream package through normal dependency resolution would downgrade Torch and break the ZeroGPU runtime.
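The conflict can be illustrated with a small check. This is a simplified sketch: the helper names are hypothetical, only exact "==" pins are handled, and the "+cu128" local suffix is just an example of a build tag that should be ignored when comparing.

```python
def strip_local(version: str) -> str:
    """Drop a local build suffix such as '+cu128' before comparing."""
    return version.split("+", 1)[0].strip()


def pin_conflicts(requirement: str, provided: str) -> bool:
    """True if an exact '==' pin names a different version than the runtime provides."""
    _name, sep, pinned = requirement.partition("==")
    return bool(sep) and strip_local(pinned) != strip_local(provided)


print(pin_conflicts("torch==2.7.1", "2.8.0"))        # True
print(pin_conflicts("torch==2.8.0", "2.8.0+cu128"))  # False
```

When the check returns True, a normal install would replace the runtime's Torch with the pinned one, which is the situation vendoring avoids.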

Optimization notes

  • Repeated runs with the same selected model reuse the loaded model inside the ZeroGPU worker when the worker stays warm. Run metadata includes cache_hit and load_elapsed_s so this is visible.
  • Successful gated-repo access checks are cached briefly inside the worker per token digest and repo ID to avoid a Hugging Face HEAD request on every generation.
  • The stable-audio-3-optimized repo currently provides MLX, ONNX, and TensorRT assets. This Space keeps the portable PyTorch path because the TensorRT engines are prebuilt for sm_90, while the current ZeroGPU host uses a Blackwell GPU, and MLX is Apple-only.
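The two caches described in the first two notes could look roughly like this. Everything here is a sketch under assumptions: the function names, the injected loader and remote_check callables, the dictionary-based storage, and the 300-second lifetime are all hypothetical (the notes only say access results are cached "briefly"). The cache_hit and load_elapsed_s keys match the run metadata fields named above.

```python
import hashlib
import time

_MODEL_CACHE = {}    # model_id -> loaded model, lives as long as the worker
_ACCESS_CACHE = {}   # (token digest, repo_id) -> expiry timestamp
ACCESS_TTL_S = 300   # assumed value, not taken from the Space


def get_model(model_id, loader):
    """Return (model, run metadata) and report whether the cache was hit."""
    start = time.perf_counter()
    cache_hit = model_id in _MODEL_CACHE
    if not cache_hit:
        _MODEL_CACHE[model_id] = loader(model_id)
    meta = {"cache_hit": cache_hit,
            "load_elapsed_s": round(time.perf_counter() - start, 3)}
    return _MODEL_CACHE[model_id], meta


def check_access(token, repo_id, remote_check, now=time.monotonic):
    """Call remote_check only when no fresh successful result is cached."""
    # Key on a digest so the raw token is never stored in the cache.
    key = (hashlib.sha256(token.encode("utf-8")).hexdigest(), repo_id)
    if key in _ACCESS_CACHE and _ACCESS_CACHE[key] > now():
        return True
    allowed = bool(remote_check(token, repo_id))
    if allowed:  # only successes are cached, so failures are re-checked
        _ACCESS_CACHE[key] = now() + ACCESS_TTL_S
    return allowed


calls = []

def fake_check(token, repo_id):
    """Stand-in for the remote access check; counts how often it is called."""
    calls.append(repo_id)
    return True

check_access("hf_example", "org/model", fake_check)
check_access("hf_example", "org/model", fake_check)
print(len(calls))  # 1

_, first = get_model("demo", lambda model_id: object())
_, second = get_model("demo", lambda model_id: object())
print(first["cache_hit"], second["cache_hit"])  # False True
```

Both caches live in module-level state, so they only help while the worker process stays warm, which matches the caveat in the first note.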