# ACE-Step 1.5 MLX (4-bit Quantized)

4-bit quantized MLX weights for ACE-Step/ACE-Step1.5.

  • Decoder and encoder quantized to 4-bit (group_size=64)
  • VAE, tokenizer, and detokenizer kept in full precision
  • 2.2 GB main model + 0.7 GB VAE + 2.4 GB text encoder (≈5.3 GB total)

## Usage

```python
from mlx_audio.tts import load

model = load("mlx-community/ACE-Step1.5-MLX-4bit")

for result in model.generate(
    text="upbeat electronic dance music with energetic synthesizers",
    duration=30.0,
):
    audio = result.audio  # [samples, 2] stereo @ 48 kHz
    sample_rate = result.sample_rate
```
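To keep the generated audio, the stereo float array can be written to disk. A minimal sketch using the standard-library `wave` module, assuming `result.audio` is a float array with samples in [-1, 1] (the `save_stereo_wav` helper is illustrative, not part of `mlx_audio`):

```python
import wave

import numpy as np


def save_stereo_wav(audio, sample_rate, path):
    """Write a float [samples, 2] array (values in -1..1) as 16-bit PCM WAV."""
    pcm = np.clip(audio, -1.0, 1.0)
    pcm = (pcm * 32767).astype(np.int16)
    with wave.open(path, "wb") as f:
        f.setnchannels(2)       # stereo
        f.setsampwidth(2)       # 16-bit samples
        f.setframerate(sample_rate)
        f.writeframes(pcm.tobytes())


# Example: one second of silence at 48 kHz
save_stereo_wav(np.zeros((48000, 2), dtype=np.float32), 48000, "out.wav")
```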

## With Vocals

```python
for result in model.generate(
    text="English pop song with clear female vocals, catchy melody",
    lyrics="""[verse]
Dance with me tonight
Under the neon lights

[chorus]
We're alive, we're on fire
Dancing higher and higher
""",
    duration=60.0,
    vocal_language="en",
):
    ...
```
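The lyrics use bracketed section tags such as `[verse]` and `[chorus]`, as in the example above. A small sketch of that convention, splitting tagged lyrics into sections (this parser is illustrative only, not an `mlx_audio` API):

```python
import re


def split_sections(lyrics):
    """Split tag-annotated lyrics into (tag, text) pairs."""
    parts = re.split(r"\[(\w+)\]", lyrics)
    # re.split keeps the captured tag names: ["", tag1, body1, tag2, body2, ...]
    return [(parts[i], parts[i + 1].strip()) for i in range(1, len(parts) - 1, 2)]


song = """[verse]
Dance with me tonight
Under the neon lights

[chorus]
We're alive, we're on fire
Dancing higher and higher
"""
sections = split_sections(song)
# [('verse', 'Dance with me tonight\nUnder the neon lights'),
#  ('chorus', "We're alive, we're on fire\nDancing higher and higher")]
```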

The model uses a 5 Hz language-model planner by default (`use_lm=True`), which generates a song blueprint before the diffusion transformer runs.
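As a rough illustration of what a 5 Hz planner implies, assuming one blueprint frame per planner step (the exact token layout is not specified here):

```python
PLANNER_RATE_HZ = 5  # blueprint frames per second, from the model description


def planner_frames(duration_s):
    """Number of 5 Hz blueprint frames the LM planner emits for a clip."""
    return int(duration_s * PLANNER_RATE_HZ)


planner_frames(30.0)  # 150 frames for a 30-second generation
```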
