ACE Step1.5XL_TurboSFT_Merge_50-50: Multi-Format (FP16 / FP8 / NVFP4)
Welcome to this repository! Here you will find a 50/50 merge of the ACE-Step 1.5 XL Turbo and ACE-Step 1.5 XL SFT audio models in several quantization formats, each suited to a different VRAM budget and intended especially for use in ComfyUI.
Original Model & Credits
This repository is a repack/merge based on the fantastic work of the ACE-Step team. Please visit and support the original creators here: Original ACE-Step 1.5 XL Collection
Example settings: [image]
Available Formats
Choose the format that best fits your hardware:
1. FP16 (16-Bit Half Precision)
- File Extension: .safetensors (or marked accordingly)
- Description: The unquantized, full-size version. Offers the highest audio quality of the three formats but also requires the most VRAM. Best suited to high-end GPUs.
2. FP8 (8-Bit Quantization)
- File Extension: *fp8.safetensors
- Description: A balanced middle ground. Roughly halves the VRAM needed for the weights compared to FP16 while keeping the audio quality nearly identical to the original. Highly recommended for most users.
3. NVFP4 (4-Bit Quantization)
- File Extension: *nvfp4.safetensors
- Description: An extremely compressed version for minimal VRAM usage.
- Important Technical Note: Converting DiT audio models to 4-bit is highly experimental. To preserve audio quality and prevent the "Input tensor must be contiguous" crash in ComfyUI, sensitive layers (such as bias, norm, embed_tokens, timbre_encoder, project_in, and quantizer) were not quantized and were intentionally left in bfloat16. Only the heavy Transformer blocks run in 4-bit. This keeps the model stable and ready to use.
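The selection rule described in the technical note can be pictured as a simple name filter. The following is a minimal sketch, not the script used to build these files: the substring-matching rule, the function names keep_in_bf16 and split_tensors, and the example tensor names are all assumptions made for illustration. Only the list of protected keywords comes from the note itself.

```python
# Keywords taken from the technical note; the matching rule itself is an assumption.
PROTECTED_KEYWORDS = (
    "bias",
    "norm",
    "embed_tokens",
    "timbre_encoder",
    "project_in",
    "quantizer",
)


def keep_in_bf16(tensor_name: str) -> bool:
    """Return True if a tensor should stay in bfloat16 rather than be quantized to 4-bit."""
    return any(keyword in tensor_name for keyword in PROTECTED_KEYWORDS)


def split_tensors(tensor_names):
    """Partition tensor names into (kept in bfloat16, quantized to NVFP4)."""
    kept = [name for name in tensor_names if keep_in_bf16(name)]
    quantized = [name for name in tensor_names if not keep_in_bf16(name)]
    return kept, quantized


# Hypothetical tensor names, for illustration only.
example_names = [
    "layers.0.attn.q_proj.weight",
    "layers.0.attn.q_proj.bias",
    "layers.0.norm1.weight",
    "timbre_encoder.conv.weight",
    "layers.0.mlp.fc1.weight",
]
kept, quantized = split_tensors(example_names)
print("bfloat16:", kept)
print("nvfp4:   ", quantized)
```

In this sketch only the two large projection weights end up in the 4-bit group, which mirrors the idea that the heavy Transformer blocks are quantized while small or sensitive tensors are left alone.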
Usage in ComfyUI
- Download your desired format (FP16, FP8, or NVFP4).
- Place the file in your ComfyUI directory under models/diffusion_models (or the specific folder required by your audio node).
- Load the model using your standard Model Loader Node.
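The download-and-place steps can be sketched as a small helper that guesses the format from the file name and copies the file into the ComfyUI model folder. This is an illustrative sketch only: the function names detect_format and install_model are hypothetical, and treating any .safetensors file without an fp8 or nvfp4 suffix as the FP16 variant is an assumption. You supply the ComfyUI root path yourself, and you should adjust the target folder if your audio node expects a different one.

```python
import shutil
from pathlib import Path


def detect_format(filename: str) -> str:
    """Guess the quantization format from the file name suffixes listed in this README."""
    name = filename.lower()
    if not name.endswith(".safetensors"):
        raise ValueError(f"not a .safetensors file: {filename}")
    if name.endswith("nvfp4.safetensors"):
        return "NVFP4"
    if name.endswith("fp8.safetensors"):
        return "FP8"
    return "FP16"  # assumption: anything else is the FP16 file


def install_model(model_file: Path, comfyui_root: Path) -> Path:
    """Copy a downloaded model into <comfyui_root>/models/diffusion_models."""
    target_dir = comfyui_root / "models" / "diffusion_models"
    target_dir.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    target = target_dir / model_file.name
    shutil.copy2(model_file, target)  # copy, leaving the download in place
    return target
```

For example, calling install_model(Path("my_download_fp8.safetensors"), Path("ComfyUI")) with a hypothetical file name would copy the file to ComfyUI/models/diffusion_models, after which it can be selected in the model loader node.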
Hosted by Starnodes
