Qwen3-TTS-12Hz-1.7B-Base - 4-bit Quantized (bitsandbytes)

This repository packages Qwen/Qwen3-TTS-12Hz-1.7B-Base with a bitsandbytes quantization_config so that the weights are quantized to 4-bit automatically at load time.

How It Works

This model contains the original weights plus a quantization_config in config.json. When you load the model, it will automatically be quantized to 4-bit using bitsandbytes NF4 quantization.

Memory savings: roughly 75% less weight memory than 16-bit (bf16) precision.
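The ~75% figure is simple arithmetic: 1.7B weights at 2 bytes each in bf16 versus 4 bits (0.5 bytes) each in NF4, ignoring the small overhead of quantization constants (which double quantization further shrinks). A quick sanity check:

```python
# Back-of-the-envelope weight-memory estimate for a 1.7B-parameter model.
params = 1.7e9
bf16_bytes = params * 2    # bf16: 2 bytes per weight
nf4_bytes = params * 0.5   # NF4: 4 bits = 0.5 bytes per weight
reduction = 1 - nf4_bytes / bf16_bytes
print(f"bf16: {bf16_bytes / 1e9:.1f} GB, NF4: {nf4_bytes / 1e9:.2f} GB, "
      f"saving {reduction:.0%}")
# → bf16: 3.4 GB, NF4: 0.85 GB, saving 75%
```

Actual usage will be somewhat higher due to activations, the KV cache, and the quantization constants themselves.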

Requirements

pip install qwen-tts "bitsandbytes>=0.42.0" accelerate

Usage

from qwen_tts import Qwen3TTSModel
import soundfile as sf

# The model is quantized to 4-bit automatically at load time
model = Qwen3TTSModel.from_pretrained(
    "YOUR_USERNAME/Qwen3-TTS-12Hz-1.7B-Base-BNB-4bit",
    device_map="auto",
)

# Voice cloning example

wavs, sr = model.generate_voice_clone(
    text="Hello, this is a test of the quantized model.",
    language="English",
    ref_audio="path/to/reference.wav",
    ref_text="Transcript of your reference audio.",
)
sf.write("output.wav", wavs[0], sr)

Quantization Details

| Setting       | Value                    |
|---------------|--------------------------|
| Method        | bitsandbytes             |
| Bits          | 4                        |
| Quant Type    | NF4 (Normalized Float 4) |
| Compute Dtype | bfloat16                 |
| Double Quant  | Yes                      |
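These settings correspond to a quantization_config block in config.json. The sketch below uses the standard transformers/bitsandbytes key names; assuming this repo follows that convention (the actual config.json in the repository is authoritative):

```python
import json

# Hypothetical quantization_config entry, using the key names that the
# transformers library writes for bitsandbytes 4-bit quantization.
quantization_config = {
    "quant_method": "bitsandbytes",        # Method
    "load_in_4bit": True,                  # Bits: 4
    "bnb_4bit_quant_type": "nf4",          # Quant Type: NF4
    "bnb_4bit_compute_dtype": "bfloat16",  # Compute Dtype
    "bnb_4bit_use_double_quant": True,     # Double Quant: Yes
}
print(json.dumps(quantization_config, indent=2))
```

Because this block lives in config.json, no extra arguments are needed at load time; the loader picks it up automatically.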

Original Model

Based on Qwen/Qwen3-TTS-12Hz-1.7B-Base. Please refer to the original model card for full documentation.

License

Apache 2.0 (same as the original model)
