Christina TTS (NF4 Quantized)

This is an NF4 quantized version of Christina TTS.

The source model is Loke-60000/christina-TTS, which was fine-tuned from Qwen/Qwen3-TTS-12Hz-0.6B-Base.

About the Voice

Important disclaimer: The voice actress in this model is not Asami Imai or any other official voice actor. This is a fan recreation built from synthetic and real data, which accounts for any differences from the original character's voice.

The voice actress who lent her voice wishes to remain private. Thank you for your understanding.

The training data comprises synthetic material and real recordings, including samples from Loke-60000/Christina-TTS-I.

Quantization

  • Format: bitsandbytes 4-bit NF4
  • Source model: Loke-60000/christina-TTS
  • Intended use: inference
  • Compatibility: drop-in replacement for the original inference code, with the model path changed to this repo
  • Runtime requirement: install bitsandbytes
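For reference, NF4 quantization of this kind corresponds to a bitsandbytes 4-bit configuration like the one below. The exact settings used to produce this checkpoint are an assumption on my part, and because the quantization is already baked into the published weights, you normally do not need to pass any of this yourself:

```python
import torch
from transformers import BitsAndBytesConfig

# Illustrative only: a typical NF4 setup, assumed to match this repo's weights.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # store weights in 4-bit
    bnb_4bit_quant_type="nf4",               # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,   # compute in bf16 during inference
)
```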

Speakers

This model includes two voice variants, each optimized for different use cases:

Speaker       ID    Best For
christina     3000  English speech
christina-jp  3001  Japanese speech

Why Two Variants?

  • christina: Trained primarily on English data. Produces natural English speech with proper intonation and rhythm. When speaking Japanese, it may lose some native intonation patterns.

  • christina-jp: Trained primarily on Japanese data. Produces natural Japanese speech. When speaking English, it retains a Japanese accent, which may be desirable for certain character portrayals.

Choose the variant that matches your primary output language for the most natural results.
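If you generate in both languages, the choice above can be wrapped in a small helper. This function is purely illustrative and is not part of the qwen-tts API; it simply encodes the guidance in this section:

```python
def pick_speaker(language: str) -> str:
    """Map an output language to the better-suited voice variant.

    Falls back to the English-focused voice for any other language,
    since this fine-tune targets English and Japanese.
    """
    return "christina-jp" if language.lower() in ("japanese", "ja") else "christina"
```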

Language Support

The base model nominally supports languages beyond English and Japanese. However, no quality guarantees are made for other languages, since this fine-tune was trained specifically for English and Japanese.

If another language seems important for your use case, feel free to contact me and we can discuss it for a future update.

Quickstart

Installation

pip install -U qwen-tts bitsandbytes
# Optional: for optimized performance
pip install -U flash-attn --no-build-isolation

Quick Usage

import torch
import soundfile as sf
from qwen_tts import Qwen3TTSModel

model = Qwen3TTSModel.from_pretrained(
    "Loke-60000/christina-TTS-nf4",
    device_map="cuda:0",
    dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
)

# English-focused voice
wavs_en, sr = model.generate_custom_voice(
    text="I finally managed to finish the experiment.",
    speaker="christina",
    language="English",
)
sf.write("christina_en.wav", wavs_en[0], sr)

# Japanese-focused voice
wavs_ja, sr = model.generate_custom_voice(
    text="やっと実験が終わったわ。",
    speaker="christina-jp",
    language="Japanese",
)
sf.write("christina_jp.wav", wavs_ja[0], sr)

You can also inspect the available voices with model.get_supported_speakers().
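If you select speakers dynamically, a small guard can keep an unknown name from reaching the model. This sketch assumes get_supported_speakers() returns an iterable of speaker name strings (an assumption about its return type):

```python
def resolve_speaker(requested: str, supported, fallback: str = "christina") -> str:
    """Return `requested` if the model supports it, otherwise `fallback`."""
    return requested if requested in set(supported) else fallback
```

You would call it as, e.g., resolve_speaker("christina-jp", model.get_supported_speakers()).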

License & Usage

This model is free to download and use. However, no training data will be shared.

If you use this model in any project, please provide credit.

Model Details

  • Format: Safetensors
  • Model size: 0.9B params
  • Tensor types: F32, BF16, U8