Christina TTS (NF4 Quantized)

This is an NF4 quantized version of Christina TTS.

The source model is Loke-60000/christina-TTS, which was fine-tuned from Qwen/Qwen3-TTS-12Hz-0.6B-Base.

About the Voice

Important disclaimer: The voice actress in this model is not Asami Imai or any other official voice actor. This is a fan recreation built from synthetic and real data, which accounts for any differences from the original character's voice.

The voice actress who lent her voice wishes to remain private. Thank you for your understanding.

The training data comprises synthetic material and real recordings, including samples from Loke-60000/Christina-TTS-I.

Quantization

  • Format: bitsandbytes 4-bit NF4
  • Source model: Loke-60000/christina-TTS
  • Intended use: inference
  • Compatibility: drop-in replacement for the original inference code, with the model path changed to this repo
  • Runtime requirement: install bitsandbytes
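For reference, NF4 quantization of this kind corresponds to a bitsandbytes 4-bit configuration like the one below. The exact settings used to produce this checkpoint are an assumption on my part, and because the quantization is already baked into the published weights, you normally do not need to pass any of this yourself:

```python
import torch
from transformers import BitsAndBytesConfig

# Illustrative only: a typical NF4 setup, assumed to match this repo's weights.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # store weights in 4-bit
    bnb_4bit_quant_type="nf4",               # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,   # compute in bf16 during inference
)
```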

Speakers

This model includes two voice variants, each optimized for different use cases:

Speaker       ID    Best For
christina     3000  English speech
christina-jp  3001  Japanese speech

Why Two Variants?

  • christina: Trained primarily on English data. Produces natural English speech with proper intonation and rhythm. When speaking Japanese, it may lose some native intonation patterns.

  • christina-jp: Trained primarily on Japanese data. Produces natural Japanese speech. When speaking English, it retains a Japanese accent, which may be desirable for certain character portrayals.

Choose the variant that matches your primary output language for the most natural results.
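If you generate in both languages, the choice above can be wrapped in a small helper. This function is purely illustrative and is not part of the qwen-tts API; it simply encodes the guidance in this section:

```python
def pick_speaker(language: str) -> str:
    """Map an output language to the better-suited voice variant.

    Falls back to the English-focused voice for any other language,
    since this fine-tune targets English and Japanese.
    """
    return "christina-jp" if language.lower() in ("japanese", "ja") else "christina"
```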

Language Support

The base model nominally supports languages beyond English and Japanese. However, no quality guarantees are made for other languages, since this fine-tune was trained specifically for English and Japanese.

If another language seems important for your use case, feel free to contact me and we can discuss it for a future update.

Quickstart

Installation

pip install -U qwen-tts bitsandbytes
# Optional: for optimized performance
pip install -U flash-attn --no-build-isolation

Quick Usage

import torch
import soundfile as sf
from qwen_tts import Qwen3TTSModel

model = Qwen3TTSModel.from_pretrained(
    "Loke-60000/christina-TTS-nf4",
    device_map="cuda:0",
    dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
)

# English-focused voice
wavs_en, sr = model.generate_custom_voice(
    text="I finally managed to finish the experiment.",
    speaker="christina",
    language="English",
)
sf.write("christina_en.wav", wavs_en[0], sr)

# Japanese-focused voice
wavs_ja, sr = model.generate_custom_voice(
    text="やっと実験が終わったわ。",
    speaker="christina-jp",
    language="Japanese",
)
sf.write("christina_jp.wav", wavs_ja[0], sr)

You can also inspect the available voices with model.get_supported_speakers().
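If you select speakers dynamically, a small guard can keep an unknown name from reaching the model. This sketch assumes get_supported_speakers() returns an iterable of speaker name strings (an assumption about its return type):

```python
def resolve_speaker(requested: str, supported, fallback: str = "christina") -> str:
    """Return `requested` if the model supports it, otherwise `fallback`."""
    return requested if requested in set(supported) else fallback
```

You would call it as, e.g., resolve_speaker("christina-jp", model.get_supported_speakers()).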

License & Usage

This model is free to download and use. However, no training data will be shared.

If you use this model in any project, please provide credit.

Model Details

  • Format: Safetensors
  • Model size: 0.9B params
  • Tensor types: F32, BF16, U8