Orpheus 3B β€” GRPO LoRA for Conversational TTS

LoRA adapter trained with Group Relative Policy Optimization (GRPO) on Orpheus 3B for conversational speech synthesis.

What is Orpheus?

Orpheus is a 3B parameter LLM-based TTS model from Canopy Labs. It generates SNAC audio tokens autoregressively, producing natural speech with emotion and prosody inferred from text context.

Training

  • Base model: canopylabs/orpheus-3b-0.1-ft (3B params, Llama 3 architecture)
  • Method: GRPO (Group Relative Policy Optimization) β€” reinforcement learning with UTMOS as reward signal
  • Adapter: LoRA (r=16, alpha=32, all linear layers)
  • Reward: UTMOS naturalness score (target: maximize perceived quality)
  • Dataset: Expresso conversational speech corpus
  • Hardware: NVIDIA A10G 24GB
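The adapter settings above can be sketched with peft's `LoraConfig`. This is a minimal illustration, not the exact training config: the explicit `target_modules` names are an assumption matching "all linear layers" of a Llama 3 block, and `lora_dropout` is not stated on this card.

```python
from peft import LoraConfig

# LoRA hyperparameters from the training summary: r=16, alpha=32,
# applied to all linear layers of the Llama 3 backbone.
# The module names below are an assumption; recent peft versions
# also accept the "all-linear" shorthand for target_modules.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,  # assumed value; not stated on the card
    task_type="CAUSAL_LM",
)
```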

Why GRPO?

Standard supervised fine-tuning (SFT) of TTS models risks overfitting to surface patterns in the training data. GRPO instead optimizes directly for perceived audio quality using a learned reward model (UTMOS, a neural MOS predictor trained on human naturalness ratings), which better aligns the policy with human preferences for natural-sounding speech.
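The core of GRPO can be sketched in a few lines: for each prompt, sample a group of candidate generations, score each with the reward model (UTMOS here), and normalize rewards within the group to obtain advantages, so no separate value network is needed. This is a simplified illustration with made-up reward values, not the training code:

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantage: normalize each reward against its own
    sampling group: a_i = (r_i - mean(group)) / (std(group) + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical UTMOS scores for 4 speech samples from one prompt
utmos_scores = [3.2, 4.1, 2.8, 3.9]
advantages = group_relative_advantages(utmos_scores)
# Samples scoring above the group mean get positive advantages,
# pushing the policy toward generations rated as more natural.
```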

Usage

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load base Orpheus
base_model = AutoModelForCausalLM.from_pretrained("canopylabs/orpheus-3b-0.1-ft")
tokenizer = AutoTokenizer.from_pretrained("canopylabs/orpheus-3b-0.1-ft")

# Apply GRPO LoRA
model = PeftModel.from_pretrained(base_model, "Tachyeon/orpheus-3b-conversational-grpo")
model = model.merge_and_unload()  # Optional: merge for faster inference

For production inference via llama.cpp GGUF, see Project Maya.

Part of Project Maya

This adapter was trained as part of Project Maya β€” a real-time conversational voice AI system achieving <2s end-to-end latency with:

  • Orpheus 3B TTS via llama.cpp (129 tok/s, RTF 0.64)
  • Llama 3.2 3B LLM (155 tok/s)
  • faster-whisper STT with hallucination mitigation
  • Multi-GPU streaming pipeline (4x A10G)
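As a sanity check on the figures above: real-time factor (RTF) is generation time divided by audio duration, so RTF 0.64 means each second of audio takes 0.64 s to synthesize, and at 129 tok/s the audio stream corresponds to roughly 83 SNAC tokens per second. A back-of-envelope calculation using the listed numbers:

```python
tok_per_s = 129  # Orpheus token generation rate via llama.cpp
rtf = 0.64       # real-time factor: generation time / audio duration

# Implied SNAC token rate of the audio stream:
tokens_per_audio_s = tok_per_s * rtf  # ~82.6 tokens per audio second

# Time to synthesize a 10-second utterance:
t_10s = 10 * rtf  # ~6.4 s, i.e. faster than real time
```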


Research Context

This GRPO approach was informed by GLM-4-Voice, which demonstrated that RL-based optimization (DPO/GRPO) can improve TTS quality metrics beyond what supervised fine-tuning alone achieves.

Citation

@misc{orpheus2025canopy,
  title={Orpheus TTS},
  author={Canopy Labs},
  year={2025},
  url={https://github.com/canopylabs/orpheus-tts}
}