Orpheus 3B: GRPO LoRA for Conversational TTS
LoRA adapter trained with Group Relative Policy Optimization (GRPO) on Orpheus 3B for conversational speech synthesis.
What is Orpheus?
Orpheus is a 3B parameter LLM-based TTS model from Canopy Labs. It generates SNAC audio tokens autoregressively, producing natural speech with emotion and prosody inferred from text context.
Training
- Base model: canopylabs/orpheus-3b-0.1-ft (3B params, Llama 3 architecture)
- Method: GRPO (Group Relative Policy Optimization), reinforcement learning with UTMOS as the reward signal
- Adapter: LoRA (r=16, alpha=32, all linear layers)
- Reward: UTMOS naturalness score (target: maximize perceived quality)
- Dataset: Expresso conversational speech corpus
- Hardware: NVIDIA A10G 24GB
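As a rough sanity check on adapter size, LoRA with rank r adds r·(d_in + d_out) trainable parameters per adapted linear layer (the A matrix is r×d_in, B is d_out×r). A minimal sketch; the hidden size below is an approximate Llama-3.2-3B-style figure used for illustration, not a measurement of this adapter:

```python
def lora_params(d_in: int, d_out: int, r: int = 16) -> int:
    """Trainable parameters LoRA adds to one linear layer: A (r x d_in) + B (d_out x r)."""
    return r * d_in + d_out * r

# Illustrative square attention projection with hidden size 3072
hidden = 3072
print(lora_params(hidden, hidden, r=16))  # 98304 extra trainable params for this one layer
```

Summing this over all adapted linear layers gives the adapter's total trainable parameter count, a tiny fraction of the frozen 3B base.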
Why GRPO?
Standard supervised fine-tuning (SFT) of TTS models risks overfitting to surface patterns in the training data. GRPO instead uses a learned quality predictor (UTMOS) as the reward and optimizes directly for perceived audio quality, which aligns better with human preferences for naturalness.
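The core of GRPO is group-relative advantage estimation: for each prompt, sample a group of candidate generations, score each one (here with UTMOS), and normalize the rewards within the group so no separate value network is needed. A minimal sketch of that normalization step; the UTMOS scores are made up for illustration, and a real implementation may use the sample rather than population standard deviation:

```python
import statistics

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Normalize per-sample rewards against the group mean and std (GRPO-style)."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against a zero-variance group
    return [(r - mean) / std for r in rewards]

# Hypothetical UTMOS scores for 4 sampled generations of one prompt
utmos = [3.2, 4.1, 3.8, 2.9]
advantages = group_relative_advantages(utmos)
print([round(a, 2) for a in advantages])  # above-mean samples get positive advantage
```

Generations scoring above the group mean receive positive advantage and are reinforced; those below are suppressed, pushing the policy toward higher perceived naturalness.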
Usage
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base Orpheus model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained("canopylabs/orpheus-3b-0.1-ft")
tokenizer = AutoTokenizer.from_pretrained("canopylabs/orpheus-3b-0.1-ft")

# Apply the GRPO-trained LoRA adapter
model = PeftModel.from_pretrained(base_model, "Tachyeon/orpheus-3b-conversational-grpo")
model = model.merge_and_unload()  # Optional: merge weights for faster inference
```
For production inference via llama.cpp GGUF, see Project Maya.
Part of Project Maya
This adapter was trained as part of Project Maya, a real-time conversational voice AI system achieving <2s end-to-end latency with:
- Orpheus 3B TTS via llama.cpp (129 tok/s, RTF 0.64)
- Llama 3.2 3B LLM (155 tok/s)
- faster-whisper STT with hallucination mitigation
- Multi-GPU streaming pipeline (4x A10G)
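The RTF (real-time factor) figure above is synthesis wall-clock time divided by the duration of the audio produced; values below 1.0 mean the system generates speech faster than it plays back. A minimal helper; the timing numbers below are illustrative, not measurements from this pipeline:

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF < 1.0 means audio is generated faster than real time."""
    return synthesis_seconds / audio_seconds

# Illustrative: 6.4 s of wall-clock generation for 10 s of speech
print(real_time_factor(6.4, 10.0))  # 0.64, i.e. faster than real time
```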
Repos:
Research Context
This GRPO approach was informed by GLM-4-Voice, which demonstrated that RL-based optimization (DPO/GRPO) can improve TTS quality metrics beyond what supervised fine-tuning alone achieves.
Citation
```bibtex
@misc{orpheus2025canopy,
  title={Orpheus TTS},
  author={Canopy Labs},
  year={2025},
  url={https://github.com/canopylabs/orpheus-tts}
}
```