# Piper Sarah Atlas (`en_US-sarah-atlas`)

A custom Piper TTS voice model fine-tuned to sound like ElevenLabs Sarah, the Atlas assistant's phone voice.
## Purpose
Atlas uses ElevenLabs Sarah (premium, cloud) for phone calls and this model for on-device TTS on desktop, iOS, and Android. Both should sound like the same voice.
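The routing described above can be sketched as a small selector. All names here are illustrative, not actual Atlas APIs:

```python
# Hypothetical sketch of Atlas's voice routing: phone calls get the premium
# cloud voice, every other channel gets the local Piper model, so both
# paths speak with the same voice.
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceBackend:
    name: str
    model: str


ELEVENLABS_SARAH = VoiceBackend("elevenlabs", "eleven_turbo_v2_5")
PIPER_SARAH = VoiceBackend("piper", "en_US-sarah-atlas.onnx")


def pick_backend(channel: str) -> VoiceBackend:
    """Phone calls use the cloud voice; on-device channels use Piper."""
    return ELEVENLABS_SARAH if channel == "phone" else PIPER_SARAH
```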
## Model Details
| Property | Value |
|---|---|
| Base checkpoint | `en_US-lessac-medium` (epoch 2164) |
| Fine-tuned on | 1,500 utterances (~1 hour) of ElevenLabs Sarah audio |
| Training audio generated with | `eleven_turbo_v2_5` (stability=0.6, similarity_boost=0.8) |
| Architecture | VITS (Piper medium) |
| Output sample rate | 22,050 Hz |
| ONNX size | ~20MB |
| Training GPU | A100 40GB (GCP spot, ~$12 total) |
| Training epochs | 1,500 |
## Usage

### CLI
```sh
echo "Hi, this is Atlas. How can I help you today?" | \
  piper -m en_US-sarah-atlas.onnx --output_file output.wav
```
### Streaming (raw PCM)
```sh
echo "You have 3 urgent emails." | \
  piper -m en_US-sarah-atlas.onnx --output_raw | aplay -r 22050 -f S16_LE -c 1
```
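The same streaming pattern works from Python. This is a sketch, not Atlas code: `stream_chunks` is a hypothetical helper, and the audio sink is left as a comment because playback APIs vary by platform:

```python
# Stream raw PCM from a Piper subprocess in fixed-size chunks so playback
# can begin before synthesis finishes.
import shutil
import subprocess
from typing import IO, Iterator


def stream_chunks(stream: IO[bytes], chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield fixed-size chunks from a binary stream until EOF."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


if __name__ == "__main__" and shutil.which("piper"):
    proc = subprocess.Popen(
        ["piper", "-m", "en_US-sarah-atlas.onnx", "--output_raw"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    proc.stdin.write(b"You have 3 urgent emails.")
    proc.stdin.close()
    for chunk in stream_chunks(proc.stdout):
        pass  # feed each chunk to an audio sink (e.g. a sounddevice stream)
    proc.wait()
```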
### Python
```python
import subprocess

result = subprocess.run(
    ["piper", "-m", "en_US-sarah-atlas.onnx", "--output_raw"],
    input=b"Let me check your calendar.",
    capture_output=True,
)
pcm_audio = result.stdout  # 22,050 Hz mono 16-bit PCM
```
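If you need a playable file rather than raw PCM, the stdlib `wave` module can wrap the bytes in a WAV container. `pcm_to_wav` is a hypothetical helper, not part of Piper:

```python
# Wrap raw 22,050 Hz mono 16-bit PCM (what --output_raw emits) in a WAV
# container using only the Python standard library.
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 22050) -> bytes:
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)          # mono
        wav.setsampwidth(2)          # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()
```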
## Training Data
- ~279 sentences extracted from the Atlas codebase (`pattern_mapping.py`, `acknowledgments.py`)
- ~1,221 sentences generated via the ElevenLabs Sarah API (`eleven_turbo_v2_5`)
- ASR-validated with Whisper (rejected samples with <90% transcript match)
- Corpus sources: Atlas tool commands, acknowledgments, LJSpeech phoneme coverage, numbers/dates/names, conversational filler, news/Wikipedia extracts
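The ASR validation step above can be sketched as follows, under the assumption that "transcript match" means character-level similarity between the prompt and the Whisper transcript; `difflib.SequenceMatcher` stands in for whatever metric the actual pipeline used:

```python
# Sketch of the Whisper-based validation filter: score each sample's
# transcript against its prompt and reject anything below 90% similarity.
import difflib


def transcript_match(prompt: str, transcript: str) -> float:
    """Return a 0..1 similarity score, ignoring case and surrounding space."""
    a, b = prompt.strip().lower(), transcript.strip().lower()
    return difflib.SequenceMatcher(None, a, b).ratio()


def accept(prompt: str, transcript: str, threshold: float = 0.90) -> bool:
    return transcript_match(prompt, transcript) >= threshold
```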
## Feature

- Feature 099 – Piper Sarah Voice Training
- Branch: `099-piper-sarah-voice`
- Training framework doc: `specs/099-piper-sarah-voice/training-framework.md`