Piper Sarah Atlas — en_US-sarah-atlas

A custom Piper TTS voice model fine-tuned to sound like ElevenLabs Sarah — the Atlas assistant's phone voice.

Purpose

Atlas uses ElevenLabs Sarah (premium, cloud) for phone calls and this model for on-device TTS on desktop, iOS, and Android. Both should sound like the same voice.

Model Details

Property               Value
Base checkpoint        en_US-lessac-medium (epoch 2164)
Fine-tuning data       1,500 utterances (~1 hour) of ElevenLabs Sarah audio
Source TTS model       eleven_turbo_v2_5 (stability=0.6, similarity_boost=0.8)
Architecture           VITS (Piper medium)
Output sample rate     22,050 Hz
ONNX size              ~20 MB
Training GPU           A100 40GB (GCP spot, ~$12 total)
Training epochs        1,500
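As a sketch of how the training audio was requested with the voice settings above: the payload shape follows the public ElevenLabs text-to-speech REST endpoint, and the voice ID placeholder and `build_tts_payload` helper are assumptions, not part of this repository.

```python
import json

# Placeholder: the real Sarah voice ID is not published in this card.
VOICE_ID = "SARAH_VOICE_ID"

def build_tts_payload(text: str) -> dict:
    """Request body for POST /v1/text-to-speech/{voice_id}, using the
    same model and voice settings as the training corpus."""
    return {
        "text": text,
        "model_id": "eleven_turbo_v2_5",
        "voice_settings": {
            "stability": 0.6,
            "similarity_boost": 0.8,
        },
    }

body = json.dumps(build_tts_payload("Hi, this is Atlas."))
```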

Usage

CLI

echo "Hi, this is Atlas. How can I help you today?" | \
    piper -m en_US-sarah-atlas.onnx --output_file output.wav

Streaming (raw PCM)

echo "You have 3 urgent emails." | \
    piper -m en_US-sarah-atlas.onnx --output_raw | aplay -r 22050 -f S16_LE -c 1

Python

import subprocess

# Synthesize speech; --output_raw writes headerless PCM to stdout.
result = subprocess.run(
    ["piper", "-m", "en_US-sarah-atlas.onnx", "--output_raw"],
    input=b"Let me check your calendar.",
    capture_output=True,
    check=True,  # raise if piper exits non-zero
)
pcm_audio = result.stdout  # 22,050 Hz mono 16-bit signed PCM
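The raw output has no header, so it can't be opened as-is by most players. A minimal sketch of wrapping it in a WAV container with Python's stdlib wave module (the pcm_to_wav helper name is ours, not part of Piper):

```python
import wave

def pcm_to_wav(pcm_audio: bytes, path: str,
               rate: int = 22050, channels: int = 1) -> None:
    """Wrap headerless 16-bit signed PCM in a WAV container."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)   # 16-bit samples = 2 bytes each
        wav.setframerate(rate)
        wav.writeframes(pcm_audio)

# One second of silence as a stand-in for Piper output.
pcm_to_wav(b"\x00\x00" * 22050, "silence.wav")
```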

Training Data

  • ~279 sentences extracted from Atlas codebase (pattern_mapping.py, acknowledgments.py)
  • ~1,221 sentences generated via ElevenLabs Sarah API (eleven_turbo_v2_5)
  • ASR-validated with Whisper (rejected samples with <90% transcript match)
  • Corpus sources: Atlas tool commands, acknowledgments, LJSpeech phoneme coverage, numbers/dates/names, conversational filler, news/Wikipedia extracts

Feature

Feature 099 — Piper Sarah Voice Training
Branch: 099-piper-sarah-voice
Training framework doc: specs/099-piper-sarah-voice/training-framework.md
