talkie_gentleman

jrubiosainz committed 17 days ago

Commit 0843b60 (verified) · 1 parent: 7aef229

Upload folder using huggingface_hub

Files changed (10)

README.md +57 -4
index.html +73 -18
pyproject.toml +14 -0
style.css +188 -16
talkie_gentleman/__init__.py +3 -0
talkie_gentleman/main.py +121 -0
talkie_gentleman/robot_behavior.py +127 -0
talkie_gentleman/stt_engine.py +30 -0
talkie_gentleman/talkie_inference.py +51 -0
talkie_gentleman/tts_engine.py +58 -0

README.md CHANGED

@@ -1,10 +1,63 @@
 ---
 title: Talkie Gentleman
-emoji: 👀
-colorFrom: gray
-colorTo: red
+emoji: 🎩
+colorFrom: purple
+colorTo: gray
 sdk: static
 pinned: false
+tags:
+  - reachy_mini
+  - reachy_mini_python_app
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+# 🎩 Talkie Gentleman
+A next-gen robot channeling the 1920s — powered by [Talkie-1930](https://huggingface.co/talkie-lm/talkie-1930-13b-it) by [@AlecRad](https://twitter.com/AlecRad).
+## What It Does
+Transforms your Reachy Mini into a refined Edwardian gentleman who speaks exclusively in pre-1931 English with a posh British accent.
+## Architecture
+| Stage | Technology |
+|-------|-----------|
+| **Listen** | OpenAI Whisper (local STT) |
+| **Think** | Talkie-1930-13b-it via HuggingFace Inference API |
+| **Speak** | ElevenLabs "Daniel" voice (British male) via `sag` CLI |
+| **Move** | Elegant Victorian gestures — slow nods, thoughtful tilts |
+## Setup
+```bash
+pip install -e .
+export HF_TOKEN="your_huggingface_token"
+```
+## Usage
+```bash
+# Text mode (testing)
+python -m talkie_gentleman.main "I say, what is the weather today?"
+# Full pipeline with audio
+# (integrates with Reachy Mini SDK for robot control)
+```
+## Robot Behavior
+The gentleman never rushes. All movements are:
+- **Slow** — deliberate, composed
+- **Subtle** — no frantic waving
+- **Dignified** — antenna raises gently when speaking
+- **Thoughtful** — slight head tilt when listening, downward gaze when thinking
+## Environment Variables
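+- `HF_TOKEN` — HuggingFace API token (required for Talkie-1930 inference)
+- `ELEVENLABS_API_KEY` — ElevenLabs API key read by `tts_engine.py` (without it, or without the `sag` CLI, speech falls back to a local Piper en_GB voice)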
+## Credits
+- **Talkie-1930** model by [@AlecRad](https://twitter.com/AlecRad)
+- **ElevenLabs** for the distinguished British voice
+- Built for [Reachy Mini](https://www.pollen-robotics.com/) by Pollen Robotics

index.html CHANGED

@@ -1,19 +1,74 @@
-<!doctype html>
-<html>
-	<head>
-		<meta charset="utf-8" />
-		<meta name="viewport" content="width=device-width" />
-		<title>My static Space</title>
-		<link rel="stylesheet" href="style.css" />
-	</head>
-	<body>
-		<div class="card">
-			<h1>Welcome to your static Space!</h1>
-			<p>You can modify this app directly by editing <i>index.html</i> in the Files and versions tab.</p>
-			<p>
-				Also don't forget to check the
-				<a href="https://huggingface.co/docs/hub/spaces" target="_blank">Spaces documentation</a>.
-			</p>
-		</div>
-	</body>
 </html>

+<!DOCTYPE html>
+<html lang="en">
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <title>Talkie Gentleman — Reachy Mini</title>
+    <link rel="stylesheet" href="style.css">
+    <link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700&family=Playfair+Display:ital,wght@0,700;1,400&display=swap" rel="stylesheet">
+</head>
+<body>
+    <div class="container">
+        <header>
+            <div class="badge">REACHY MINI APP</div>
+            <h1>🎩 Talkie Gentleman</h1>
+            <p class="tagline">A next-gen robot channeling the 1920s</p>
+        </header>
+        <section class="hero">
+            <p class="quote">"I say, dear fellow — permit me to offer my most considered opinion on the matter at hand."</p>
+        </section>
+        <section class="features">
+            <div class="card">
+                <div class="card-icon">🗣️</div>
+                <h3>Victorian Speech</h3>
+                <p>Powered by Talkie-1930 — responses use only pre-1931 vocabulary and cultural references.</p>
+            </div>
+            <div class="card">
+                <div class="card-icon">🇬🇧</div>
+                <h3>Posh British Voice</h3>
+                <p>ElevenLabs "Daniel" voice delivers every response with impeccable received pronunciation.</p>
+            </div>
+            <div class="card">
+                <div class="card-icon">🤖</div>
+                <h3>Elegant Gestures</h3>
+                <p>Slow, deliberate movements — thoughtful nods, gentle tilts, dignified antenna raises.</p>
+            </div>
+            <div class="card">
+                <div class="card-icon">📚</div>
+                <h3>Period Knowledge</h3>
+                <p>Erudite conversation spanning literature, science, politics, and society of the era.</p>
+            </div>
+        </section>
+        <section class="tech">
+            <h2>How It Works</h2>
+            <div class="pipeline">
+                <div class="step">
+                    <span class="step-num">1</span>
+                    <span>You speak</span>
+                    <span class="step-tech">Whisper STT</span>
+                </div>
+                <div class="arrow">→</div>
+                <div class="step">
+                    <span class="step-num">2</span>
+                    <span>Gentleman thinks</span>
+                    <span class="step-tech">Talkie-1930 via HF API</span>
+                </div>
+                <div class="arrow">→</div>
+                <div class="step">
+                    <span class="step-num">3</span>
+                    <span>Robot responds</span>
+                    <span class="step-tech">ElevenLabs TTS + Gestures</span>
+                </div>
+            </div>
+        </section>
+        <footer>
+            <p>Built with <a href="https://huggingface.co/talkie-lm/talkie-1930-13b-it">Talkie-1930</a> by <a href="https://twitter.com/AlecRad">@AlecRad</a></p>
+            <p class="subtle">For Reachy Mini by Pollen Robotics</p>
+        </footer>
+    </div>
+</body>
 </html>

pyproject.toml ADDED

@@ -0,0 +1,14 @@

+[project]
+name = "talkie-gentleman"
+version = "0.1.0"
+description = "A Victorian/Edwardian speaking robot app for Reachy Mini using Talkie-1930"
+requires-python = ">=3.10"
+dependencies = [
+    "huggingface_hub>=0.20.0",
+    "openai-whisper>=20231117",
+    "numpy",
+]
+[build-system]
+requires = ["setuptools>=68.0"]
+build-backend = "setuptools.build_meta"

style.css CHANGED

@@ -1,28 +1,200 @@
 body {
-	padding: 2rem;
-	font-family: -apple-system, BlinkMacSystemFont, "Arial", sans-serif;
 }
 h1 {
-	font-size: 16px;
-	margin-top: 0;
 }
-p {
-	color: rgb(107, 114, 128);
-	font-size: 15px;
-	margin-bottom: 10px;
-	margin-top: 5px;
 }
 .card {
-	max-width: 620px;
-	margin: 0 auto;
-	padding: 16px;
-	border: 1px solid lightgray;
-	border-radius: 16px;
 }
-.card p:last-child {
-	margin-bottom: 0;
 }

+:root {
+    --bg: #0f0e17;
+    --surface: #1a1825;
+    --border: #2d2b3a;
+    --text: #fffffe;
+    --text-muted: #a7a9be;
+    --accent: #c9a84c;
+    --accent-dark: #8b6914;
+    --burgundy: #6b2d3e;
+}
+* {
+    margin: 0;
+    padding: 0;
+    box-sizing: border-box;
+}
 body {
+    font-family: 'Inter', sans-serif;
+    background: var(--bg);
+    color: var(--text);
+    min-height: 100vh;
+    line-height: 1.6;
+}
+.container {
+    max-width: 900px;
+    margin: 0 auto;
+    padding: 4rem 2rem;
+}
+header {
+    text-align: center;
+    margin-bottom: 3rem;
+}
+.badge {
+    display: inline-block;
+    font-size: 0.7rem;
+    font-weight: 600;
+    letter-spacing: 0.15em;
+    color: var(--accent);
+    border: 1px solid var(--accent-dark);
+    padding: 0.3rem 0.8rem;
+    border-radius: 20px;
+    margin-bottom: 1rem;
 }
 h1 {
+    font-family: 'Playfair Display', serif;
+    font-size: 3rem;
+    margin-bottom: 0.5rem;
+    background: linear-gradient(135deg, var(--accent), #f0d78c);
+    -webkit-background-clip: text;
+    -webkit-text-fill-color: transparent;
+}
+.tagline {
+    font-size: 1.2rem;
+    color: var(--text-muted);
+    font-style: italic;
 }
+.hero {
+    text-align: center;
+    margin: 3rem 0;
+}
+.quote {
+    font-family: 'Playfair Display', serif;
+    font-style: italic;
+    font-size: 1.3rem;
+    color: var(--text-muted);
+    border-left: 3px solid var(--accent);
+    padding-left: 1.5rem;
+    max-width: 600px;
+    margin: 0 auto;
+    text-align: left;
+}
+.features {
+    display: grid;
+    grid-template-columns: repeat(auto-fit, minmax(200px, 1fr));
+    gap: 1.5rem;
+    margin: 3rem 0;
 }
 .card {
+    background: var(--surface);
+    border: 1px solid var(--border);
+    border-radius: 12px;
+    padding: 1.5rem;
+    transition: border-color 0.3s, transform 0.2s;
+}
+.card:hover {
+    border-color: var(--accent-dark);
+    transform: translateY(-2px);
+}
+.card-icon {
+    font-size: 2rem;
+    margin-bottom: 0.8rem;
+}
+.card h3 {
+    font-size: 1rem;
+    margin-bottom: 0.5rem;
+    color: var(--accent);
+}
+.card p {
+    font-size: 0.85rem;
+    color: var(--text-muted);
+}
+.tech {
+    margin: 4rem 0;
+    text-align: center;
+}
+.tech h2 {
+    font-family: 'Playfair Display', serif;
+    font-size: 1.8rem;
+    margin-bottom: 2rem;
+    color: var(--accent);
+}
+.pipeline {
+    display: flex;
+    align-items: center;
+    justify-content: center;
+    gap: 1rem;
+    flex-wrap: wrap;
+}
+.step {
+    background: var(--surface);
+    border: 1px solid var(--border);
+    border-radius: 10px;
+    padding: 1.2rem;
+    display: flex;
+    flex-direction: column;
+    align-items: center;
+    gap: 0.3rem;
+    min-width: 160px;
+}
+.step-num {
+    background: var(--accent);
+    color: var(--bg);
+    width: 24px;
+    height: 24px;
+    border-radius: 50%;
+    display: flex;
+    align-items: center;
+    justify-content: center;
+    font-size: 0.75rem;
+    font-weight: 700;
+}
+.step-tech {
+    font-size: 0.7rem;
+    color: var(--text-muted);
+}
+.arrow {
+    color: var(--accent);
+    font-size: 1.5rem;
+}
+footer {
+    text-align: center;
+    margin-top: 4rem;
+    padding-top: 2rem;
+    border-top: 1px solid var(--border);
+    color: var(--text-muted);
+    font-size: 0.85rem;
+}
+footer a {
+    color: var(--accent);
+    text-decoration: none;
+}
+footer a:hover {
+    text-decoration: underline;
+}
+.subtle {
+    font-size: 0.75rem;
+    margin-top: 0.5rem;
+    opacity: 0.6;
 }
+@media (max-width: 600px) {
+    h1 { font-size: 2rem; }
+    .pipeline { flex-direction: column; }
+    .arrow { transform: rotate(90deg); }
 }

talkie_gentleman/__init__.py ADDED

@@ -0,0 +1,3 @@

+"""Talkie Gentleman — A next-gen robot channeling the 1920s."""
+
+__version__ = "0.1.0"

talkie_gentleman/main.py ADDED

@@ -0,0 +1,121 @@

+"""Talkie Gentleman — Reachy Mini app entry point.
+A Victorian/Edwardian speaking robot using:
+- Whisper STT for listening
+- Talkie-1930 (HF Inference API) for period-accurate responses
+- ElevenLabs (sag) for posh British TTS
+- Elegant slow gestures befitting a gentleman
+"""
+import asyncio
+import logging
+import tempfile
+from pathlib import Path
+from .stt_engine import transcribe
+from .talkie_inference import generate_response
+from .tts_engine import speak_british
+from .robot_behavior import GentlemanBehavior
+logger = logging.getLogger(__name__)
+# Conversation history (kept short for context window)
+MAX_HISTORY = 10
+class TalkieGentleman:
+    """Main application class for the Victorian gentleman robot."""
+    def __init__(self):
+        self.behavior = GentlemanBehavior()
+        self.history: list[dict] = []
+    async def process_audio(self, audio_path: str) -> dict:
+        """Full pipeline: STT → LLM → TTS → Gestures."""
+        # 1. Listen — transcribe user speech
+        logger.info("Listening attentively...")
+        gestures_listen = self.behavior.on_user_speaking()
+        user_text = await asyncio.to_thread(transcribe, audio_path)
+        if not user_text:
+            return {"error": "I beg your pardon? I did not quite catch that."}
+        logger.info(f"Heard: {user_text}")
+        # 2. Think — generate Victorian response
+        logger.info("Contemplating a proper response...")
+        gestures_think = self.behavior.on_processing()
+        response = await asyncio.to_thread(generate_response, user_text, self.history)
+        # Update history
+        self.history.append({"role": "user", "content": user_text})
+        self.history.append({"role": "assistant", "content": response})
+        if len(self.history) > MAX_HISTORY * 2:
+            self.history = self.history[-MAX_HISTORY * 2:]
+        logger.info(f"Response: {response}")
+        # 3. Speak — synthesize British voice
+        logger.info("Delivering with proper elocution...")
+        gestures_speak = self.behavior.on_speaking(response)
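+        # mktemp is deprecated and race-prone; reserve a real file instead
+        with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
+            output_audio = tmp.name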
+        audio_path_out = await asyncio.to_thread(speak_british, response, output_audio)
+        # 4. Return to idle
+        gestures_idle = self.behavior.on_idle()
+        return {
+            "user_text": user_text,
+            "response_text": response,
+            "audio_path": audio_path_out,
+            "gestures": {
+                "listening": gestures_listen,
+                "thinking": gestures_think,
+                "speaking": gestures_speak,
+                "idle": gestures_idle,
+            },
+        }
+    async def process_text(self, user_text: str) -> dict:
+        """Text-only pipeline (skip STT)."""
+        gestures_think = self.behavior.on_processing()
+        response = await asyncio.to_thread(generate_response, user_text, self.history)
+        self.history.append({"role": "user", "content": user_text})
+        self.history.append({"role": "assistant", "content": response})
+        if len(self.history) > MAX_HISTORY * 2:
+            self.history = self.history[-MAX_HISTORY * 2:]
+        gestures_speak = self.behavior.on_speaking(response)
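+        # mktemp is deprecated and race-prone; reserve a real file instead
+        with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
+            output_audio = tmp.name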
+        audio_path_out = await asyncio.to_thread(speak_british, response, output_audio)
+        return {
+            "response_text": response,
+            "audio_path": audio_path_out,
+            "gestures": {
+                "thinking": gestures_think,
+                "speaking": gestures_speak,
+            },
+        }
+def main():
+    """CLI entry point for testing."""
+    import sys
+    app = TalkieGentleman()
+    if len(sys.argv) > 1:
+        text = " ".join(sys.argv[1:])
+        result = asyncio.run(app.process_text(text))
+        print(f"🎩 {result['response_text']}")
+        print(f"🔊 Audio: {result['audio_path']}")
+    else:
+        print("🎩 Talkie Gentleman — Victorian Conversationalist")
+        print("Usage: python -m talkie_gentleman.main 'Good evening, sir!'")
+if __name__ == "__main__":
+    main()
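
The history update is written out twice above, once in `process_audio` and once in `process_text`. A minimal sketch of pulling it into a single helper, assuming the same role/content message dictionaries; `append_turn` is a hypothetical name and not part of this commit:

```python
MAX_HISTORY = 10  # same cap as in main.py: number of user/assistant turns kept


def append_turn(history: list, user_text: str, response: str,
                max_turns: int = MAX_HISTORY) -> list:
    """Return a new history with one turn appended, trimmed to max_turns turns."""
    if max_turns <= 0:
        return []
    updated = history + [
        {"role": "user", "content": user_text},
        {"role": "assistant", "content": response},
    ]
    # Each turn is two messages, so keep the last 2 * max_turns entries.
    return updated[-max_turns * 2:]
```

Both pipeline methods could then call `self.history = append_turn(self.history, user_text, response)`, which keeps the trimming rule in one place.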

talkie_gentleman/robot_behavior.py ADDED

@@ -0,0 +1,127 @@

+"""Victorian gentleman gestures for Reachy Mini — elegant, slow, dignified."""
+import time
+import math
+from dataclasses import dataclass
+# Gesture speed multiplier (lower = slower/more elegant)
+ELEGANCE_FACTOR = 0.4
+@dataclass
+class Pose:
+    """Joint angles in degrees."""
+    head_pitch: float = 0.0  # nod axis
+    head_yaw: float = 0.0    # turn axis
+    head_roll: float = 0.0   # tilt axis
+    antenna: float = 0.0     # antenna position (0-1)
+# Pre-defined poses
+NEUTRAL = Pose(head_pitch=-5.0, head_yaw=0.0, head_roll=0.0, antenna=0.3)
+LISTENING = Pose(head_pitch=-3.0, head_yaw=5.0, head_roll=4.0, antenna=0.2)
+THINKING = Pose(head_pitch=-10.0, head_yaw=-3.0, head_roll=0.0, antenna=0.5)
+SPEAKING = Pose(head_pitch=0.0, head_yaw=0.0, head_roll=0.0, antenna=0.8)
+ACKNOWLEDGING = Pose(head_pitch=-8.0, head_yaw=8.0, head_roll=2.0, antenna=0.4)
+def interpolate_pose(current: Pose, target: Pose, t: float) -> Pose:
+    """Smooth cubic interpolation between poses."""
+    # Ease-in-out cubic
+    t = t * t * (3.0 - 2.0 * t)
+    return Pose(
+        head_pitch=current.head_pitch + (target.head_pitch - current.head_pitch) * t,
+        head_yaw=current.head_yaw + (target.head_yaw - current.head_yaw) * t,
+        head_roll=current.head_roll + (target.head_roll - current.head_roll) * t,
+        antenna=current.antenna + (target.antenna - current.antenna) * t,
+    )
+def gentle_nod(amplitude: float = 5.0, duration: float = 1.5) -> list[Pose]:
+    """Generate a thoughtful nod sequence."""
+    frames = []
+    steps = int(duration / 0.05)
+    for i in range(steps):
+        t = i / steps
+        pitch_offset = amplitude * math.sin(t * math.pi * 2) * (1 - t * 0.3)
+        frames.append(Pose(head_pitch=NEUTRAL.head_pitch + pitch_offset, antenna=0.6))
+    return frames
+def aristocratic_turn(direction: float = 1.0, duration: float = 2.0) -> list[Pose]:
+    """Slow, deliberate head turn — acknowledging someone's presence."""
+    frames = []
+    steps = int(duration / 0.05)
+    target_yaw = 15.0 * direction
+    for i in range(steps):
+        t = i / steps
+        t_smooth = t * t * (3.0 - 2.0 * t)  # ease-in-out
+        frames.append(Pose(
+            head_yaw=target_yaw * t_smooth,
+            head_pitch=NEUTRAL.head_pitch,
+            antenna=0.3 + 0.2 * t_smooth,
+        ))
+    return frames
+def thinking_sequence(duration: float = 3.0) -> list[Pose]:
+    """Contemplative pose — slight downward look, subtle sway."""
+    frames = []
+    steps = int(duration / 0.05)
+    for i in range(steps):
+        t = i / steps
+        sway = 2.0 * math.sin(t * math.pi * 0.8)
+        frames.append(Pose(
+            head_pitch=THINKING.head_pitch + sway * 0.3,
+            head_yaw=THINKING.head_yaw + sway,
+            head_roll=sway * 0.5,
+            antenna=0.4 + 0.1 * math.sin(t * math.pi * 2),
+        ))
+    return frames
+def speaking_animation(text_length: int) -> list[Pose]:
+    """Subtle movements while speaking — engaged but not distracting."""
+    duration = min(max(text_length * 0.05, 2.0), 8.0)
+    frames = []
+    steps = int(duration / 0.05)
+    for i in range(steps):
+        t = i / steps
+        # Very subtle head movements
+        pitch = SPEAKING.head_pitch + 1.5 * math.sin(t * math.pi * 3)
+        yaw = 2.0 * math.sin(t * math.pi * 1.5)
+        frames.append(Pose(
+            head_pitch=pitch,
+            head_yaw=yaw,
+            antenna=SPEAKING.antenna + 0.1 * math.sin(t * math.pi * 4),
+        ))
+    return frames
+# State machine for the gentleman's behavior
+class GentlemanBehavior:
+    """Manages the robot's state transitions with Victorian poise."""
+    def __init__(self):
+        self.current_pose = NEUTRAL
+        self.state = "idle"
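+    def _transition_to(self, target: Pose, steps: int) -> list[Pose]:
+        """Ease from the current pose to target; the last frame equals target."""
+        start = self.current_pose
+        frames = [interpolate_pose(start, target, i / steps) for i in range(1, steps + 1)]
+        self.current_pose = target  # remember where the robot now is
+        return frames
+    def on_user_speaking(self) -> list[Pose]:
+        """User is talking — listen attentively."""
+        self.state = "listening"
+        return self._transition_to(LISTENING, 20)
+    def on_processing(self) -> list[Pose]:
+        """Thinking of a response — contemplative pose."""
+        self.state = "thinking"
+        frames = thinking_sequence()
+        self.current_pose = frames[-1]  # next transition starts where this one ended
+        return frames
+    def on_speaking(self, response_text: str) -> list[Pose]:
+        """Delivering response — engaged posture with antenna raised."""
+        self.state = "speaking"
+        frames = speaking_animation(len(response_text))
+        self.current_pose = frames[-1]  # next transition starts where this one ended
+        return frames
+    def on_idle(self) -> list[Pose]:
+        """Return to neutral — composed waiting."""
+        self.state = "idle"
+        return self._transition_to(NEUTRAL, 30)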
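
The easing applied in `interpolate_pose` and `aristocratic_turn`, `t * t * (3.0 - 2.0 * t)`, is the smoothstep polynomial: it is 0 at t = 0, 1 at t = 1, and has zero slope at both ends, which is what makes the motion start and stop gently. A small sketch of the same easing on a single joint value, with frames sampled so that the last one lands exactly on the target. The names `ease` and `eased_steps` are hypothetical and only illustrate the idea:

```python
def ease(t: float) -> float:
    """Smoothstep easing: 0 at t=0, 1 at t=1, zero slope at both ends."""
    t = min(max(t, 0.0), 1.0)  # clamp so out-of-range inputs cannot overshoot
    return t * t * (3.0 - 2.0 * t)


def eased_steps(start: float, target: float, steps: int) -> list:
    """Interpolate one joint value from start to target in `steps` frames.

    Frame i uses t = i / steps for i = 1..steps, so the last frame equals target.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [start + (target - start) * ease(i / steps) for i in range(1, steps + 1)]
```

For example, `eased_steps(NEUTRAL.head_pitch, LISTENING.head_pitch, 20)` would give twenty pitch values that move slowly at first, faster in the middle, and settle on the listening pitch.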

talkie_gentleman/stt_engine.py ADDED

@@ -0,0 +1,30 @@

+"""Speech-to-Text engine using OpenAI Whisper."""
+import whisper
+import numpy as np
+_model = None
+def _get_model(model_size: str = "base"):
+    global _model
+    if _model is None:
+        _model = whisper.load_model(model_size)
+    return _model
+def transcribe(audio_path: str, language: str = "en") -> str:
+    """Transcribe audio file to text using Whisper."""
+    model = _get_model()
+    result = model.transcribe(audio_path, language=language)
+    return result["text"].strip()
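+def transcribe_array(audio_array: np.ndarray, sample_rate: int = 16000, language: str = "en") -> str:
+    """Transcribe a mono numpy audio array to text."""
+    # Whisper expects float32 mono at 16 kHz with samples in [-1.0, 1.0]
+    if sample_rate != 16000:
+        raise ValueError(f"Expected 16 kHz audio, got {sample_rate} Hz; resample before calling")
+    if np.issubdtype(audio_array.dtype, np.integer):
+        # Scale integer PCM (e.g. int16) into the [-1.0, 1.0] range
+        audio_array = audio_array.astype(np.float32) / np.iinfo(audio_array.dtype).max
+    elif audio_array.dtype != np.float32:
+        audio_array = audio_array.astype(np.float32)
+    model = _get_model()
+    result = model.transcribe(audio_array, language=language)
+    return result["text"].strip()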
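
As the comment in `transcribe_array` notes, Whisper works on mono audio at 16 kHz. A minimal sketch, using only the standard-library `wave` module, of checking a recording's format before its raw samples are handed to `transcribe_array`. The helpers `wav_format` and `is_whisper_ready` are hypothetical names, and the sketch only covers uncompressed WAV files:

```python
import wave


def wav_format(path: str) -> dict:
    """Read channel count, sample width (bytes), rate, and length from a WAV header."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width": wav.getsampwidth(),
            "sample_rate": wav.getframerate(),
            "frames": wav.getnframes(),
        }


def is_whisper_ready(fmt: dict, expected_rate: int = 16000) -> bool:
    """True if the audio is already mono at the expected sample rate."""
    return fmt["channels"] == 1 and fmt["sample_rate"] == expected_rate
```

A recording that fails the check would need to be downmixed or resampled first.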

talkie_gentleman/talkie_inference.py ADDED

@@ -0,0 +1,51 @@

+"""HuggingFace Inference API wrapper for Talkie-1930 model."""
+import os
+from huggingface_hub import InferenceClient
+# Primary: Talkie model via HF Inference API
+# Fallback: Llama 3.1 8B Instruct with Victorian system prompt
+PRIMARY_MODEL = "talkie-lm/talkie-1930-13b-it"
+FALLBACK_MODEL = "meta-llama/Llama-3.1-8B-Instruct"
+SYSTEM_PROMPT = (
+    "You are a refined Edwardian gentleman from 1920s England. "
+    "Respond with impeccable manners, using only vocabulary and cultural references from before 1931. "
+    "Be witty, erudite, and slightly pompous. Address others as \"dear fellow\", \"my good sir\" or \"my good madam\". "
+    "Keep responses concise (2-3 sentences) for natural conversation."
+)
+_client: InferenceClient | None = None
+def _get_client() -> InferenceClient:
+    global _client
+    if _client is None:
+        token = os.environ.get("HF_TOKEN")
+        _client = InferenceClient(token=token)
+    return _client
+def generate_response(user_text: str, history: list[dict] | None = None) -> str:
+    """Generate a Victorian-style response to user input via HF Inference API."""
+    client = _get_client()
+    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
+    if history:
+        messages.extend(history)
+    messages.append({"role": "user", "content": user_text})
+    # Try Talkie model first, fallback to Llama 3.1
+    for model_id in [PRIMARY_MODEL, FALLBACK_MODEL]:
+        try:
+            response = client.chat_completion(
+                messages=messages,
+                model=model_id,
+                max_tokens=150,
+                temperature=0.7,
+            )
+            return response.choices[0].message.content.strip()
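+        except Exception:
+            # Any failure (model not deployed, auth error, rate limit) moves on
+            # to the next model; the canned reply below is the last resort.
+            continue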
+    return "I beg your pardon, dear fellow — my faculties appear momentarily indisposed."
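
The try-each-model loop in `generate_response` can be sketched on its own, without any network call, by passing the request in as a function. This version also logs each failure so that a misconfigured token or an unavailable model is visible in the logs. `first_successful` and its parameters are hypothetical names, not part of `huggingface_hub`:

```python
import logging
from typing import Callable, Sequence

logger = logging.getLogger(__name__)


def first_successful(model_ids: Sequence[str],
                     call: Callable[[str], str],
                     default: str) -> str:
    """Try call(model_id) for each model in order and return the first result.

    Failures are logged, and `default` is returned only if every model fails.
    """
    for model_id in model_ids:
        try:
            return call(model_id)
        except Exception as exc:  # broad on purpose: any failure means "try the next one"
            logger.warning("Model %s failed: %s", model_id, exc)
    return default
```

In `generate_response`, `call` would wrap the `client.chat_completion(...)` request for a given `model_id`, and `default` would be the apologetic canned reply.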

talkie_gentleman/tts_engine.py ADDED

@@ -0,0 +1,58 @@

+"""British TTS engine — ElevenLabs via sag CLI, with Piper fallback."""
+import os
+import subprocess
+import shutil
+from pathlib import Path
+DEFAULT_OUTPUT = "/tmp/gentleman_response.mp3"
+ELEVENLABS_VOICE = "Daniel"  # Posh British male
+def _get_elevenlabs_key() -> str | None:
+    """Read ElevenLabs API key from env or file."""
+    key = os.environ.get("ELEVENLABS_API_KEY")
+    if key:
+        return key
+    key_file = Path.home() / ".openclaw" / ".elevenlabs-key"
+    if key_file.exists():
+        return key_file.read_text().strip()
+    return None
+def speak_british(text: str, output_path: str = DEFAULT_OUTPUT) -> str:
+    """Synthesize speech in a posh British voice. Returns path to audio file."""
+    sag = shutil.which("sag")
+    api_key = _get_elevenlabs_key()
+    if sag and api_key:
+        env = os.environ.copy()
+        env["ELEVENLABS_API_KEY"] = api_key
+        try:
+            subprocess.run(
+                [sag, "speak", "-v", ELEVENLABS_VOICE, text, "--no-play", "--output", output_path],
+                check=True,
+                capture_output=True,
+                env=env,
+            )
+            return output_path
+        except subprocess.CalledProcessError:
+            pass  # sag failed (e.g. network or quota error); fall through to Piper
+    # Fallback: Piper with en_GB voice
+    piper = shutil.which("piper") or str(Path.home() / "Library/Python/3.9/bin/piper")
+    en_gb_model = str(Path.home() / "clawd/tts-voices/en_GB-alan-medium.onnx")
+    if not Path(en_gb_model).exists():
+        # Try any en_GB model available
+        voices_dir = Path.home() / "clawd/tts-voices"
+        en_gb_models = list(voices_dir.glob("en_GB-*.onnx")) if voices_dir.exists() else []
+        if en_gb_models:
+            en_gb_model = str(en_gb_models[0])
+        else:
+            raise RuntimeError("No British TTS voice available (install sag or en_GB Piper model)")
+    wav_path = output_path if output_path.endswith(".wav") else output_path + ".wav"
+    proc = subprocess.run(
+        [piper, "--model", en_gb_model, "--output_file", wav_path],
+        input=text.encode(),
+        capture_output=True,
+    )
+    if proc.returncode != 0:
+        raise RuntimeError(f"Piper TTS failed: {proc.stderr.decode()}")
+    return wav_path
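
The Piper fallback above looks for voices in one fixed directory under the home folder. A minimal sketch of the same lookup with the directory passed in as a parameter, so it can be pointed at any install location. `find_en_gb_voice` is a hypothetical helper; the preferred file name is the one used in `speak_british`:

```python
from pathlib import Path
from typing import Optional


def find_en_gb_voice(voices_dir: Path,
                     preferred: str = "en_GB-alan-medium.onnx") -> Optional[Path]:
    """Return the preferred en_GB Piper model if present, else any en_GB model, else None."""
    candidate = voices_dir / preferred
    if candidate.is_file():
        return candidate
    if not voices_dir.is_dir():
        return None
    # sorted() makes the choice deterministic when several voices are installed
    matches = sorted(voices_dir.glob("en_GB-*.onnx"))
    return matches[0] if matches else None
```

The caller decides where `voices_dir` comes from, for instance an environment variable of its own choosing with the current home-directory path as the default, and raises the "No British TTS voice available" error when the result is `None`.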