metadata
title: VideoVoice Dramabox
emoji: 🎭
colorFrom: red
colorTo: indigo
sdk: gradio
sdk_version: 5.7.1
app_file: app.py
python_version: '3.10'
pinned: true
short_description: Resemble Dramabox - directable speech for VideoVoice

VideoVoice - Dramabox

Resemble AI's directable speech engine, mounted as a VideoVoice tool tab.

Endpoint: POST /api/tools/dramabox
Frontend: /app/dramabox

What's different from the other Spaces

This Space is a tools-only Space:

  • The /api/tools/dramabox endpoint runs Resemble Dramabox against a scene prompt (quoted dialogue + stage directions outside quotes).
  • The other pipeline endpoints (dub, voice-clone, subtitles, audio-cleanup) remain reachable as a safeguard, but the frontend never routes that traffic to this Space.

Prompt grammar

<speaker description>, "<dialogue>" <action> "<more dialogue>"
  • Text inside quotes is spoken: "Hello, how are you?". This includes phonetic sounds such as "Hahaha" or "Mmmmm".
  • Text outside quotes is a stage direction, for example: She sighs deeply. He clears his throat.
  • Avoid writing onomatopoeia (Sigh, Ahem, Gasp) inside quotes, because the model will speak those words literally. Write them as stage directions outside the quotes instead.
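
The grammar above can be checked before a prompt is sent. The following is a minimal sketch, not part of the Space itself: `split_prompt` and `literal_sound_words` are hypothetical helper names, and the sketch assumes straight double quotes with no escaped quotes inside dialogue. Everything outside quotes, including the leading speaker description, is labelled as a direction.

```python
import re
from typing import List, Tuple

# One double-quoted span (assumption: straight quotes, no escaped quotes).
_QUOTED = re.compile(r'"([^"]*)"')

# Words the README warns against quoting; extend as needed.
SOUND_WORDS = {"sigh", "ahem", "gasp"}


def split_prompt(prompt: str) -> List[Tuple[str, str]]:
    """Split a scene prompt into ("spoken", text) and ("direction", text) parts."""
    segments: List[Tuple[str, str]] = []
    pos = 0
    for match in _QUOTED.finditer(prompt):
        # Text between the previous quote and this one is unspoken.
        outside = prompt[pos:match.start()].strip(" ,")
        if outside:
            segments.append(("direction", outside))
        segments.append(("spoken", match.group(1)))
        pos = match.end()
    tail = prompt[pos:].strip(" ,")
    if tail:
        segments.append(("direction", tail))
    return segments


def literal_sound_words(prompt: str) -> List[str]:
    """Return quoted sound words that would be read aloud literally."""
    hits: List[str] = []
    for kind, text in split_prompt(prompt):
        if kind == "spoken":
            words = re.findall(r"[A-Za-z]+", text)
            hits.extend(w for w in words if w.lower() in SOUND_WORDS)
    return hits
```

A prompt such as `He frowns. "Ahem, excuse me."` would be flagged for `Ahem`, while `He clears his throat. "Excuse me."` passes cleanly.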

Setup notes

Space Secrets (required unless marked optional):

  • TTS_ENGINE=dramabox
  • HF_TOKEN (same as the other VideoVoice Spaces; used for model downloads)
  • LTX_DTYPE=bf16 (optional, matches upstream default)
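
As an illustration of how these values might be validated at startup, here is a minimal sketch. The `load_settings` function is hypothetical and is not taken from the Space's code; the checks that `TTS_ENGINE` must equal `dramabox` and that `HF_TOKEN` must be non-empty are assumptions based on the list above. The `bf16` fallback mirrors the stated upstream default.

```python
import os
from typing import Dict, Mapping


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read the Space Secrets from an environment-like mapping."""
    engine = env.get("TTS_ENGINE")
    if engine != "dramabox":
        # Assumption: this Space only makes sense with the dramabox engine.
        raise RuntimeError("TTS_ENGINE must be set to 'dramabox'")

    token = env.get("HF_TOKEN")
    if not token:
        raise RuntimeError("HF_TOKEN is required for model downloads")

    return {
        "tts_engine": engine,
        "hf_token": token,
        # Optional secret; falls back to the upstream default.
        "ltx_dtype": env.get("LTX_DTYPE", "bf16"),
    }
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise without touching real secrets.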

Required vendored source (committed to the backend (BE) repo, deployed via deploy.sh):

  • dramabox_src/: a copy of ResembleAI/Dramabox src/. The tools_api/dramabox.py worker adds this directory to sys.path lazily on first request.
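
The lazy sys.path step can be pictured roughly as follows. This is a sketch under assumptions, not the actual contents of tools_api/dramabox.py: the function name `ensure_dramabox_on_path` and the module-level flag are hypothetical, and the caller is assumed to pass the location of dramabox_src/.

```python
import sys
from pathlib import Path

# Set once the vendored directory has been registered.
_path_ready = False


def ensure_dramabox_on_path(src_dir: Path) -> bool:
    """Register src_dir on sys.path once.

    Returns True only on the call that performed the setup, so that
    later requests skip the work entirely.
    """
    global _path_ready
    if _path_ready:
        return False
    src = str(src_dir.resolve())
    if src not in sys.path:
        # Insert at the front so the vendored copy wins over other installs.
        sys.path.insert(0, src)
    _path_ready = True
    return True
```

A request handler would call this at the top of its first invocation, before importing anything from the vendored package. Deferring the step keeps Space startup from depending on the vendored directory until the tool is actually used.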

Acknowledgements

Built on Resemble AI's Dramabox. All generated audio is invisibly watermarked with Resemble PerTh.