---
title: VoiceVerse AI
emoji: 🎙️
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: "5.23.1"
python_version: "3.10"
app_file: app.py
pinned: false
---

# 🎙️ VoiceVerse AI: Document to Audio

Transform uploaded documents into engaging, emotionally expressive podcast-style audio narrations.

## Pipeline

```
PDF/TXT → Text Extraction → RAG (chunk + embed + retrieve) → Script Generation (Mistral-7B) → TTS (Qwen3-TTS / Edge-TTS) → Audio Playback
```
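
The RAG stage (chunk + embed + retrieve) can be illustrated with a minimal, dependency-free sketch. This is not the code in `rag.py`: the function names, the chunk sizes, and the bag-of-words "embedding" are all assumptions made for illustration, with the word-count vector standing in for the `all-MiniLM-L6-v2` sentence embeddings.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping windows of words (hypothetical helper)."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    # Stop once the remaining words are already covered by the previous window.
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, last_start, step)]


def embed(text):
    """Toy embedding: a word-count vector. A real setup would use a sentence encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, top_k=2):
    """Return the top_k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]
```

The retrieved chunks would then be passed to the script-generation step as context. Overlapping windows are one common way to avoid cutting a sentence's context off at a chunk boundary.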

## Models Used

| Component | Model | How |
|-----------|-------|-----|
| Embeddings | `all-MiniLM-L6-v2` | Local (CPU) |
| Script Gen | `Mistral-7B-Instruct-v0.3` | HF Inference API |
| TTS (primary) | `Qwen3-TTS` | HF Inference API |
| TTS (fallback) | `Edge-TTS (AriaNeural)` | Local (CPU) |
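
The primary/fallback split for TTS in the table above can be sketched as a small wrapper. This is an assumption about the control flow, not the contents of `tts.py`: `synthesize_with_fallback` and its two callable parameters are hypothetical names, and the real backends (Qwen3-TTS and Edge-TTS) are represented here by plain functions that take text and return audio bytes.

```python
import logging

logger = logging.getLogger("voiceverse.tts")


def synthesize_with_fallback(text, primary, fallback):
    """Try the primary TTS backend; if it raises, use the fallback.

    `primary` and `fallback` are callables mapping text to audio bytes.
    Returns a tuple of (audio_bytes, name_of_backend_used).
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    try:
        return primary(text), "primary"
    except Exception as exc:  # any backend failure triggers the fallback
        logger.warning("Primary TTS failed (%s); using fallback", exc)
        return fallback(text), "fallback"
```

Returning the backend name alongside the audio makes it possible to tell the user which voice engine produced the result.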

## Setup

```bash
pip install -r requirements.txt
export HF_TOKEN="your_huggingface_token_here"
python app.py
```
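
Because script generation and the primary TTS both go through the HF Inference API, a missing `HF_TOKEN` is worth catching at startup rather than on the first request. A minimal sketch, assuming a hypothetical helper named `require_hf_token` (it is not part of the project files listed here):

```python
import os

PLACEHOLDER = "your_huggingface_token_here"


def require_hf_token(env=None):
    """Return the HF token from the environment, or raise with a clear message."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token or token == PLACEHOLDER:
        raise RuntimeError(
            "HF_TOKEN is not set. Export it locally or add it as a Space Secret."
        )
    return token
```

Calling `require_hf_token()` near the top of `app.py` would fail fast with a readable error instead of an opaque API failure later.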

## Deployment on HF Spaces

1. Create a new Space (Gradio SDK)
2. Upload all project files
3. Set `HF_TOKEN` as a Space Secret
4. The app will auto-launch on port 7860
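
On Spaces the port is handled for you. When running the same app in another container, the host and port may need to be set explicitly. The sketch below is an assumption about how that could be done: `launch_args` is a hypothetical helper, and `demo` in the trailing comment stands for whatever Gradio app object `app.py` builds. It reads Gradio's `GRADIO_SERVER_NAME` and `GRADIO_SERVER_PORT` environment variables and defaults to port 7860.

```python
import os


def launch_args(env=None):
    """Build keyword arguments for Gradio's launch() from the environment."""
    env = os.environ if env is None else env
    return {
        # 0.0.0.0 makes the server reachable from outside the container.
        "server_name": env.get("GRADIO_SERVER_NAME", "0.0.0.0"),
        "server_port": int(env.get("GRADIO_SERVER_PORT", "7860")),
    }


# In app.py, assuming a Gradio app object named `demo`:
# demo.launch(**launch_args())
```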

## Project Structure

```
app.py            # Gradio UI entry point
rag.py            # Document ingestion, chunking, embedding, retrieval
script_gen.py     # LLM script generation (Mistral-7B-Instruct)
tts.py            # Text-to-speech (Qwen3-TTS + Edge-TTS fallback)
utils.py          # Helpers (temp files, validation, error formatting)
requirements.txt  # Python dependencies
packages.txt      # System packages (ffmpeg)
```