---
title: VoiceVerse AI
emoji: 🎙️
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: "5.23.1"
python_version: "3.10"
app_file: app.py
pinned: false
---

# 🎙️ VoiceVerse AI: Document to Audio

Transform uploaded documents into engaging, emotionally expressive podcast-style audio narrations.

## Pipeline

```
PDF/TXT → Text Extraction → RAG (chunk + embed + retrieve) → Script Generation (Mistral-7B) → TTS (Qwen3-TTS / Edge-TTS) → Audio Playback
```

## Models Used

| Component | Model | How |
|-----------|-------|-----|
| Embeddings | `all-MiniLM-L6-v2` | Local (CPU) |
| Script Gen | `Mistral-7B-Instruct-v0.3` | HF Inference API |
| TTS (primary) | `Qwen3-TTS` | HF Inference API |
| TTS (fallback) | `Edge-TTS (AriaNeural)` | Local (CPU) |

## Setup

```bash
pip install -r requirements.txt
export HF_TOKEN="your_huggingface_token_here"
python app.py
```

## Deployment on HF Spaces

1. Create a new Space (Gradio SDK)
2. Upload all project files
3. Set `HF_TOKEN` as a Space Secret
4. The app will auto-launch on port 7860

## Project Structure

```
app.py            # Gradio UI entry point
rag.py            # Document ingestion, chunking, embedding, retrieval
script_gen.py     # LLM script generation (Mistral-7B-Instruct)
tts.py            # Text-to-speech (Qwen3-TTS + Edge-TTS fallback)
utils.py          # Helpers (temp files, validation, error formatting)
requirements.txt  # Python dependencies
packages.txt      # System packages (ffmpeg)
```
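
The TTS stage is described as a primary engine (Qwen3-TTS) with a local fallback (Edge-TTS). The snippet below is a minimal sketch of that primary-then-fallback pattern, not the actual contents of `tts.py`. The names `synthesize_with_fallback`, `fake_primary`, and `fake_fallback` are hypothetical, and the two stand-in backends only simulate a failing remote call and a working local one; real backends would call the respective TTS services.

```python
from typing import Callable, Sequence, Tuple

# Assumption: a TTS backend is any callable that takes text and returns audio bytes.
TTSBackend = Callable[[str], bytes]


def synthesize_with_fallback(
    text: str, backends: Sequence[Tuple[str, TTSBackend]]
) -> Tuple[str, bytes]:
    """Try each (name, backend) in order; return the first non-empty result."""
    errors = []
    for name, backend in backends:
        try:
            audio = backend(text)
            if audio:
                return name, audio
            errors.append(f"{name}: returned empty audio")
        except Exception as exc:  # any backend failure triggers the next one
            errors.append(f"{name}: {exc}")
    raise RuntimeError("All TTS backends failed: " + "; ".join(errors))


# Hypothetical stand-ins for illustration only.
def fake_primary(text: str) -> bytes:
    raise ConnectionError("inference endpoint unavailable")


def fake_fallback(text: str) -> bytes:
    return text.encode("utf-8")


used, audio = synthesize_with_fallback(
    "Hello, listeners.",
    [("qwen3-tts", fake_primary), ("edge-tts", fake_fallback)],
)
print(used)  # edge-tts
```

Passing the backends as an ordered list keeps the fallback order explicit and makes it easy to report which engine actually produced the audio.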