Spaces:
Running
Running
File size: 6,356 Bytes
5f3e9f5 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 | ---
title: YT AI Automation
emoji: π₯
colorFrom: blue
colorTo: red
sdk: docker
app_port: 7860
pinned: false
---
# TextBro β Text β Video Studio
Turn text, raw HTML, images, or PDFs into video-ready screenshots using AI.
- **Backend**: Flask + Playwright (Python) β originally
[Screenshot Studio](https://github.com/shiv12345678901/yt-project).
- **Frontend**: React + Vite + TypeScript + Tailwind CSS.
- **Features**: live SSE progress, cancel, screenshot gallery, ZIP download,
history, cache inspection. On Windows, the backend can also stitch
screenshots into a PowerPoint-driven video.
```
Devin_project/
βββ backend/ # Flask app, routes, Playwright screenshot engine
β βββ app.py
β βββ start.py
β βββ requirements.txt
β βββ config/
β βββ routes/
β βββ src/
βββ frontend/ # React SPA
βββ src/
βββ package.json
βββ vite.config.ts
```
## Requirements
- **Python** 3.10+ (3.11 recommended)
- **Node.js** 20.19+ or 22.13+
- **Playwright's Chromium** (installed via `playwright install chromium`)
- An API key for an OpenAI-compatible LLM endpoint (Groq, Together, OpenAI,
a local `llama.cpp` server, etc.) β the backend uses chat completions.
- **Optional (Windows only)** Microsoft PowerPoint, for the
screenshot β video pipeline.
## First-time setup
```bash
# 1) Clone
git clone https://github.com/shiv12345678901/Devin_project.git
cd Devin_project
```
### Backend
```bash
cd backend
# (Optional but recommended) create a virtualenv
python -m venv .venv
# Windows: .venv\Scripts\activate
# macOS/Linux: source .venv/bin/activate
pip install -r requirements.txt
playwright install chromium
# Fill in your API credentials
cp config/config.example.py config/config.py
# Edit config/config.py:
# API_KEY = "sk-..." # your LLM API key
# API_URL = "https://api.groq.com/openai/v1" # or wherever
# MODEL = "llama-3.1-70b-versatile"
```
### Frontend
```bash
cd ../frontend
npm install
```
## Running it
You have two options.
### Option A β dev mode (two terminals, hot reload everywhere)
```bash
# Terminal 1
cd backend && python start.py # http://localhost:5000
# Terminal 2
cd frontend && npm run dev # http://localhost:5173
```
Open http://localhost:5173 β the Vite dev server proxies every API path to
the Flask backend so CORS isn't an issue. Changes to React are hot-reloaded.
### Option B β single server (Flask serves the built React app)
```bash
cd frontend && npm run build # produces frontend/dist/
cd ../backend && python start.py # http://localhost:5000
```
Now Flask serves the UI and the API from one port, so this is also the
setup you'd use when pointing a tunnel (ngrok, Cloudflare Tunnel) at it.
## What's wired to what
| Frontend page | Backend endpoint | Notes |
| ----------------- | ----------------------------------- | ----------------------------------- |
| Text β Video | `POST /generate-sse` | SSE progress, cancel via `/cancel/<op>` |
| HTML β Video | `POST /generate-html`, `/beautify`, `/minify` | Synchronous |
| Image/PDF β Video | `POST /image-to-screenshots-sse` | SSE progress, OCR + AI + screenshots |
| Resources | `GET /list`, `/history`, `/cache/stats`, `DELETE /delete/<type>/<name>`, `POST /cache/clear` | β |
| Gallery | `GET /screenshots/<path>` | Served by Flask |
| ZIP download | `POST /download-zip` | Streams a ZIP of selected files |
The full API client is in
[`frontend/src/api/client.ts`](frontend/src/api/client.ts) and the SSE
state machine in
[`frontend/src/hooks/useGenerate.ts`](frontend/src/hooks/useGenerate.ts).
## Configuration reference
Key values in `backend/config/config.py` (see
`backend/config/config.example.py` for the full list):
| Setting | What it controls |
| ------------------------------ | -------------------------------------------- |
| `API_KEY`, `API_URL`, `MODEL` | Which LLM the backend talks to (chat completions) |
| `PORT`, `HOST` | Flask listen address |
| `DEFAULT_VIEWPORT_WIDTH/HEIGHT`| Screenshot viewport |
| `DEFAULT_ZOOM`, `DEFAULT_OVERLAP` | Capture scaling and slide overlap |
| `MAX_SCREENSHOTS_LIMIT` | Hard cap on screenshots per run |
| `POWERPOINT_*` | Windows-only PowerPoint/video export |
| `VIDEO_*` | Resolution / FPS / quality for PPT β video |
## Scripts
**Frontend** (inside `frontend/`)
| Command | Description |
| ----------------- | ---------------------------------------- |
| `npm run dev` | Start Vite dev server with API proxy |
| `npm run build` | TypeScript + production build to `dist/` |
| `npm run preview` | Preview the production build locally |
| `npm run lint` | Run ESLint |
**Backend** (inside `backend/`)
| Command | Description |
| --------------------- | ---------------------------------------------------- |
| `python start.py` | Launch the Flask app with env checks |
| `python app.py` | Launch the Flask app directly (skip env checks) |
## Troubleshooting
- **`Configuration file not found`** when starting the backend β you didn't
copy `config/config.example.py` to `config/config.py`.
- **Generation returns 500 / `Failed to get AI response`** β the API key or
base URL in `config.py` is wrong, or the model isn't available from that
endpoint.
- **Screenshots are blank** β run `playwright install chromium` again.
- **`/assets/...` 404 on Option B** β rebuild the frontend after code
changes (`cd frontend && npm run build`).
- **Video export fails on macOS/Linux** β the PowerPoint exporter is
Windows-only. Screenshots still work on all platforms.
## Credits
Based on [Screenshot Studio](https://github.com/shiv12345678901/yt-project)
by Educated Nepal. Original stack: Flask + Playwright + Llama 3.1 70B.
|