Spaces:
Running
title: YT AI Automation
emoji: π₯
colorFrom: blue
colorTo: red
sdk: docker
app_port: 7860
pinned: false
TextBro β Text β Video Studio
Turn text, raw HTML, images, or PDFs into video-ready screenshots using AI.
- Backend: Flask + Playwright (Python) β originally Screenshot Studio.
- Frontend: React + Vite + TypeScript + Tailwind CSS.
- Features: live SSE progress, cancel, screenshot gallery, ZIP download, history, cache inspection. On Windows, the backend can also stitch screenshots into a PowerPoint-driven video.
Devin_project/
βββ backend/ # Flask app, routes, Playwright screenshot engine
β βββ app.py
β βββ start.py
β βββ requirements.txt
β βββ config/
β βββ routes/
β βββ src/
βββ frontend/ # React SPA
βββ src/
βββ package.json
βββ vite.config.ts
Requirements
- Python 3.10+ (3.11 recommended)
- Node.js 20.19+ or 22.13+
- Playwright's Chromium (installed via
playwright install chromium) - An API key for an OpenAI-compatible LLM endpoint (Groq, Together, OpenAI,
a local
llama.cppserver, etc.) β the backend uses chat completions. - Optional (Windows only) Microsoft PowerPoint, for the screenshot β video pipeline.
First-time setup
# 1) Clone
git clone https://github.com/shiv12345678901/Devin_project.git
cd Devin_project
Backend
cd backend
# (Optional but recommended) create a virtualenv
python -m venv .venv
# Windows: .venv\Scripts\activate
# macOS/Linux: source .venv/bin/activate
pip install -r requirements.txt
playwright install chromium
# Fill in your API credentials
cp config/config.example.py config/config.py
# Edit config/config.py:
# API_KEY = "sk-..." # your LLM API key
# API_URL = "https://api.groq.com/openai/v1" # or wherever
# MODEL = "llama-3.1-70b-versatile"
Frontend
cd ../frontend
npm install
Running it
You have two options.
Option A β dev mode (two terminals, hot reload everywhere)
# Terminal 1
cd backend && python start.py # http://localhost:5000
# Terminal 2
cd frontend && npm run dev # http://localhost:5173
Open http://localhost:5173 β the Vite dev server proxies every API path to the Flask backend so CORS isn't an issue. Changes to React are hot-reloaded.
Option B β single server (Flask serves the built React app)
cd frontend && npm run build # produces frontend/dist/
cd ../backend && python start.py # http://localhost:5000
Now Flask serves the UI and the API from one port, so this is also the setup you'd use when pointing a tunnel (ngrok, Cloudflare Tunnel) at it.
What's wired to what
| Frontend page | Backend endpoint | Notes |
|---|---|---|
| Text β Video | POST /generate-sse |
SSE progress, cancel via /cancel/<op> |
| HTML β Video | POST /generate-html, /beautify, /minify |
Synchronous |
| Image/PDF β Video | POST /image-to-screenshots-sse |
SSE progress, OCR + AI + screenshots |
| Resources | GET /list, /history, /cache/stats, DELETE /delete/<type>/<name>, POST /cache/clear |
β |
| Gallery | GET /screenshots/<path> |
Served by Flask |
| ZIP download | POST /download-zip |
Streams a ZIP of selected files |
The full API client is in
frontend/src/api/client.ts and the SSE
state machine in
frontend/src/hooks/useGenerate.ts.
Configuration reference
Key values in backend/config/config.py (see
backend/config/config.example.py for the full list):
| Setting | What it controls |
|---|---|
API_KEY, API_URL, MODEL |
Which LLM the backend talks to (chat completions) |
PORT, HOST |
Flask listen address |
DEFAULT_VIEWPORT_WIDTH/HEIGHT |
Screenshot viewport |
DEFAULT_ZOOM, DEFAULT_OVERLAP |
Capture scaling and slide overlap |
MAX_SCREENSHOTS_LIMIT |
Hard cap on screenshots per run |
POWERPOINT_* |
Windows-only PowerPoint/video export |
VIDEO_* |
Resolution / FPS / quality for PPT β video |
Scripts
Frontend (inside frontend/)
| Command | Description |
|---|---|
npm run dev |
Start Vite dev server with API proxy |
npm run build |
TypeScript + production build to dist/ |
npm run preview |
Preview the production build locally |
npm run lint |
Run ESLint |
Backend (inside backend/)
| Command | Description |
|---|---|
python start.py |
Launch the Flask app with env checks |
python app.py |
Launch the Flask app directly (skip env checks) |
Troubleshooting
Configuration file not foundwhen starting the backend β you didn't copyconfig/config.example.pytoconfig/config.py.- Generation returns 500 /
Failed to get AI responseβ the API key or base URL inconfig.pyis wrong, or the model isn't available from that endpoint. - Screenshots are blank β run
playwright install chromiumagain. /assets/...404 on Option B β rebuild the frontend after code changes (cd frontend && npm run build).- Video export fails on macOS/Linux β the PowerPoint exporter is Windows-only. Screenshots still work on all platforms.
Credits
Based on Screenshot Studio by Educated Nepal. Original stack: Flask + Playwright + Llama 3.1 70B.