File size: 6,356 Bytes
5f3e9f5
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
---
title: YT AI Automation
emoji: πŸŽ₯
colorFrom: blue
colorTo: red
sdk: docker
app_port: 7860
pinned: false
---

# TextBro β€” Text β†’ Video Studio

Turn text, raw HTML, images, or PDFs into video-ready screenshots using AI.

- **Backend**: Flask + Playwright (Python) β€” originally
  [Screenshot Studio](https://github.com/shiv12345678901/yt-project).
- **Frontend**: React + Vite + TypeScript + Tailwind CSS.
- **Features**: live SSE progress, cancel, screenshot gallery, ZIP download,
  history, cache inspection. On Windows, the backend can also stitch
  screenshots into a PowerPoint-driven video.

```
Devin_project/
β”œβ”€β”€ backend/          # Flask app, routes, Playwright screenshot engine
β”‚   β”œβ”€β”€ app.py
β”‚   β”œβ”€β”€ start.py
β”‚   β”œβ”€β”€ requirements.txt
β”‚   β”œβ”€β”€ config/
β”‚   β”œβ”€β”€ routes/
β”‚   └── src/
└── frontend/         # React SPA
    β”œβ”€β”€ src/
    β”œβ”€β”€ package.json
    └── vite.config.ts
```

## Requirements

- **Python** 3.10+ (3.11 recommended)
- **Node.js** 20.19+ or 22.13+
- **Playwright's Chromium** (installed via `playwright install chromium`)
- An API key for an OpenAI-compatible LLM endpoint (Groq, Together, OpenAI,
  a local `llama.cpp` server, etc.) β€” the backend uses chat completions.
- **Optional (Windows only)** Microsoft PowerPoint, for the
  screenshot β†’ video pipeline.

## First-time setup

```bash
# 1) Clone
git clone https://github.com/shiv12345678901/Devin_project.git
cd Devin_project
```

### Backend

```bash
cd backend

# (Optional but recommended) create a virtualenv
python -m venv .venv
# Windows:     .venv\Scripts\activate
# macOS/Linux: source .venv/bin/activate

pip install -r requirements.txt
playwright install chromium

# Fill in your API credentials
cp config/config.example.py config/config.py
# Edit config/config.py:
#   API_KEY   = "sk-..."                 # your LLM API key
#   API_URL   = "https://api.groq.com/openai/v1"   # or wherever
#   MODEL     = "llama-3.1-70b-versatile"
```

### Frontend

```bash
cd ../frontend
npm install
```

## Running it

You have two options.

### Option A β€” dev mode (two terminals, hot reload everywhere)

```bash
# Terminal 1
cd backend && python start.py         # http://localhost:5000

# Terminal 2
cd frontend && npm run dev            # http://localhost:5173
```

Open http://localhost:5173 β€” the Vite dev server proxies every API path to
the Flask backend so CORS isn't an issue. Changes to React are hot-reloaded.

### Option B β€” single server (Flask serves the built React app)

```bash
cd frontend && npm run build          # produces frontend/dist/
cd ../backend && python start.py      # http://localhost:5000
```

Now Flask serves the UI and the API from one port, so this is also the
setup you'd use when pointing a tunnel (ngrok, Cloudflare Tunnel) at it.

## What's wired to what

| Frontend page     | Backend endpoint                    | Notes                               |
| ----------------- | ----------------------------------- | ----------------------------------- |
| Text β†’ Video      | `POST /generate-sse`                | SSE progress, cancel via `/cancel/<op>` |
| HTML β†’ Video      | `POST /generate-html`, `/beautify`, `/minify` | Synchronous                 |
| Image/PDF β†’ Video | `POST /image-to-screenshots-sse`    | SSE progress, OCR + AI + screenshots |
| Resources         | `GET /list`, `/history`, `/cache/stats`, `DELETE /delete/<type>/<name>`, `POST /cache/clear` | β€” |
| Gallery           | `GET /screenshots/<path>`           | Served by Flask                     |
| ZIP download      | `POST /download-zip`                | Streams a ZIP of selected files     |

The full API client is in
[`frontend/src/api/client.ts`](frontend/src/api/client.ts) and the SSE
state machine in
[`frontend/src/hooks/useGenerate.ts`](frontend/src/hooks/useGenerate.ts).

## Configuration reference

Key values in `backend/config/config.py` (see
`backend/config/config.example.py` for the full list):

| Setting                        | What it controls                             |
| ------------------------------ | -------------------------------------------- |
| `API_KEY`, `API_URL`, `MODEL`  | Which LLM the backend talks to (chat completions) |
| `PORT`, `HOST`                 | Flask listen address                         |
| `DEFAULT_VIEWPORT_WIDTH/HEIGHT`| Screenshot viewport                          |
| `DEFAULT_ZOOM`, `DEFAULT_OVERLAP` | Capture scaling and slide overlap         |
| `MAX_SCREENSHOTS_LIMIT`        | Hard cap on screenshots per run              |
| `POWERPOINT_*`                 | Windows-only PowerPoint/video export         |
| `VIDEO_*`                      | Resolution / FPS / quality for PPT β†’ video   |

## Scripts

**Frontend** (inside `frontend/`)

| Command           | Description                              |
| ----------------- | ---------------------------------------- |
| `npm run dev`     | Start Vite dev server with API proxy     |
| `npm run build`   | TypeScript + production build to `dist/` |
| `npm run preview` | Preview the production build locally     |
| `npm run lint`    | Run ESLint                               |

**Backend** (inside `backend/`)

| Command               | Description                                          |
| --------------------- | ---------------------------------------------------- |
| `python start.py`     | Launch the Flask app with env checks                 |
| `python app.py`       | Launch the Flask app directly (skip env checks)      |

## Troubleshooting

- **`Configuration file not found`** when starting the backend β€” you didn't
  copy `config/config.example.py` to `config/config.py`.
- **Generation returns 500 / `Failed to get AI response`** β€” the API key or
  base URL in `config.py` is wrong, or the model isn't available from that
  endpoint.
- **Screenshots are blank** β€” run `playwright install chromium` again.
- **`/assets/...` 404 on Option B** β€” rebuild the frontend after code
  changes (`cd frontend && npm run build`).
- **Video export fails on macOS/Linux** β€” the PowerPoint exporter is
  Windows-only. Screenshots still work on all platforms.

## Credits

Based on [Screenshot Studio](https://github.com/shiv12345678901/yt-project)
by Educated Nepal. Original stack: Flask + Playwright + Llama 3.1 70B.