# 🧠 PROMPT.md — Build MultiMind Classroom from Scratch

> **MeDo-Styled Prompts** to recreate the full MultiMind Classroom AI Interactive Classroom application.
> One-shot (single mega-prompt) and Multi-shot (phased build) variants included.

---

## 📋 Table of Contents

1. [Project Overview](#-project-overview)
2. [Architecture Blueprint](#-architecture-blueprint)
3. [One-Shot Mega Prompt](#-one-shot-mega-prompt)
4. [Multi-Shot Phased Prompts](#-multi-shot-phased-prompts)
   - Phase 1: Foundation & Scaffold
   - Phase 2: Data Layer & State Management
   - Phase 3: AI Provider System
   - Phase 4: Generation Pipeline
   - Phase 5: Slide Renderer & Canvas
   - Phase 6: Multi-Agent Orchestration
   - Phase 7: Playback Engine & Roundtable
   - Phase 8: Interactive Widgets & PBL
   - Phase 9: Media Generation Pipeline
   - Phase 10: Audio System (TTS/ASR)
   - Phase 11: Chat System & Streaming
   - Phase 12: Settings, i18n, Export, Polish
5. [Key Technical Decisions](#-key-technical-decisions)

---

## 🎯 Project Overview

**MultiMind Classroom** is an open-source AI interactive classroom platform. Users upload a PDF, and the system generates an immersive multi-agent learning experience with:

- **AI-generated slide presentations** from PDF content
- **Multi-agent roundtable discussions** (teacher, assistant, student agents)
- **Real-time TTS/ASR** for voice-driven lectures
- **Interactive whiteboard** with collaborative drawing
- **Quiz generation** with auto-grading
- **Interactive widgets** (simulations, diagrams, code editors, 3D visualizations, games)
- **Project-Based Learning (PBL)** mode with MCP tool-calling agents
- **Media generation** (AI images + videos embedded in slides)
- **PowerPoint export** and classroom zip import/export
- **5-language i18n** (zh-CN, en-US, ja-JP, ru-RU, ar-SA)

### Tech Stack

| Layer | Technology |
|-------|-----------|
| Framework | React 19 + Vite 6 |
| Routing | React Router DOM v7 |
| State | Zustand 5 (10 stores, including stage, canvas, settings, snapshot, keyboard) |
| Storage | Dexie (IndexedDB) — stages, scenes, audio, images, chat, outlines |
| UI | shadcn/ui (Radix primitives) + Tailwind CSS v4 + Motion (Framer) |
| AI SDK | Vercel AI SDK 6 + LangGraph (multi-agent director graph) |
| AI Providers | OpenAI, Anthropic, Google Gemini, DeepSeek, Qwen, GLM, MiniMax, Ollama, OpenRouter, +10 more |
| TTS/ASR | OpenAI, Azure, GLM, Qwen, MiniMax, Doubao, ElevenLabs, Browser native, VoxCPM |
| Image Gen | Seedream, OpenAI Image, Qwen Image, Nano Banana, MiniMax, Grok |
| Video Gen | Seedance, Kling, Veo, Sora, MiniMax, Grok |
| Rich Text | ProseMirror (custom schema with marks for bold/italic/underline/color/align/indent/lists) |
| Charts | ECharts 6 |
| Diagrams | @xyflow/react (React Flow) |
| Export | pptxgenjs (PowerPoint), JSZip (classroom archives) |
| Math | KaTeX + Temml + mathml2omml (for PPTX export) |
| Code Highlighting | Shiki |
| PDF Parsing | unpdf + MinerU cloud + custom providers |

---

## 🏗 Architecture Blueprint

### Data Flow

```
PDF Upload → Outline Generation (SSE stream) → Scene Content Generation → Action Generation
                                                           ↓
                                                  Media Generation (parallel)
                                                           ↓
                                              IndexedDB Storage (Dexie)
                                                           ↓
                                              Playback Engine (state machine)
                                                           ↓
                                    Roundtable UI ←→ Chat System ←→ Multi-Agent LangGraph
```

### Store Architecture (Zustand)

| Store | Purpose | Persistence |
|-------|---------|-------------|
| `useStageStore` | Current stage, scenes, outlines, generation status | IndexedDB |
| `useCanvasStore` | Viewport, zoom, selected elements, editing state | Memory |
| `useSettingsStore` | All provider configs, model selection, UI prefs | localStorage |
| `useSnapshotStore` | Undo/redo history | Memory |
| `useKeyboardStore` | Keyboard shortcut state | Memory |
| `useMediaGenerationStore` | Image/video generation tasks and status | IndexedDB |
| `useWhiteboardHistoryStore` | Whiteboard undo/redo per scene | Memory |
| `useWidgetIframeStore` | Widget iframe communication state | Memory |
| `useUserProfileStore` | User nickname, avatar, bio | localStorage |
| `useAgentRegistry` | Agent configs (default + custom + generated) | localStorage |

### Database Schema (Dexie/IndexedDB)

```
stages:        id, name, description, createdAt, updatedAt, languageDirective, style, agentIds
scenes:        id, stageId, type, title, order, content (JSON), actions (JSON), whiteboard (JSON)
audioFiles:    id (audioId), blob, duration, format, text, voice
imageStore:    id (storageId), blob, mimeType
chatSessions:  id, stageId, sceneId, type, status, messages, config
outlines:      id, stageId, outlines (JSON array)
mediaFiles:    id (stageId_elementId), blob, mimeType
generatedAgents: id (stageId), agents (AgentConfig[])
```

### Prompt Template System

File-based with composition:
- `lib/prompts/templates/{promptId}/system.md` + `user.md`
- `lib/prompts/snippets/{name}.md`
- Syntax: `{{variable}}`, `{{snippet:name}}`, `{{#if flag}}...{{/if}}`
- 20+ prompt templates for outlines, slides, quizzes, actions, widgets, PBL, agents
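
The three syntax forms can be resolved in a fixed order: snippets first, then conditionals, then variables. The following is a minimal sketch under that assumption; `renderTemplate` and `TemplateContext` are hypothetical names, not the project's loader API, and nested `{{#if}}` blocks are not handled.

```typescript
// Hypothetical sketch: renderTemplate and TemplateContext are illustrative names,
// not functions from the project's lib/prompts/loader.ts.
type TemplateContext = {
  variables: Record<string, string | boolean | undefined>;
  snippets: Record<string, string | undefined>;
};

function renderTemplate(template: string, ctx: TemplateContext): string {
  // 1. Inline snippets first so snippet bodies can use variables and conditionals.
  const withSnippets = template.replace(
    /\{\{snippet:([\w-]+)\}\}/g,
    (_match: string, name: string) => ctx.snippets[name] ?? "",
  );

  // 2. Keep or drop {{#if flag}}...{{/if}} blocks (non-nested only).
  const withConditionals = withSnippets.replace(
    /\{\{#if (\w+)\}\}([\s\S]*?)\{\{\/if\}\}/g,
    (_match: string, flag: string, body: string) => (ctx.variables[flag] ? body : ""),
  );

  // 3. Substitute {{variable}}; unknown names are left in place so gaps stay visible.
  return withConditionals.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) => {
    const value = ctx.variables[name];
    return value === undefined ? match : String(value);
  });
}
```

Resolving snippets before variables means a shared snippet can reference `{{language}}` or similar without knowing which template includes it.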

### Action System

Three groups of actions executed by ActionEngine:
- **Fire-and-forget**: `spotlight`, `laser`, `play_video`
- **Synchronous** (wait for completion): `speech`, `wb_open`, `wb_close`, `wb_draw_text`, `wb_draw_shape`, `wb_draw_chart`, `wb_draw_latex`, `wb_draw_table`, `wb_draw_line`, `wb_draw_code`, `wb_edit_code`, `wb_clear`, `wb_delete`, `discussion`
- **Widget actions**: `widget_highlight`, `widget_setState`, `widget_annotation`, `widget_reveal`
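
The difference between the first two groups comes down to whether the runner awaits the handler. A minimal sketch, assuming a single async handler per action; the `Action`, `Handler`, and `runActions` names are hypothetical, and treating widget actions as synchronous is an assumption the list above does not confirm.

```typescript
// Action names come from the lists above; the types and runner are hypothetical.
const FIRE_AND_FORGET = new Set(["spotlight", "laser", "play_video"]);

type Action = { type: string; params?: Record<string, unknown> };
type Handler = (action: Action) => Promise<void>;
type ExecutionMode = "fire-and-forget" | "synchronous";

// ASSUMPTION: anything not in the fire-and-forget set (including widget_*
// actions) is awaited before the next action starts.
function executionMode(action: Action): ExecutionMode {
  return FIRE_AND_FORGET.has(action.type) ? "fire-and-forget" : "synchronous";
}

async function runActions(actions: Action[], handle: Handler): Promise<void> {
  for (const action of actions) {
    if (executionMode(action) === "fire-and-forget") {
      // Start the effect and move on; log failures rather than stalling playback.
      void handle(action).catch((err) => console.error(action.type, err));
    } else {
      // Speech, whiteboard, and discussion steps block the sequence until done.
      await handle(action);
    }
  }
}
```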

### Multi-Agent Orchestration (LangGraph)

```
START → director ──(end)──→ END

           └─(next)→ agent_generate ──→ director (loop)
```

- Director: LLM-based for multi-agent, code-only for single-agent
- Agents: teacher (full slide+whiteboard control), assistant (whiteboard), student (whiteboard, short responses)
- Per-agent: persona prompt, allowed actions, TTS voice, avatar, color
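
The loop in the graph can be illustrated without LangGraph. The sketch below is a plain-TypeScript stand-in, not the LangGraph API: the director either names the next agent or ends the run, and each agent turn is appended to a shared history. All names are hypothetical, the functions are synchronous for brevity, and the "one turn, then end" behavior of the code-only director is an assumption.

```typescript
// Plain-TypeScript stand-in for the director loop; it does not use LangGraph.
type AgentId = string;
type Turn = { agentId: AgentId; text: string };
type Decision = { kind: "next"; agentId: AgentId } | { kind: "end" };

type Director = (history: Turn[]) => Decision;
type AgentGenerate = (agentId: AgentId, history: Turn[]) => string;

function runDirectorLoop(director: Director, generate: AgentGenerate, maxTurns: number): Turn[] {
  const history: Turn[] = [];
  // maxTurns is a safety cap so a director that never ends cannot loop forever.
  while (history.length < maxTurns) {
    const decision = director(history);
    if (decision.kind === "end") break;
    const text = generate(decision.agentId, history);
    history.push({ agentId: decision.agentId, text });
  }
  return history;
}

// ASSUMPTION: the code-only director for a single agent allows one turn, then ends.
const singleAgentDirector = (agentId: AgentId): Director => (history) =>
  history.length === 0 ? { kind: "next", agentId } : { kind: "end" };
```

A multi-agent director would replace `singleAgentDirector` with a function that asks an LLM which agent should speak next, given the same history.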

### Playback Engine State Machine

```
         start()                pause()
idle ──────────→ playing ──────────→ paused
  ▲                 ▲                   │
  │                 │  resume()         │
  │                 └───────────────────┘
  │  handleEndDiscussion()
  │                    confirmDiscussion()
  └──────────────── live ──────────→ paused
```
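
One way to make the diagram enforceable is a transition table that rejects any event not listed for the current state. The sketch below is an assumption-heavy reading: `start`, `pause`, `resume`, and `handleEndDiscussion` follow the labels above, but the diagram does not make clear which edge `confirmDiscussion()` belongs to, so treating it as the entry into `live` (and reusing `pause` for `live → paused`) is a guess rather than confirmed behavior. The type and function names are hypothetical.

```typescript
type PlaybackState = "idle" | "playing" | "paused" | "live";
type PlaybackEvent = "start" | "pause" | "resume" | "confirmDiscussion" | "handleEndDiscussion";

// Any (state, event) pair missing from this table is rejected.
const TRANSITIONS: Record<PlaybackState, Partial<Record<PlaybackEvent, PlaybackState>>> = {
  // ASSUMPTION: confirmDiscussion enters live mode from idle or playing.
  idle: { start: "playing", confirmDiscussion: "live" },
  playing: { pause: "paused", confirmDiscussion: "live" },
  paused: { resume: "playing" },
  // ASSUMPTION: live → paused is triggered by the same pause event.
  live: { pause: "paused", handleEndDiscussion: "idle" },
};

function transition(state: PlaybackState, event: PlaybackEvent): PlaybackState {
  const next = TRANSITIONS[state][event];
  if (next === undefined) {
    throw new Error(`Invalid transition: ${event} from ${state}`);
  }
  return next;
}
```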

---

## 🚀 One-Shot Mega Prompt

> Use this single prompt to generate the entire application in one conversation.

```
Me: Build me "MultiMind Classroom" — an open-source AI interactive classroom platform.

Do: Create a React 19 + Vite 6 + TypeScript application with these EXACT specifications:

### CORE SETUP
- React Router DOM v7 with 4 lazy-loaded routes: / (HomePage), /classroom/:id (ClassroomPage), /generation-preview (GenerationPreviewPage), /eval/whiteboard (WhiteboardEvalPage)
- App.tsx wraps all routes in: BrowserRouter → ThemeProvider → I18nProvider → ServerProvidersInit → AccessCodeGuard → Toaster
- Tailwind CSS v4 with PostCSS, shadcn/ui (Radix-based), oklch color system with light/dark mode via CSS variables
- Path alias @/* → ./src/*

### STATE MANAGEMENT (Zustand 5)
Create 10 Zustand stores:
1. **useStageStore** — stage (name, description, languageDirective, style, agentIds), scenes[], currentSceneId, outlines[], generationStatus, chats[], mode ('autonomous'|'playback'). Actions: loadFromStorage, saveToStorage, addScene, setCurrentSceneId. Debounced IndexedDB persistence.
2. **useCanvasStore** — viewportSize, canvasScale, selectedElementIds, editingElementId, isDrawing, creatingElement, ctrlOrShiftKeyActive. All canvas interaction state.
3. **useSettingsStore** — providerId, modelId, providersConfig (unified JSON), thinkingConfigs, ttsProviderId, ttsVoice, ttsSpeed, asrProviderId, imageProviderId, videoProviderId, pdfProviderId, webSearchProviderId, playbackSpeed, sidebarCollapsed, chatAreaWidth, agentMode ('preset'|'auto'), selectedAgentIds. Persisted via zustand/persist to localStorage. Has fetchServerProviders() to merge server-configured providers.
4. **useSnapshotStore** — history stack for undo/redo
5. **useKeyboardStore** — keyboard shortcut state
6. **useMediaGenerationStore** — tasks map (elementId → {status, objectUrl, error}), enqueueTasks, restoreFromDB
7. **useWhiteboardHistoryStore** — per-scene whiteboard undo/redo
8. **useWidgetIframeStore** — widget iframe postMessage communication
9. **useUserProfileStore** — nickname, avatar, bio. Persisted localStorage.
10. **useAgentRegistry** — agents map (id → AgentConfig), addAgent, updateAgent, deleteAgent. 3 default agents: teacher (AI teacher), assistant (AI助教), student (好奇学生). Each has: id, name, role, persona, avatar, color, allowedActions, priority, voiceConfig, isDefault, isGenerated, boundStageId.

### DATABASE (Dexie/IndexedDB)
Database name 'multimind-db' with tables: stages, scenes, audioFiles, imageStore, chatSessions, outlines, mediaFiles, generatedAgents. Full CRUD operations. Stage storage utilities: listStages, deleteStageData, renameStage, getFirstSlideByStages.

### AI PROVIDER SYSTEM
- Unified provider registry (PROVIDERS) with 15+ providers: openai, anthropic, google, deepseek, qwen, kimi, glm, minimax, siliconflow, doubao, hunyuan, xiaomi, grok, openrouter, ollama
- Each provider: id, name, type ('openai'|'anthropic'|'google'|'openai-compatible'), defaultBaseUrl, requiresApiKey, icon, models[]
- Each model: id, name, contextWindow, outputWindow, capabilities (streaming, tools, vision, thinking)
- createLanguageModel() factory using @ai-sdk/openai, @ai-sdk/anthropic, @ai-sdk/google
- Thinking config system: toggleable, budgetAdjustable, defaultEnabled per model
- callLLM() and streamLLM() wrappers with thinking support

### GENERATION PIPELINE (Two-Stage)
**Stage 1 — Outline Generation:**
- POST /api/generate/scene-outlines-stream → SSE stream
- Input: PDF text + images + user requirements + agents + language
- Output: SceneOutline[] with type ('slide'|'quiz'|'interactive'|'pbl'), title, notes, order, mediaGenerations[]
- Incremental JSON parsing from LLM stream
- Generation Preview page with step-by-step visualization (outline streaming → agent profile generation → scene content → navigate to classroom)

**Stage 2 — Scene Generation (per outline):**
- generateSceneContent() → POST /api/generate/scene-content — generates slides/quiz/interactive/pbl content
- generateSceneActions() → POST /api/generate/scene-actions — generates teacher speech + visual actions for each scene
- createSceneWithActions() — assembles Scene object with elements, actions, whiteboard data
- Interactive post-processor: sanitizes HTML widgets, injects CSS isolation
- Action parser: extracts structured [{type, name, params}, {type: "text", content}] from LLM output

### SLIDE TYPE SYSTEM
Scene types with specific content structures:
- **slide**: PPTElement[] (text, image, shape, line, chart, table, latex, video, audio, code elements). Each element has: id, type, left, top, width, height, rotate, opacity, shadow, outline, fill, link, groupId, lock, name.
- **quiz**: QuizQuestion[] with type (single-choice, multiple-choice, fill-in-blank, true-false, short-answer), question text, options, correctAnswer, explanation
- **interactive**: WidgetConfig with type (simulation, diagram, code, game, visualization3d), HTML/iframe content, teacherActions[]
- **pbl**: PBLProjectConfig with roles, issues, workspaces

### SLIDE RENDERER
Full PowerPoint-compatible renderer in React:
- Editor/Canvas with viewport scaling, drag/drop, resize handles, rotation handles, alignment lines, grid, ruler
- Element types: TextElement (ProseMirror), ImageElement (clip masks, filters), ShapeElement (SVG paths, gradients, patterns), LineElement (cubic bezier, markers), ChartElement (ECharts), TableElement, LatexElement (KaTeX), VideoElement, CodeElement (Shiki)
- ThumbnailSlide for sidebar previews
- ScreenElement for presentation mode
- useScaleElement, useDragElement, useRotateElement, useSelectElement hooks

### MULTI-AGENT ORCHESTRATION (LangGraph)
- StateGraph with OrchestratorState annotation
- Director node: multi-agent LLM decision or single-agent code-only
- agent_generate node: builds structured prompt per agent, streams response with tool calls
- statelessGenerate(): single-pass generation from messages + storeState
- Prompt builder: role guidelines per agent type, whiteboard ledger context, peer context, state context
- SSE streaming: StatelessEvent chunks with {type: 'text'|'tool_call'|'done'|'error', agentId, content}

### PLAYBACK ENGINE
Class-based PlaybackEngine with state machine (idle → playing → paused, idle → live → paused):
- Consumes Scene.actions[] sequentially via ActionEngine
- ActionEngine: processes spotlight (highlight element), laser (pointer effect), speech (TTS), whiteboard actions, discussion triggers
- Speech TTS: fetches audio from /api/generate/tts, plays via AudioPlayer, shows speech overlay
- Discussion triggers: pause playback, switch to live mode, enable chat input
- Auto-resume generation for pending outlines on classroom load
- Speed control: 1x, 1.25x, 1.5x, 2x

### ROUNDTABLE UI (95KB component)
Main classroom interaction panel with:
- Voice waveform animation during speech
- Agent avatars with speaking indicator
- Chat input with voice recording (ASR)
- Proactive discussion cards
- Slide navigation controls
- Presentation mode (fullscreen)
- Whiteboard toggle
- Playback progress bar
- Speed selector
- Thinking state indicator

### CHAT SYSTEM
- ChatSession: id, type (qa|discussion|lecture), status, messages (UIMessage[]), config (agentIds, maxTurns, triggerAgentId)
- useChatSessions hook: manages sessions, sends messages via POST /api/chat SSE, handles interruption, persists to IndexedDB
- StreamBuffer: buffers SSE text chunks, reveals text word-by-word for natural speech feel
- Message metadata: senderName, senderAvatar, agentId, agentColor, actions (spotlight/highlight/insert)
- Chat area with session list, message bubbles, inline action tags, lecture notes view

### WHITEBOARD SYSTEM
- Collaborative whiteboard overlay on slides
- Elements: text, shapes (rect/circle/triangle), charts, LaTeX, tables, lines, code blocks
- ActionEngine executes wb_draw_*, wb_delete, wb_clear, wb_open, wb_close
- WhiteboardCanvas with drawing tools
- WhiteboardHistory for undo/redo
- Whiteboard conflicts summarizer for multi-agent coordination

### INTERACTIVE WIDGETS
5 widget types, each generates self-contained HTML rendered in sandboxed iframe:
1. **Simulation**: variable sliders, physics/math simulations, presets
2. **Diagram**: React Flow nodes/edges, decision trees, flowcharts
3. **Code**: executable code editor with output panel
4. **Game**: educational games (quizzes, puzzles)
5. **Visualization3D**: Three.js/WebGL 3D models
- Widget teacher actions: highlight, setState, annotation, reveal
- postMessage bridge for iframe ↔ parent communication

### PBL (Project-Based Learning)
- Agentic loop using Vercel AI SDK generateText + stopWhen
- MCP tools: ModeMCP, ProjectMCP, AgentMCP, IssueboardMCP
- Generates: project config, roles, issues, workspaces
- PBL renderer with: role selection, chat panel, issue board, workspace, guide

### MEDIA GENERATION
- Media orchestrator dispatches parallel API calls
- Image providers: Seedream (ByteDance), OpenAI Image, Qwen Image, Nano Banana, MiniMax, Grok
- Video providers: Seedance (ByteDance), Kling (Kuaishou), Veo (Google), MiniMax, Grok
- Async task pattern: submit → poll → download blob → IndexedDB
- MediaGenerationStore tracks task status per elementId

### AUDIO SYSTEM
- TTS providers: OpenAI, Azure, GLM, Qwen, MiniMax, Doubao, ElevenLabs, VoxCPM, Browser native
- ASR providers: OpenAI Whisper, Qwen ASR, Browser native
- Voice resolver: maps agent voices across providers
- AudioPlayer: Web Audio API playback with speed control
- useAudioRecorder: MediaRecorder API for voice input
- useBrowserTTS/useDiscussionTTS: manages TTS lifecycle during discussions

### EXPORT SYSTEM
- PowerPoint export via pptxgenjs: converts PPTElement[] to PPTX with shapes, images, charts, tables, LaTeX→OMML
- Classroom ZIP export/import: stages + scenes + audio + images + agents as portable archive
- HTML parser for slide text → PPTX rich text conversion
- SVG path parser for shape export
- LaTeX → OMML converter via mathml2omml

### I18N SYSTEM
- i18next + react-i18next + resources-to-backend
- 5 locales: zh-CN, en-US, ja-JP, ru-RU, ar-SA
- Dynamic import: `import(\`./locales/${language}.json\`)`
- useI18n hook with locale detection from localStorage/navigator
- Language switcher component

### API ROUTES (27 endpoints)
All under /api/:
- /chat — SSE chat stream (multi-agent)
- /generate/scene-outlines-stream — SSE outline generation
- /generate/scene-content — scene content generation
- /generate/scene-actions — scene action generation
- /generate/agent-profiles — agent profile generation
- /generate/image — image generation
- /generate/video — video generation
- /generate/tts — single TTS audio generation
- /parse-pdf — PDF parsing
- /classroom — CRUD for server-stored classrooms
- /classroom-media/[classroomId]/[...path] — media file serving
- /generate-classroom — background classroom generation job
- /generate-classroom/[jobId] — job status polling
- /quiz-grade — LLM-based quiz grading
- /pbl/chat — PBL runtime chat
- /web-search — Tavily web search
- /proxy-media — CORS proxy for remote media
- /server-providers — server-configured provider list
- /verify-model, /verify-image-provider, /verify-video-provider, /verify-pdf-provider — credential verification
- /azure-voices — Azure TTS voice list
- /transcription — audio transcription
- /access-code/status, /access-code/verify — access code authentication
- /health — health check

### SECURITY
- Access code guard (HMAC-signed cookie)
- SSRF guard for server-side URL fetching
- Content Security Policy headers
- Input validation on all API routes

### TESTING
- Vitest for unit tests (29 test files)
- Playwright for E2E tests (4 test suites)
- Evaluation framework for whiteboard layout scoring and outline language detection

Build the complete application with all 950+ source files, full type safety, and production-ready error handling.
```
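
The StreamBuffer item in the chat system spec above describes buffering SSE text chunks and revealing them word by word. A minimal sketch of that idea follows, assuming whitespace-delimited words; `WordStreamBuffer` is a hypothetical name, and the real component presumably also handles pacing and per-agent metadata.

```typescript
// Hypothetical sketch; not the project's StreamBuffer.
class WordStreamBuffer {
  private pending = "";

  // Append a raw text chunk as it arrives from the SSE stream.
  push(chunk: string): void {
    this.pending += chunk;
  }

  // Return the next complete word (with its trailing whitespace), or null if
  // the buffer only holds a partial word that may still grow.
  nextWord(): string | null {
    const match = /^\s*\S+\s+/.exec(this.pending);
    if (match === null) return null;
    this.pending = this.pending.slice(match[0].length);
    return match[0];
  }

  // Call once the stream ends to release whatever is left.
  flush(): string {
    const rest = this.pending;
    this.pending = "";
    return rest;
  }
}
```

Splitting on whitespace would not reveal zh-CN or ja-JP text incrementally, so a multilingual version would need a different unit, for example segments from `Intl.Segmenter`.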

---

## 🔄 Multi-Shot Phased Prompts

### Phase 1: Foundation & Scaffold

```
Me: Start building "MultiMind Classroom" — an AI interactive classroom. Set up the project foundation.

Do:
1. Initialize React 19 + Vite 6 + TypeScript project
2. Configure: Tailwind CSS v4 with PostCSS, path alias @/* → ./src/*, oklch color system
3. Install core deps: react-router-dom, zustand, dexie, lucide-react, motion, sonner, clsx, tailwind-merge, class-variance-authority, nanoid, zod
4. Install shadcn/ui components: button, dialog, dropdown-menu, popover, tooltip, tabs, input, textarea, select, checkbox, switch, slider, scroll-area, command, alert-dialog, card, badge, carousel, separator, progress, label, hover-card, context-menu, collapsible, avatar, alert, field, input-group, button-group, combobox
5. Create App.tsx with BrowserRouter wrapping: ThemeProvider → I18nProvider → AccessCodeGuard → lazy routes (/, /classroom/:id, /generation-preview, /eval/whiteboard) → Toaster
6. Create ThemeProvider (light/dark/system, localStorage persist, document.documentElement.classList toggle)
7. Create I18nProvider with i18next + react-i18next + resources-to-backend, 5 locales (zh-CN, en-US, ja-JP, ru-RU, ar-SA), dynamic JSON imports, localStorage locale persistence
8. Create globals.css with full oklch color system (:root + .dark), CSS custom properties for all shadcn tokens, Tailwind @theme inline block, ProseMirror styles, animation keyframes (wave, shimmer, breathing-bar, interactive-mode-breathe)
9. Create createLogger utility (timestamp + level + tag formatting)
10. Verify: `vite build` succeeds with 0 errors
```
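
Step 9 asks for a `createLogger` utility with timestamp, level, and tag formatting. One possible shape is sketched below; the exact line format and the `formatLogLine` helper are assumptions, since the prompt does not specify them.

```typescript
type LogLevel = "debug" | "info" | "warn" | "error";

// Pure formatter kept separate so the output shape is easy to test.
// ASSUMPTION: ISO timestamp, upper-case level, and bracketed tag.
function formatLogLine(level: LogLevel, tag: string, message: string, now: Date = new Date()): string {
  return `${now.toISOString()} [${level.toUpperCase()}] [${tag}] ${message}`;
}

function createLogger(tag: string) {
  const emit = (level: LogLevel) => (message: string, ...rest: unknown[]): void => {
    // console.debug/info/warn/error exist in both browsers and Node.
    console[level](formatLogLine(level, tag, message), ...rest);
  };
  return { debug: emit("debug"), info: emit("info"), warn: emit("warn"), error: emit("error") };
}
```

A module would then create its logger once, for example `const log = createLogger("PlaybackEngine")`, and call `log.info("started")`.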

### Phase 2: Data Layer & State Management

```
Me: Build the data layer and state management for MultiMind Classroom.

Do:
1. Create Dexie database 'multimind-db' with tables: stages (id, name, description, createdAt, updatedAt, languageDirective, style, currentSceneId, agentIds, interactiveMode), scenes (id, stageId, type, title, order, content, actions, whiteboard), audioFiles (id, blob, duration, format, text, voice), imageStore (id, blob, mimeType), chatSessions (id, stageId, sceneId, type, status, messages, config), outlines (id, stageId, outlines), mediaFiles (id, blob, mimeType), generatedAgents (id, agents)
2. Create stage-storage utilities: listStages() → StageListItem[], deleteStageData(), renameStage(), getFirstSlideByStages() → Record<string, Slide>
3. Create image-storage utilities: storePdfBlob(), loadPdfBlob(), storeImages(), loadImageMapping(), cleanupOldImages()
4. Create useStageStore (Zustand): stage, scenes[], currentSceneId, outlines[], chats[], mode, generationStatus, generationEpoch, failedOutlines[], toolbarState. Actions: setStage, addScene, updateScene, deleteScene, setCurrentSceneId, loadFromStorage (IndexedDB → state), saveToStorage (debounced state → IndexedDB), getCurrentScene()
5. Create useCanvasStore: viewportSize, canvasScale, selectedElementIds[], editingElementId, isDrawing, creatingElement, ctrlOrShiftKeyActive, showGridLines, showRuler, snapToGrid
6. Create useSettingsStore with zustand/persist: providerId, modelId, thinkingConfigs, providersConfig, ttsProviderId/Voice/Speed, asrProviderId/Language, imageProviderId, videoProviderId, pdfProviderId, webSearchProviderId, playbackSpeed, sidebarCollapsed, chatAreaWidth, chatAreaCollapsed, agentMode, selectedAgentIds, fetchServerProviders(), all setters. Validate provider/model on rehydration.
7. Create useSnapshotStore, useKeyboardStore, useMediaGenerationStore, useWhiteboardHistoryStore, useWidgetIframeStore, useUserProfileStore
8. Create useAgentRegistry with zustand/persist: agents map, 3 default agents (teacher: "AI teacher" with full slide+whiteboard actions priority 10, assistant: "AI助教" with whiteboard-only priority 5, student: "好奇学生" with whiteboard-only priority 3). Each agent: id, name, role, persona (detailed teaching style), avatar, color, allowedActions, priority, voiceConfig, isDefault
9. Define all TypeScript types in lib/types/: slides.ts (PPTElement union with 10 element types, Slide, SlideTheme, SlideBackground), action.ts (20+ action types), stage.ts (Stage, Scene, SceneType, StageMode), chat.ts (ChatSession, StatelessChatRequest/Event), generation.ts (SceneOutline, UserRequirements, PdfImage), provider.ts (ProviderId, ProviderConfig, ModelInfo, ThinkingConfig), widgets.ts (5 widget configs), settings.ts, roundtable.ts, web-search.ts, pdf.ts, edit.ts, export.ts
```
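
Step 4 calls for `saveToStorage` to be debounced so that rapid state changes produce a single IndexedDB write. The sketch below shows the idea in isolation; `createDebouncedSaver` is a hypothetical helper, `save` stands in for the real Dexie write, and the `flush` method (useful before page unload) is an addition not mentioned in the prompt.

```typescript
// Hypothetical helper; `save` stands in for the real IndexedDB write.
function createDebouncedSaver<T>(save: (state: T) => void, delayMs: number) {
  let timer: ReturnType<typeof setTimeout> | undefined;
  let latest: { state: T } | undefined;

  // Write the most recent pending state immediately, if there is one.
  const flush = (): void => {
    if (timer !== undefined) clearTimeout(timer);
    timer = undefined;
    if (latest !== undefined) {
      const { state } = latest;
      latest = undefined;
      save(state);
    }
  };

  // Record the newest state and restart the quiet period.
  const schedule = (state: T): void => {
    latest = { state }; // later calls overwrite earlier ones
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(flush, delayMs);
  };

  return { schedule, flush };
}
```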

### Phase 3: AI Provider System

```
Me: Build the AI provider system with 15+ LLM providers.

Do:
1. Create PROVIDERS registry with full configs for: openai (gpt-4o, gpt-5.5, o3-mini, o4-mini), anthropic (claude-4-sonnet, claude-3.7-sonnet), google (gemini-2.5-pro, gemini-2.5-flash), deepseek (deepseek-chat, deepseek-reasoner), qwen (qwen3-235b, qwen-max, qwen-plus), kimi (moonshot-v1-auto), glm (glm-4-plus, glm-z1-air), minimax (MiniMax-M1), siliconflow (meta-llama, Qwen, DeepSeek), doubao (doubao-pro, doubao-1.5-pro), hunyuan, xiaomi (MiMo-7B), grok (grok-3), openrouter (pass-through), ollama (local models)
2. Each provider: type ('openai'|'anthropic'|'google'|'openai-compatible'), defaultBaseUrl, requiresApiKey, icon path, models[] with contextWindow, outputWindow, capabilities (streaming, tools, vision, thinking config)
3. createLanguageModel(config) factory: routes to @ai-sdk/openai, @ai-sdk/anthropic, @ai-sdk/google based on provider type. OpenAI-compatible providers use createOpenAI with custom baseURL.
4. Thinking config system: ThinkingConfig = {mode: 'disabled'|'auto'|'manual', enabled, budget?}. Per-model thinking capability (toggleable, budgetAdjustable, defaultEnabled). getThinkingMode() and pickThinkingBudget() utilities.
5. callLLM(model, options) — single-shot generation with thinking support. streamLLM(model, options) — streaming with thinking.
6. Model metadata: applyModelMetadata() enriches model configs with catalog data. getCatalogThinkingCapability() returns thinking support level.
7. Server-side resolveModel() — resolves model string + API key + base URL into LanguageModel instance, handles server-configured providers from env vars.
```
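
Step 3 describes routing by provider type, with OpenAI-compatible providers reusing the OpenAI SDK under a custom base URL. The sketch below shows only that routing decision and returns a plan instead of calling any SDK, so it runs without dependencies. `planLanguageModel`, `RoutingPlan`, and the field names are hypothetical.

```typescript
type ProviderType = "openai" | "anthropic" | "google" | "openai-compatible";

interface ProviderEntry {
  id: string;
  type: ProviderType;
  defaultBaseUrl?: string;
  requiresApiKey: boolean;
}

interface ModelRequest {
  providerId: string;
  modelId: string;
  apiKey?: string;
  baseUrl?: string; // user-supplied override
}

// Describes which SDK package a request would be routed to. A real factory
// would call that package's create function instead of returning a plan.
interface RoutingPlan {
  sdkPackage: "@ai-sdk/openai" | "@ai-sdk/anthropic" | "@ai-sdk/google";
  modelId: string;
  baseUrl?: string;
}

function planLanguageModel(registry: Record<string, ProviderEntry>, req: ModelRequest): RoutingPlan {
  const provider = registry[req.providerId];
  if (!provider) throw new Error(`Unknown provider: ${req.providerId}`);
  if (provider.requiresApiKey && !req.apiKey) {
    throw new Error(`Provider ${provider.id} requires an API key`);
  }
  const baseUrl = req.baseUrl ?? provider.defaultBaseUrl;
  switch (provider.type) {
    case "anthropic":
      return { sdkPackage: "@ai-sdk/anthropic", modelId: req.modelId, baseUrl };
    case "google":
      return { sdkPackage: "@ai-sdk/google", modelId: req.modelId, baseUrl };
    case "openai":
    case "openai-compatible":
      // Compatible providers share the OpenAI SDK and differ only in base URL.
      return { sdkPackage: "@ai-sdk/openai", modelId: req.modelId, baseUrl };
  }
}
```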

### Phase 4: Generation Pipeline

```
Me: Build the two-stage generation pipeline for creating classroom content from PDF.

Do:
1. Create prompt template system: lib/prompts/ with loader.ts (loadPrompt, buildPrompt, interpolateVariables, processSnippets, processConditionalBlocks), types.ts (PromptId, SnippetId). Templates in templates/{promptId}/system.md + user.md. Snippets in snippets/*.md. Syntax: {{variable}}, {{snippet:name}}, {{#if flag}}...{{/if}}.
2. Create 20+ prompt templates: requirements-to-outlines, interactive-outlines, slide-content, quiz-content, slide-actions, quiz-actions, interactive-actions, simulation-content, diagram-content, code-content, game-content, visualization3d-content, widget-teacher-actions, pbl-actions, pbl-design, agent-system (4 variants: base, wb-teacher, wb-assistant, wb-student), director, web-search-query-rewrite
3. Stage 1 — Outline Generator: generateSceneOutlinesFromRequirements() builds prompt from PDF content + user requirements + agent info + language directive. SSE API endpoint streams outlines as incremental JSON objects. Frontend GenerationPreview page shows step visualization.
4. Stage 2 — Scene Generator: generateSceneContent(outline, context, model) dispatches to slide/quiz/interactive/PBL generators based on outline.type. generateSceneActions(content, outline, context, model) generates teacher speech + visual action sequences. createSceneWithActions() assembles final Scene.
5. Scene builder: buildSceneFromOutline() converts generated content to PPTElement[]. uniquifyMediaElementIds() ensures globally unique IDs for media placeholders.
6. Interactive post-processor: sanitizes HTML, injects CSS isolation, wraps in responsive container.
7. Action parser: parseActionsFromStructuredOutput() extracts [{type:"action", name, params}, {type:"text", content}] from LLM JSON output.
8. JSON repair: parseJsonResponse() handles malformed LLM JSON with bracket balancing, markdown fence stripping, partial parse recovery.
9. Pipeline runner: createGenerationSession() + runGenerationPipeline() orchestrates the full flow with callbacks.
10. API routes: /api/generate/scene-outlines-stream (SSE), /api/generate/scene-content (POST), /api/generate/scene-actions (POST), /api/generate/agent-profiles (POST)
```
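
The template syntax in step 1 (`{{variable}}`, `{{snippet:name}}`, `{{#if flag}}...{{/if}}`) can be illustrated with a small renderer. This is a minimal sketch, not the project's `loader.ts`: the function name `renderTemplate`, the processing order (snippets, then conditionals, then variables), and the choice to leave unknown variables untouched are all assumptions, and nested `{{#if}}` blocks are not handled.

```typescript
// Hypothetical renderer for the three template constructs (not the real loader.ts).
type TemplateVars = Record<string, string | number | boolean | undefined>;

function renderTemplate(
  template: string,
  vars: TemplateVars,
  snippets: Record<string, string>,
): string {
  // 1. Inline snippets first so placeholders inside them are processed too.
  const withSnippets = template.replace(
    /\{\{snippet:([\w-]+)\}\}/g,
    (_match, name: string) => snippets[name] ?? "",
  );
  // 2. Keep or drop conditional blocks (non-nested only in this sketch).
  const withConditionals = withSnippets.replace(
    /\{\{#if (\w+)\}\}([\s\S]*?)\{\{\/if\}\}/g,
    (_match, flag: string, body: string) => (vars[flag] ? body : ""),
  );
  // 3. Substitute variables; unknown names stay visible to ease debugging.
  return withConditionals.replace(/\{\{(\w+)\}\}/g, (match, name: string) => {
    const value = vars[name];
    return value === undefined ? match : String(value);
  });
}

const prompt = renderTemplate(
  "Topic: {{topic}}. {{snippet:lang}}{{#if hasPdf}} Use the PDF.{{/if}}",
  { topic: "Optics", language: "English", hasPdf: false },
  { lang: "Answer in {{language}}." },
);
// prompt === "Topic: Optics. Answer in English."
```

Resolving snippets before variables lets a snippet carry its own `{{variable}}` placeholders, which is one plausible reason for a composable snippet directory.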

### Phase 5: Slide Renderer & Canvas

```
Me: Build the full slide renderer with PowerPoint-compatible elements and interactive canvas.

Do:
1. Create Editor/Canvas with: viewport scaling (useViewportSize), drag-to-select (useMouseSelection), element selection (useSelectElement), element dragging (useDragElement), element scaling (useScaleElement with 8 resize handles), element rotation (useRotateElement), alignment lines (AlignmentLine), grid lines (GridLines), ruler (Ruler), drop support (useDrop)
2. Create 10 element renderers:
   - TextElement: ProseMirror editor with custom schema (paragraph, heading, bulletList, orderedList, hardBreak, marks: bold, italic, underline, strikethrough, color, backgroundColor, fontSize, fontFamily, textAlign, textIndent, lineHeight, superscript, subscript, link)
   - ImageElement: clip paths (rect, ellipse, polygon), filters, flip, shadow, outline
   - ShapeElement: 20+ SVG path formulas (roundRect, triangle, parallelogram, trapezoid, etc), gradient fills (linear/radial), pattern fills
   - LineElement: cubic bezier curves, arrow markers, point dragging (useDragLineElement)
   - ChartElement: ECharts integration (bar, line, pie, scatter, radar, area)
   - TableElement: cell editing, merge, border styling
   - LatexElement: KaTeX rendering with Temml fallback
   - VideoElement: HTML5 video with poster, autoplay control
   - CodeElement: Shiki syntax highlighting with 50+ language grammars
   - AudioElement: audio player UI
3. Create operate overlays: CommonElementOperate, ImageElementOperate, ShapeElementOperate (keypoint drag for path shapes), LineElementOperate (endpoint drag), TableElementOperate, TextElementOperate, MultiSelectOperate
4. Create ThumbnailSlide for sidebar scene list (scaled-down readonly render)
5. Create ThumbnailInteractive for interactive widget previews
6. Create ScreenElement and ScreenCanvas for presentation mode
7. Create ViewportBackground (slide background with solid/gradient/image fills)
8. Create element hooks: useElementFill, useElementFlip, useElementOutline, useElementShadow
9. Create canvas operations hook: useCanvasOperations with element CRUD, alignment, distribution, z-order, grouping
```
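
Viewport scaling (step 1) boils down to fitting a fixed-size slide into whatever space the container offers. The sketch below shows that calculation as a pure function; `fitViewport` and its types are hypothetical names, and the real `useViewportSize` hook may track more state (zoom level, resize observers).

```typescript
// Hypothetical helper: fit a slide into a container while keeping its aspect ratio.
interface Size {
  width: number;
  height: number;
}

interface ViewportFit {
  scale: number; // factor applied to slide coordinates
  left: number; // horizontal offset that centers the slide
  top: number; // vertical offset that centers the slide
}

function fitViewport(container: Size, slide: Size): ViewportFit {
  // The limiting dimension decides the scale so nothing overflows.
  const scale = Math.min(
    container.width / slide.width,
    container.height / slide.height,
  );
  return {
    scale,
    left: (container.width - slide.width * scale) / 2,
    top: (container.height - slide.height * scale) / 2,
  };
}

const fit = fitViewport({ width: 500, height: 400 }, { width: 1000, height: 500 });
// fit.scale === 0.5, fit.left === 0, fit.top === 75
```

Pointer coordinates from drag, scale, and rotate handlers would then be divided by `scale` to convert screen pixels back into slide units.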

### Phase 6: Multi-Agent Orchestration

```
Me: Build the LangGraph-based multi-agent orchestration system.

Do:
1. Create OrchestratorState (LangGraph Annotation.Root): messages, storeState, availableAgentIds, maxTurns, languageModel, thinkingConfig, discussionContext, triggerAgentId, userProfile, agentConfigOverrides, turnSummaries[], whiteboardActions[], nextAgentId, isComplete, generatedChunks
2. Create director node: LLM-based multi-agent decision (who speaks next, what to do). Code fast-paths for turn 0 (trigger agent) and turn limits. Single-agent mode: pure code logic, no LLM call.
3. Create agent_generate node: resolves AgentConfig, builds structured prompt via buildStructuredPrompt(), streams LLM response, parses structured chunks [{type, name/content}], emits StatelessEvent via config.writer()
4. Create StateGraph: START → director → (end→END | next→agent_generate→director loop)
5. Create buildStructuredPrompt(): combines role guidelines, persona, state context (current slide elements), whiteboard ledger (spatial layout of all whiteboard elements), peer context (other agents' recent actions), available action descriptions, format examples
6. Create summarizers: conversation-summary (compress old messages), message-converter (UIMessage → OpenAI format), state-context (current slide description), whiteboard-ledger (virtual whiteboard spatial state), whiteboard-conflicts (detect conflicting draws), peer-context (recent agent actions)
7. Create director-prompt: buildDirectorPrompt() with agent profiles, conversation history, available tools. parseDirectorDecision() extracts {nextAgentId, reason, isComplete}
8. Create tool-schemas: getEffectiveActions(role) returns allowed action schemas. getActionDescriptions() generates human-readable action docs.
9. Create AISdkLangGraphAdapter: bridges Vercel AI SDK LanguageModel to LangGraph's BaseChatModel interface
10. Create statelessGenerate(): entry point called by /api/chat, invokes graph.stream(), yields StatelessEvent SSE chunks
```
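
The director's code fast-paths (step 2) can be sketched without any LangGraph dependency. This is an illustration under assumptions: `decideNext`, the `turn` counter, and the single-agent rule (speak once, then finish) are hypothetical, and `askLlm` stands in for the LLM-based decision.

```typescript
// Hypothetical director decision with code fast-paths before any LLM call.
interface DirectorState {
  turn: number;
  maxTurns: number;
  availableAgentIds: string[];
  triggerAgentId?: string;
}

interface DirectorDecision {
  nextAgentId: string | null;
  isComplete: boolean;
  reason: string;
}

function decideNext(
  state: DirectorState,
  askLlm: (state: DirectorState) => DirectorDecision,
): DirectorDecision {
  // Fast path: turn limit reached, end the graph.
  if (state.turn >= state.maxTurns) {
    return { nextAgentId: null, isComplete: true, reason: "turn limit reached" };
  }
  // Fast path: turn 0 goes to the trigger agent when one is set.
  if (state.turn === 0 && state.triggerAgentId) {
    return { nextAgentId: state.triggerAgentId, isComplete: false, reason: "trigger agent" };
  }
  // Single-agent mode: pure code logic (assumed here: one reply, then done).
  if (state.availableAgentIds.length === 1) {
    return state.turn === 0
      ? { nextAgentId: state.availableAgentIds[0], isComplete: false, reason: "single agent" }
      : { nextAgentId: null, isComplete: true, reason: "single agent finished" };
  }
  // Otherwise fall back to the LLM-based multi-agent decision.
  return askLlm(state);
}
```

In the graph from step 4, `isComplete: true` corresponds to the `end→END` edge and any other decision to the `next→agent_generate→director` loop.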

### Phase 7: Playback Engine & Roundtable

```
Me: Build the PlaybackEngine state machine and Roundtable UI.

Do:
1. Create PlaybackEngine class: state machine (idle/playing/paused/live), consumes Scene.actions[] via ActionEngine, manages scene transitions, handles discussion triggers, speed control (1x/1.25x/1.5x/2x)
2. Create ActionEngine: processes action queue — spotlight (dim other elements, highlight target), laser (red pointer effect), speech (fetch TTS → AudioPlayer → wait for completion), wb_open/close (toggle whiteboard overlay), wb_draw_* (add elements to whiteboard), wb_delete/clear, discussion (pause playback, switch to live mode), play_video, widget actions
3. Create AudioPlayer: Web Audio API wrapper with play/pause/stop, speed adjustment, volume control, onEnd callback
4. Create PlaybackEngine callbacks: onModeChange, onSceneChange, onActionStart/End, onSpeechStart/End, onDiscussionTrigger, onComplete
5. Create computePlaybackView() — derives presentation state from engine: currentSpeech, speakingAgentId, audioState, progress
6. Create Stage component (main container): integrates SceneSidebar + CanvasArea + Roundtable + ChatArea. Manages PlaybackEngine lifecycle, discussion flow (trigger → live chat → end → resume), TTS during discussions via useDiscussionTTS
7. Create Roundtable component: voice waveform bars, agent avatar ring with speaking indicator, chat input with send button + voice recording, proactive discussion cards, slide navigation (prev/next), playback controls (play/pause/speed), presentation mode toggle, whiteboard toggle, thinking state display, end flash animation
8. Create PresentationSpeechOverlay: full-screen speech text display during presentation mode
9. Create SceneSidebar: scene thumbnail list with drag-to-reorder, generation progress indicators, failed outline retry, home navigation
10. Create Header: back button, settings gear, theme switcher, language switcher, export dropdown (PPTX, classroom ZIP)
```
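
The four PlaybackEngine modes (idle/playing/paused/live) can be captured as a transition table. The sketch below is an assumption-laden illustration: the event names and the exact set of allowed transitions are hypothetical, chosen to match the discussion flow described in step 6 (trigger, live chat, end, resume).

```typescript
// Hypothetical transition table for the playback modes; event names are assumed.
type PlaybackMode = "idle" | "playing" | "paused" | "live";
type PlaybackEvent = "play" | "pause" | "resume" | "discussion" | "endDiscussion" | "stop";

const transitions: Record<PlaybackMode, Partial<Record<PlaybackEvent, PlaybackMode>>> = {
  idle: { play: "playing" },
  playing: { pause: "paused", discussion: "live", stop: "idle" },
  paused: { resume: "playing", discussion: "live", stop: "idle" },
  // Ending a discussion resumes playback of the remaining actions.
  live: { endDiscussion: "playing", stop: "idle" },
};

function nextMode(mode: PlaybackMode, event: PlaybackEvent): PlaybackMode {
  // Events that are not valid in the current mode are ignored.
  return transitions[mode][event] ?? mode;
}

let mode: PlaybackMode = "idle";
for (const event of ["play", "discussion", "pause", "endDiscussion"] as PlaybackEvent[]) {
  mode = nextMode(mode, event);
}
// mode === "playing": "pause" is ignored while live, "endDiscussion" resumes playback
```

A callback such as `onModeChange` would fire whenever `nextMode` returns a value different from the current mode.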

### Phase 8: Interactive Widgets & PBL

```
Me: Build the interactive widget system and PBL mode.

Do:
1. Create 5 widget content generators (each calls LLM with specialized prompts):
   - simulation-content: generates HTML with variable sliders, canvas/SVG visualization, physics formulas
   - diagram-content: generates React Flow JSON (nodes, edges, layout)
   - code-content: generates executable code with output panel, language selector
   - game-content: generates HTML5 game with scoring, levels, educational goals
   - visualization3d-content: generates Three.js scene with camera controls, annotations
2. Create InteractiveRenderer: sandboxed iframe loading widget HTML, postMessage bridge for teacher actions (highlight, setState, annotation, reveal)
3. Create widget teacher action generation: widget-teacher-actions prompt generates action sequence for teacher to guide students through widget
4. Create useWidgetIframeStore: register/unregister iframes, send setState/highlight/annotation/reveal messages
5. Create PBL generation system:
   - generatePBLContent() using Vercel AI SDK generateText with tools and a step limit (stopWhen: stepCountIs(n))
   - MCP tools: ModeMCP (set PBL mode), ProjectMCP (set project config), AgentMCP (create agent roles), IssueboardMCP (create issues with acceptance criteria)
   - buildPBLSystemPrompt() with project topic, skills, language directive
6. Create PBL renderer components:
   - PBLRenderer: main container with role selection → workspace
   - RoleSelection: choose student role from generated options
   - ChatPanel: per-role chat with @mention routing to agents
   - IssueboardPanel: kanban-style issue tracking
   - Workspace: collaborative workspace area
   - Guide: step-by-step project guide
7. Create /api/pbl/chat endpoint: handles @mention routing, generates agent responses per role
```
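
Because widget HTML is LLM-generated, messages crossing the postMessage bridge (steps 2 and 4) are worth validating on both sides. The following sketch shows a type guard for the four teacher actions; the payload fields (`target`, `state`, `text`) are assumptions, since the prompt names only the action types.

```typescript
// Hypothetical payload shapes for the four teacher actions named above.
type TeacherAction =
  | { type: "highlight"; target: string }
  | { type: "setState"; state: Record<string, unknown> }
  | { type: "annotation"; target: string; text: string }
  | { type: "reveal"; target: string };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Validates untrusted message data before the widget (or the host) acts on it.
function isTeacherAction(data: unknown): data is TeacherAction {
  if (!isRecord(data)) return false;
  switch (data.type) {
    case "highlight":
    case "reveal":
      return typeof data.target === "string";
    case "annotation":
      return typeof data.target === "string" && typeof data.text === "string";
    case "setState":
      return isRecord(data.state);
    default:
      return false;
  }
}
```

Inside the iframe, a `message` event listener would call `isTeacherAction(event.data)` and drop anything that fails the check.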

### Phase 9: Media Generation Pipeline

```
Me: Build the media generation pipeline for AI images and videos.

Do:
1. Create MediaGenerationStore: tasks Map<elementId, {status, objectUrl, blob, error}>, enqueueTasks(), completeTask(), failTask(), restoreFromDB(), revokeObjectUrls()
2. Create media orchestrator: generateMediaForOutlines() collects all media requests from outlines[].mediaGenerations, filters by enabled providers, processes serially (API concurrency limits)
3. Create image provider adapters:
   - Seedream (ByteDance): POST to ark.cn-beijing.volces.com with HMAC auth
   - OpenAI Image: POST to /v1/images/generations
   - Qwen Image: POST to dashscope with async task pattern
   - Nano Banana (Google Gemini image model): POST to the Google Gemini API
   - MiniMax Image: POST to api.minimax.chat
   - Grok Image: POST to api.x.ai
4. Create video provider adapters (all async task pattern: submit → poll → download):
   - Seedance (ByteDance): HMAC-signed requests
   - Kling (Kuaishou): JWT auth, task polling
   - Veo (Google DeepMind): OAuth, long-running operations
   - MiniMax Video, Grok Video
5. Each adapter: generate(config, options) → {url, blob}, testConnectivity(config) → boolean
6. Create /api/generate/image and /api/generate/video endpoints
7. Create /api/proxy-media endpoint for CORS proxy of remote media URLs
8. Create /api/verify-image-provider and /api/verify-video-provider for credential testing
9. Create MediaPopover UI: shows generation progress per media element, retry failed, preview generated media
```
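
The async task pattern used by the video adapters (submit → poll → download) can be sketched as a generic polling helper. Everything here is hypothetical: `pollTask`, the `TaskStatus` shape, and the injected `getStatus` and `sleep` functions are illustrative names, and real providers differ in status values and rate limits.

```typescript
// Hypothetical status shape; real providers use their own field names.
type TaskStatus =
  | { state: "pending" }
  | { state: "succeeded"; url: string }
  | { state: "failed"; error: string };

interface PollOptions {
  intervalMs: number;
  maxAttempts: number;
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

// Polls until the task succeeds, fails, or the attempt budget runs out.
async function pollTask(
  taskId: string,
  getStatus: (taskId: string) => Promise<TaskStatus>,
  options: PollOptions,
): Promise<string> {
  const sleep =
    options.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 0; attempt < options.maxAttempts; attempt++) {
    const status = await getStatus(taskId);
    if (status.state === "succeeded") return status.url;
    if (status.state === "failed") {
      throw new Error(`Task ${taskId} failed: ${status.error}`);
    }
    await sleep(options.intervalMs);
  }
  throw new Error(`Task ${taskId} timed out after ${options.maxAttempts} polls`);
}
```

An adapter's `generate()` would submit the job, pass the returned task ID to a helper like this, then download the resulting URL into a Blob.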

### Phase 10: Audio System (TTS/ASR)

```
Me: Build the TTS and ASR audio system with 8+ providers.

Do:
1. Create TTS provider registry with configs: openai-tts (alloy/echo/fable/onyx/nova/shimmer), azure-tts (500+ voices from azure.json), glm-tts, qwen-tts (sambert voices), minimax-tts (3 models), doubao-tts, elevenlabs-tts, voxcpm (custom voice cloning), browser-tts (Web Speech API)
2. Each TTS provider: id, name, requiresApiKey, defaultBaseUrl, icon, voices[], supportedFormats, speedRange
3. Create generateTTS(config, text) router: dispatches to provider-specific functions, returns {audio: Uint8Array, format: string}
4. Create ASR provider registry: openai-whisper, qwen-asr, browser-asr
5. Create transcribeAudio(config, audioBlob) router
6. Create voice resolver: getAvailableProvidersWithVoices(), maps agent voiceConfig to provider+voice
7. Create VoxCPM integration: custom voice profiles, vLLM model support, voice cloning
8. Create /api/generate/tts endpoint (single TTS generation)
9. Create /api/transcription endpoint
10. Create /api/azure-voices endpoint (Azure voice list)
11. Create useAudioRecorder hook: MediaRecorder API, audio visualization, silence detection
12. Create useBrowserTTS hook: Web Speech API fallback
13. Create useDiscussionTTS hook: manages TTS lifecycle during live discussions, queues speech, handles interruption
14. Create useTTSPreview hook: preview voice in settings
15. Create SpeechButton component: toggle voice recording with waveform
16. Create TTSConfigPopover: voice selector, speed slider, provider selector
```
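
The voice resolver in step 6 maps an agent's voice configuration to a usable provider and voice. The sketch below is one possible reading: the fallback policy (first usable provider, first voice) and all type and function names are assumptions, not the project's implementation.

```typescript
// Hypothetical shapes; the real registry carries more fields (see step 2).
interface TTSProviderInfo {
  id: string;
  requiresApiKey: boolean;
  voices: string[];
}

interface VoiceConfig {
  providerId: string;
  voiceId: string;
}

function resolveVoice(
  wanted: VoiceConfig | undefined,
  providers: TTSProviderInfo[],
  configuredKeys: Set<string>,
): VoiceConfig | null {
  // A provider is usable when it has voices and either needs no key or has one.
  const usable = providers.filter(
    (p) => p.voices.length > 0 && (!p.requiresApiKey || configuredKeys.has(p.id)),
  );
  const match = wanted ? usable.find((p) => p.id === wanted.providerId) : undefined;
  if (match && wanted) {
    // Keep the requested voice if it exists, otherwise use the provider's first voice.
    const voiceId = match.voices.includes(wanted.voiceId) ? wanted.voiceId : match.voices[0];
    return { providerId: match.id, voiceId };
  }
  // Assumed fallback: first usable provider in registry order.
  const fallback = usable[0];
  return fallback ? { providerId: fallback.id, voiceId: fallback.voices[0] } : null;
}

const registry: TTSProviderInfo[] = [
  { id: "openai-tts", requiresApiKey: true, voices: ["alloy", "nova"] },
  { id: "browser-tts", requiresApiKey: false, voices: ["default"] },
];
```

With no API key configured, a request for `openai-tts` falls through to `browser-tts`, which matches the role of the Web Speech API as a keyless fallback.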

### Phase 11: Chat System & Streaming

```
Me: Build the chat system with SSE streaming and session management.

Do:
1. Create StatelessChatRequest type: messages (UIMessage[]), storeState ({stage, scenes, currentSceneId, mode}), config ({agentIds, sessionType, maxTurns, triggerAgentId}), model, apiKey, baseUrl, providerType, userProfile, agentConfigs
2. Create StatelessEvent type: {type: 'text'|'tool_call'|'error'|'done', agentId?, content?, toolName?, args?}
3. Create /api/chat POST endpoint: validates request, resolves model, invokes statelessGenerate(), streams SSE events via ReadableStream + TextEncoder
4. Create useChatSessions hook (53KB): manages multiple ChatSession instances per scene, sendMessage() → fetch SSE → parse events → update messages, handleInterrupt() → abort controller, auto-create QA session on first user message, create discussion session from proactive card, persist sessions to IndexedDB, restore on load
5. Create StreamBuffer: accumulates text chunks from SSE, reveals words incrementally for natural TTS sync, tracks reveal progress (0-1) for auto-scroll
6. Create ChatArea component: session tab list, message list with agent avatars + colors, inline action tags (spotlight/highlight buttons in messages), lecture notes view (extracted from speech actions), typing indicator
7. Create ChatSession component: handles individual session rendering, message input, send/interrupt buttons
8. Create ProactiveCard component: discussion invitation cards with topic, prompt, accept/skip buttons, animation
9. Create InlineActionTag component: clickable action buttons within messages (triggers spotlight/insert on slide)
10. Create LectureNotesView: extracts and displays all speech text from actions as structured notes
```
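
On the client, `sendMessage()` has to turn decoded stream chunks back into `StatelessEvent` objects, and a chunk boundary can fall in the middle of a line. The sketch below handles that with a carry-over buffer. It assumes the standard SSE `data:` line prefix with one JSON object per line; `parseSseChunk` is a hypothetical name.

```typescript
// Event shape taken from step 2; args is typed loosely for this sketch.
interface StatelessEvent {
  type: "text" | "tool_call" | "error" | "done";
  agentId?: string;
  content?: string;
  toolName?: string;
  args?: unknown;
}

// Parses all complete lines in buffer + chunk and returns the unfinished tail.
function parseSseChunk(
  buffer: string,
  chunk: string,
): { events: StatelessEvent[]; rest: string } {
  const lines = (buffer + chunk).split("\n");
  // The last piece may be an incomplete line; keep it for the next chunk.
  const rest = lines.pop() ?? "";
  const events: StatelessEvent[] = [];
  for (const line of lines) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("data:")) continue; // skip blank lines and comments
    const payload = trimmed.slice("data:".length).trim();
    if (!payload) continue;
    try {
      events.push(JSON.parse(payload) as StatelessEvent);
    } catch {
      // Malformed JSON line: skipped in this sketch.
    }
  }
  return { events, rest };
}

const first = parseSseChunk("", 'data: {"type":"text","content":"Hel');
const second = parseSseChunk(first.rest, 'lo"}\n\ndata: {"type":"done"}\n\n');
// first.events is empty; second.events holds the text event and the done event
```

The chunks passed in would come from `TextDecoder.decode(value, { stream: true })` so that multi-byte characters split across reads are decoded correctly.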

### Phase 12: Settings, i18n, Export, Polish

```
Me: Build settings, internationalization, export system, and polish everything.

Do:
1. Create SettingsDialog with tabbed sections: General (theme, language, access code), Model (provider selector, model selector with search, API key input, base URL, thinking config toggle), Audio (TTS provider/voice/speed, ASR provider, per-agent voice assignment), Image (provider, model, API key, aspect ratio), Video (provider, model, API key), PDF (provider, API key), Web Search (provider, API key), Agent (agent list with add/edit/delete, persona editor, action permissions, priority)
2. Create ModelSelector: searchable dropdown with provider grouping, model capabilities badges (vision, tools, thinking), context window display
3. Create AddProviderDialog: custom provider registration with name, base URL, API key, models
4. Create ProviderConfigPanel: provider-specific settings form
5. Create i18n translation files for all 5 locales (1500+ translation keys each): home, classroom, settings, generation, chat, quiz, whiteboard, export, agents, audio, errors, common
6. Create LanguageSwitcher component: dropdown with locale labels + short codes
7. Create PowerPoint export: useExportPPTX hook converts scenes to PPTX via pptxgenjs. Handles: text with rich formatting, images with clip paths, shapes with SVG paths, charts (ECharts → static image), tables, LaTeX (KaTeX → MathML → OMML), videos (poster image), code blocks (syntax-highlighted HTML)
8. Create classroom ZIP export/import: useExportClassroom hook creates ZIP with manifest.json + scenes + audio + images + agents. useImportClassroom hook parses ZIP and restores to IndexedDB.
9. Create HTML parser for PPTX: lexer → parser → format → stringify pipeline for converting ProseMirror HTML to PPTX rich text runs
10. Create LaTeX to OMML converter chain: KaTeX → MathML → OMML (via mathml2omml package) for PowerPoint math equations
11. Create AccessCodeGuard + AccessCodeModal: HMAC-signed token verification, cookie persistence
12. Create ServerProvidersInit: fetches /api/server-providers on mount, merges into settings store
13. Create UserProfile component: expandable pill with avatar picker (12 built-in + custom upload), nickname editor, bio textarea
14. Create GeneratingProgress component: step-by-step progress during classroom generation
15. Create OutlinesEditor: edit generated outlines before scene generation (reorder, delete, rename)
16. Add all CSS animations, transitions, hover states, dark mode variants
17. Verify: full build succeeds, all routes render, settings persist, i18n switches correctly
```
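
HMAC-signed token verification (step 11) can be sketched with Node's built-in `crypto` module. This assumes verification runs server-side in Node and that the token format is `payload.signature` with a hex SHA-256 HMAC; the format, the function names, and the secret handling are all assumptions.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical token format: "<payload>.<hex HMAC-SHA256 of payload>".
function signToken(payload: string, secret: string): string {
  const signature = createHmac("sha256", secret).update(payload).digest("hex");
  return `${payload}.${signature}`;
}

// Returns the payload when the signature matches, otherwise null.
function verifyToken(token: string, secret: string): string | null {
  const dot = token.lastIndexOf(".");
  if (dot <= 0) return null;
  const payload = token.slice(0, dot);
  const encoder = new TextEncoder();
  const given = encoder.encode(token.slice(dot + 1));
  const expected = encoder.encode(
    createHmac("sha256", secret).update(payload).digest("hex"),
  );
  // timingSafeEqual throws on length mismatch, so check lengths first.
  if (given.length !== expected.length) return null;
  return timingSafeEqual(given, expected) ? payload : null;
}

const token = signToken("access:granted", "demo-secret");
```

The signed token would then be stored in a cookie, and each protected request re-runs `verifyToken` instead of trusting the cookie contents.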

---

## 🔧 Key Technical Decisions

| Decision | Rationale |
|----------|-----------|
| **Zustand over Redux** | Simpler API, better TypeScript support, no boilerplate, built-in persist middleware |
| **Dexie over raw IndexedDB** | Type-safe queries, promise-based API, versioned migrations, compound indexes |
| **Vercel AI SDK** | Unified streaming interface across 15+ providers, built-in tool calling, thinking support |
| **LangGraph for orchestration** | Stateful graph execution, conditional routing, streaming writer API, battle-tested |
| **ProseMirror over Slate/TipTap** | Lower-level control needed for PowerPoint-compatible rich text, custom schema/marks |
| **Vite over Next.js (conversion)** | Client-only SPA, no SSR needed, faster builds, simpler deployment |
| **File-based prompts** | Version-controllable, composable via snippets, conditional blocks for feature flags |
| **ActionEngine pattern** | Unified sync/async action execution, same types for live streaming and playback |
| **IndexedDB for everything** | Offline-first, large blob storage (audio/images), no server dependency for user data |
| **iframe sandbox for widgets** | Security isolation for LLM-generated HTML/JS, postMessage for controlled communication |
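
To make the ActionEngine row concrete, here is a minimal sketch of unified sync/async execution: visual actions return immediately, while speech is awaited before the next action runs. The `Action` union is reduced to two members and the handler interface is hypothetical.

```typescript
// Reduced, hypothetical action union; the real engine handles many more types.
type Action =
  | { type: "spotlight"; elementId: string }
  | { type: "speech"; text: string };

interface ActionHandlers {
  spotlight(elementId: string): void; // synchronous visual effect
  speak(text: string): Promise<void>; // resolves when audio finishes
}

// Runs actions in order; the same loop can serve playback and live streaming.
async function runActions(actions: Action[], handlers: ActionHandlers): Promise<void> {
  for (const action of actions) {
    switch (action.type) {
      case "spotlight":
        handlers.spotlight(action.elementId);
        break;
      case "speech":
        // Block the queue until speech completes so visuals stay in sync.
        await handlers.speak(action.text);
        break;
    }
  }
}
```

Because both playback and live chat produce the same `Action` values, only the source of the queue differs between the two modes.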

---

## 🖼 Frontend — Full UI Build Prompt

### F1: Page Layouts & Routing

**4 pages, all lazy-loaded via `React.lazy()` + `<Suspense>`:**

| Route | Component | Layout |
|-------|-----------|--------|
| `/` | `HomePage` (890 lines) | Full-screen gradient bg (`from-slate-50 to-slate-100`). Top-right floating toolbar pill (glass morphism: bg-white/60 backdrop-blur-md border rounded-full) with 3-way theme toggle (Sun/Moon/Monitor icons cycling), LanguageSwitcher dropdown, Settings gear button. Center: animated logo + tagline. Main card: requirement textarea (auto-resize) + PDF upload button + InteractiveMode toggle (Atom icon with breathing glow) + GenerationToolbar below. Below main card: collapsible "Recent Classrooms" grid (responsive 1→2→3→4 cols) with ThumbnailSlide previews, date badges, rename/delete with inline confirmation overlay. User profile expandable pill (top-left). Classroom import via hidden file input. Full search with real-time filtering. |
| `/generation-preview` | `GenerationPreviewPage` (900 lines) | Centered card with vertical step progress. Steps: PDF Analysis (scanning laser animation), Web Search (globe + card stack), Outline Generation (streaming outline cards with type icons), Agent Generation (reveals agents with `AgentRevealModal`), Scene Content (slide assembly animation), Actions (action sequence visualization). Error state with retry. Auto-navigates to `/classroom/:id` on completion. |
| `/classroom/:id` | `ClassroomPage` (180 lines) | Loads classroom from IndexedDB (or server fallback), restores agents, auto-resumes pending generation. Full-screen flex layout: `SceneSidebar` (left, collapsible) + center column (`Header` + `CanvasArea` with `Whiteboard` overlay) + `ChatArea` (right, resizable width, collapsible) + `Roundtable` (bottom overlay). |
| `/eval/whiteboard` | `WhiteboardEvalPage` (60 lines) | Debug tool. Bootstraps synthetic stage/scene, renders `ScreenElement` for whiteboard layout evaluation. |

### F2: Component Hierarchy (203 files)

**Ten largest components and hooks by line count:**

| Component | Lines | Responsibility |
|-----------|-------|---------------|
| `Roundtable` | 2094 | Main classroom interaction: 12-bar voice waveform, agent avatar ring with speaking indicator, chat input with Send/Mic toggle, ProactiveCard for discussion invitations, slide nav, playback controls (Play/Pause + speed 1×→2×), presentation mode, whiteboard toggle, volume slider with mute, thinking state, end-of-discussion flash, PresentationSpeechOverlay |
| `useChatSessions` | 1525 | Hook managing all chat state: session CRUD, SSE streaming (fetch → ReadableStream → TextDecoder → JSON.parse per line), abort controller for interruption, tool call execution, IndexedDB persistence |
| `Stage` | 1271 | Master orchestrator: creates PlaybackEngine, wires all callbacks, manages ActionEngine + AudioPlayer lifecycles, computes playbackView, handles discussion flow, connects useDiscussionTTS |
| `PromptInput` | 1267 | Rich text input with @mention support, file attachments, voice recording integration, model selector inline |
| `TTSSettings` | 1264 | Full TTS configuration: provider list with enable/disable toggles, voice preview with play button, speed slider, custom model CRUD, VoxCPM configuration |
| `SettingsDialog` | 1143 | Tabbed dialog (10 tabs), including providers, image, video, tts, asr, pdf, web-search, general, agents. Left sidebar navigation with icons. Provider list column pattern. |
| `AgentBar` | 997 | Agent selection/configuration for generation: horizontal scrollable agent cards with checkbox, voice config popover per agent, shuffle random selection |
| `QuizView` | 985 | Full quiz interface: 5 question types, 4 phases (not_started → answering → grading → reviewing), score pie chart, per-question feedback, retry, draft persistence |
| `GenerationToolbar` | 893 | Toolbar: model selector, PDF upload, PDF provider selector, web search toggle, thinking config, media settings popover |
| `AudioSettings` | 799 | TTS + ASR combined settings: provider cards with logo, API key inputs, voice selector with preview |

**Full directory structure with responsibilities (all under `components/`):**

```
access-code-guard.tsx      — Fetches /api/access-code/status, shows modal if auth needed
access-code-modal.tsx      — Animated overlay: shield icon, input, submit, success checkmark
agent/
  agent-avatar.tsx         — Avatar: URL→AvatarImage or emoji→AvatarFallback, 3 sizes (sm/md/lg), color ring
  agent-bar.tsx            — Agent selection bar with per-agent voice config popover
  agent-config-panel.tsx   — Edit agent: name, role, persona textarea, color picker, priority slider
  agent-reveal-modal.tsx   — Staggered card flip animation revealing generated agents with role icons and sparkle particles
ai-elements/               — 19 Vercel AI SDK UI components (message, prompt-input, code-block, reasoning, sources, etc.)
audio/
  speech-button.tsx        — Mic toggle with waveform bars animation, long-press to record
  tts-config-popover.tsx   — Voice + speed selector popover
canvas/
  canvas-area.tsx          — Main slide display: SceneRenderer + Whiteboard overlay + play hint + CanvasToolbar
  canvas-toolbar.tsx       — Bottom toolbar: sidebar toggle, slide nav, play/pause, volume, speed, whiteboard, presentation, chat toggle
chat/
  chat-area.tsx            — Right panel: session tabs, message list, lecture notes toggle
  chat-session.tsx         — Individual session: messages with agent avatars, input, send/stop buttons
  inline-action-tag.tsx    — Clickable action buttons in messages (spotlight/highlight)
  lecture-notes-view.tsx   — Extracted speech texts as structured notes
  proactive-card.tsx       — Discussion invitation: topic text, accept/skip, animated border gradient
  session-list.tsx         — Horizontal tab bar for QA/discussion/lecture sessions
  use-chat-sessions.ts     — Master hook for all chat state + SSE streaming
generation/
  generating-progress.tsx  — Step progress with completion checkmarks
  generation-toolbar.tsx   — Model + PDF + search + thinking + media toolbar
  media-popover.tsx        — Image/video provider/model/API key configuration popover
  outlines-editor.tsx      — Edit outlines: drag reorder, delete, rename, type badges
header.tsx                 — Top bar: back arrow, title, settings, theme switcher, language, export menu
language-switcher.tsx      — Dropdown: 5 locales with native labels + short codes
roundtable/                — index.tsx (2094), presentation-speech-overlay.tsx (498), audio-indicator.tsx, constants.ts
scene-renderers/           — classroom-complete.tsx, interactive-renderer.tsx, pbl-renderer.tsx, pbl/ (6 files), quiz-renderer.tsx, quiz-view.tsx
server-providers-init.tsx  — Side-effect: fetches server providers on mount
settings/                  — 17 files total (see F8 Phase 6 for details)
slide-renderer/
  Editor/Canvas/           — index.tsx (415), 5 canvas sub-components, Operate/ (7 files), hooks/ (11 hooks)
  Editor/                  — HighlightOverlay, LaserOverlay, ScreenCanvas, ScreenElement, SpotlightOverlay, ZoomWrapper
  components/element/      — 10 element types (Text, Image, Shape, Line, Chart, Table, Latex, Video, Code, Audio) + hooks/
  components/ThumbnailSlide/, ThumbnailInteractive/
stage.tsx (1271)           — Master classroom orchestrator
stage/scene-renderer.tsx   — Routes scene.type → Canvas/QuizRenderer/InteractiveRenderer/PBLRenderer
stage/scene-sidebar.tsx    — Left sidebar: home, thumbnail list, generation progress, failed retry
ui/                        — 32 shadcn/ui primitives (see F3)
user-profile.tsx           — Expandable pill: avatar picker, name editor, bio textarea
whiteboard/                — index.tsx (container), whiteboard-canvas.tsx (445), whiteboard-history.tsx
```

### F3: shadcn/ui Component Library (32 primitives)

All in `components/ui/`, built on Radix primitives + CVA:

`alert-dialog` (184), `alert` (73), `avatar` (96), `avatar-display` (29), `badge` (45), `button` (67, variants: default/destructive/outline/secondary/ghost/link, sizes: default/sm/lg/icon), `button-group` (78), `card` (92), `carousel` (231, embla-carousel-react), `checkbox` (28), `collapsible` (21), `combobox` (275, cmdk + popover), `command` (180, cmdk), `context-menu` (239), `dialog` (142), `dropdown-menu` (242), `field` (224, @base-ui/react), `hover-card` (38), `input` (19), `input-group` (144), `label` (21), `popover` (31), `progress` (31), `scroll-area` (55), `select` (184), `separator` (28), `slider` (25), `sonner` (45, uses custom useTheme), `switch` (29), `tabs` (80), `textarea` (18), `tooltip` (57)

### F4: Hooks & Contexts (15 hooks, 2 contexts)

| Hook | Lines | Purpose |
|------|-------|---------|
| `useCanvasOperations` | 587 | Element CRUD, alignment, distribution, z-order, group/ungroup, clipboard, delete, select all |
| `useSceneGenerator` | 576 | Orchestrates scene generation: generateRemaining(), retrySingleOutline(), stop() |
| `useDiscussionTTS` | 343 | TTS during live discussions: queue speech chunks, play sequentially, handle interruption |
| `useAudioRecorder` | 325 | MediaRecorder API: start/stop, audio visualization, silence detection, output Blob |
| `useOrderElement` | 191 | Z-order operations: bring to front, send to back, move forward/backward |
| `useBrowserASR` | 155 | Web Speech API recognition: start/stop, interim/final results, language |
| `useBrowserTTS` | 150 | Web Speech API synthesis: speak, cancel, voice selection, speed/pitch |
| `useStreamingText` | 124 | Word-by-word text reveal from StreamBuffer, progress tracking (0→1) |
| `useDraftCache` | 95 | Generic localStorage cache for form drafts with TTL |
| `useTheme` | 71 | Theme context: light/dark/system, resolvedTheme, media query listener, localStorage |
| `useI18n` | 66 | I18n context: locale, setLocale, t(), browser detection, localStorage |
| `useSlideBackgroundStyle` | 54 | Computes CSS background from SlideBackground type |
| `useHistorySnapshot` | 41 | Wraps snapshot store: push, undo, redo, canUndo/canRedo |
| `useExportPPTX` | ~1000 | PowerPoint export: PPTElement[] → pptxgenjs calls, HTML→rich text, SVG→polygon, LaTeX→OMML |
| `useExportClassroom` | ~200 | ZIP export: manifest.json + scenes + audio + images + agents |
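
As an illustration of the `useDraftCache` row, a TTL-based draft cache can be written against a minimal storage interface that `localStorage` satisfies. This is a sketch under assumptions: the entry format (`value` plus `expiresAt`) and the function names are hypothetical.

```typescript
// Subset of the Web Storage API; window.localStorage satisfies it.
interface KeyValueStorage {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

function saveDraft<T>(
  storage: KeyValueStorage,
  key: string,
  value: T,
  ttlMs: number,
  now: number = Date.now(),
): void {
  storage.setItem(key, JSON.stringify({ value, expiresAt: now + ttlMs }));
}

// Returns the cached value, or null when missing, expired, or corrupted.
function loadDraft<T>(
  storage: KeyValueStorage,
  key: string,
  now: number = Date.now(),
): T | null {
  const raw = storage.getItem(key);
  if (raw === null) return null;
  try {
    const entry = JSON.parse(raw) as { value: T; expiresAt: number };
    if (typeof entry.expiresAt !== "number" || entry.expiresAt <= now) {
      storage.removeItem(key); // expired entries are cleaned up lazily
      return null;
    }
    return entry.value;
  } catch {
    storage.removeItem(key);
    return null;
  }
}

// In-memory stand-in for localStorage, useful outside the browser.
const memory = new Map<string, string>();
const memoryStorage: KeyValueStorage = {
  getItem: (key) => memory.get(key) ?? null,
  setItem: (key, value) => {
    memory.set(key, value);
  },
  removeItem: (key) => {
    memory.delete(key);
  },
};
```

Passing `now` explicitly keeps the expiry logic testable without faking timers.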

**Contexts:** `SceneContext` (211 lines — provides current scene data via `SceneProvider`, `useSceneData()`, `useSceneSelector()`), `MediaStageContext` (18 lines — provides stageId for IndexedDB keys)

### F5: CSS System, Animations & Theming

**globals.css (218 lines):** `@import 'tailwindcss'` + `'tw-animate-css'` + `'shadcn/tailwind.css'`. `@custom-variant dark`. `@theme inline` with 30+ oklch color tokens. `:root` light theme (--primary: #722ed1 purple). `.dark` theme (--primary: #8b47ea). `--radius: 0.625rem` base.

**6 Keyframe Animations:** `wave` (audio bars), `breathing-bar-1/2/3` (speech indicators), `shimmer` (skeleton loading), `interactive-mode-breathe` (button glow)

**Motion Patterns:** `<motion.div initial/animate/exit>` for enter/exit, `<AnimatePresence>` for conditional, spring physics (`damping:20, stiffness:300`), `layout` for reflows, `staggerChildren:0.1`, gesture (`whileHover scale:1.02`, `whileTap scale:0.97`)

### F6: Configuration Objects (13 config files)

`shapes.ts` (1031 lines, 20+ SVG path formulas), `symbol.ts` (~700, unicode categories), `animation.ts` (~200, enter/exit animation defs), `theme.ts` (~100, 10+ preset themes), `hotkey.ts` (~130, keyboard shortcuts), `image-clip.ts` (~170, clip path presets), `latex.ts` (~200, symbol palette), `chart.ts` (~70, chart type presets), `font.ts` (~40, font families), `lines.ts` (~40, line styles), `element.ts` (~10, default dimensions), `mime.ts` (~15, MIME mapping), `storage.ts` (~3, localStorage keys)

### F7: One-Shot Frontend Mega Prompt

```
Me: Build the complete frontend for MultiMind Classroom — every component, page, hook, and animation.

Do: Create 203 React components, 15 hooks, 2 contexts, 32 shadcn/ui primitives, and 13 config files:

PAGES: HomePage (890 lines, gradient bg, floating toolbar pill with theme/language/settings, centered logo animation, main card with auto-resize Textarea + InteractiveMode Atom toggle + GenerationToolbar + gradient Generate button, collapsible Recent Classrooms responsive grid with ThumbnailSlide previews + rename/delete + search, UserProfileCard expandable pill). GenerationPreviewPage (900 lines, vertical step list with 6 StepVisualizer animations: PdfScan laser, WebSearch globe+cards, StreamingOutlines stagger, AgentReveal flip-in cards, Content assembly, Actions sequence, AgentRevealModal with staggered rotateY flip + role icons + color borders). ClassroomPage (180 lines, loads from IndexedDB, renders Stage). WhiteboardEvalPage (60 lines, debug tool).

STAGE (1271 lines): PlaybackEngine lifecycle, ActionEngine+AudioPlayer wiring, discussion flow state machine, useDiscussionTTS integration, fullscreen container ref, AlertDialog confirmations. Layout: SceneSidebar (left) + Header+CanvasArea+Whiteboard (center) + ChatArea (right, resizable) + Roundtable (bottom overlay).

ROUNDTABLE (2094 lines): 12 motion.div voice waveform bars (peaks 14-27px, durations 0.53-0.78s), agent avatar ring with color+speaking pulse, Textarea chat input + Send ArrowUp + Mic toggle, ProactiveCard gradient border, slide ChevronLeft/Right nav, Play/Pause toggle, speed dropdown 1×/1.25×/1.5×/2×, Repeat restart, Volume slider+mute, PencilLine whiteboard toggle, Maximize2 presentation mode, Loader2 thinking state, PresentationSpeechOverlay word-by-word reveal.

CANVAS: Editor/Canvas (415 lines) with 11 hooks (viewport, select, drag, scale with 8 handles, rotate, mouse selection, line drag, keypoint move, create, drop, common). 10 element renderers (Text/ProseMirror with 10 marks, Image with clip+filters, Shape with SVG paths+gradients, Line with bezier+markers, Chart/ECharts, Table, Latex/KaTeX, Video, Code/Shiki, Audio). 7 Operate overlays. ThumbnailSlide. ScreenElement/ScreenCanvas. Spotlight/Highlight/Laser overlays.

WHITEBOARD: Container with slide-up animation, toolbar (Eraser+History+Close), WhiteboardCanvas (pan/zoom, AnimatedElement staggered entrance scale 0→1 delay index*0.06s, cascade exit reverse-order rotate+scale→0), WhiteboardHistory snapshot timeline.

CHAT: useChatSessions (1525 lines, SSE streaming, abort, IndexedDB persistence), ChatArea (session tabs + message list + lecture notes), ChatSession (agent avatars+colors, markdown, inline action tags), ProactiveCard (animated gradient border), LectureNotesView.

QUIZ: QuizView (985 lines, 5 question types: single-choice radio, multiple-choice checkbox, fill-in-blank input, true-false toggle, short-answer textarea+voice. 4 phases: not_started→answering→grading→reviewing. Score pie chart, feedback accordion, draft persistence).

SETTINGS: SettingsDialog (1143 lines, 10 tabs with left nav icons, ProviderListColumn pattern), 17 sub-files. ModelSelector (Combobox with search, provider grouping, capability badges Eye/Wrench/Brain). AgentBar (997 lines, scrollable cards, voice config hierarchy popover). TTSSettings (1264 lines), AudioSettings (799 lines), ASRSettings (559 lines).

SCENE RENDERERS: ClassroomComplete (confetti 7 colors, scene type breakdown, score summary), InteractiveRenderer (iframe sandbox + postMessage), PBLRenderer with pbl/ sub-components (6 files, including RoleSelection, ChatPanel, IssueboardPanel, Workspace, Guide).

UI PRIMITIVES: 32 shadcn/ui components on Radix + CVA. Button variants (default/destructive/outline/secondary/ghost/link, sizes default/sm/lg/icon). Full dark mode. cn() utility everywhere.

All components use: cn() for conditional classes, useI18n() for all text, motion/react for animations, lucide-react for icons, readonly props, sonner for toasts, controlled state (useState).
```
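
The four QuizView phases named in the summary above (not_started→answering→grading→reviewing) can be read as a linear state machine. The following is a minimal sketch under that reading, not the project's actual code: the names `QuizPhase`, `nextQuizPhase`, and `canTransition` are hypothetical, and the rule that phases only move forward one step at a time is an assumption.

```typescript
// Phases from the QUIZ summary: not_started → answering → grading → reviewing.
type QuizPhase = "not_started" | "answering" | "grading" | "reviewing";

// Assumption: phases advance strictly in this order; the last one has no successor.
const QUIZ_PHASE_ORDER: readonly QuizPhase[] = [
  "not_started",
  "answering",
  "grading",
  "reviewing",
];

// Returns the next phase, or null when the quiz is already in "reviewing".
function nextQuizPhase(phase: QuizPhase): QuizPhase | null {
  const index = QUIZ_PHASE_ORDER.indexOf(phase);
  return QUIZ_PHASE_ORDER[index + 1] ?? null;
}

// Guards a transition so UI code cannot skip a phase,
// for example jumping from "answering" straight to "reviewing".
function canTransition(from: QuizPhase, to: QuizPhase): boolean {
  return nextQuizPhase(from) === to;
}
```

A component could call `canTransition(phase, "grading")` before submitting answers, which keeps the phase logic testable outside React.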

### F8: Multi-Shot Frontend Build (6 phases)

#### F-Phase 1: UI Primitives & Layout Shell

```
Me: Build the UI foundation — shadcn/ui components, layout shell, and page routing.

Do:
1. Create all 32 shadcn/ui components in components/ui/ with Radix primitives, CVA variants, cn() utility
2. Create globals.css: @import tailwindcss + tw-animate-css + shadcn/tailwind.css. @custom-variant dark. @theme inline with 30+ oklch tokens. :root light (--primary:#722ed1). .dark (--primary:#8b47ea). 6 keyframe animations. scrollbar-hide utility. ProseMirror styles.
3. Create App.tsx: BrowserRouter → ThemeProvider → I18nProvider → ServerProvidersInit → AccessCodeGuard → Suspense → Routes → Toaster
4. Create page shells for all 4 routes
5. Create Header: back arrow, settings gear, theme switcher (Sun/Moon/Monitor cycle), LanguageSwitcher, export dropdown
6. Create AccessCodeGuard + AccessCodeModal (animated overlay, shield icon, input, success animation)
7. Create UserProfileCard: collapsible pill, avatar grid (12 SVGs + upload), nickname edit, bio textarea, Motion expand/collapse
8. Create LanguageSwitcher: dropdown with 5 locales, click-outside close
9. Create cn() utility, createLogger(), all 13 config files (shapes.ts 1031 lines, animation.ts, theme.ts, hotkey.ts, image-clip.ts, latex.ts, chart.ts, font.ts, lines.ts, element.ts, mime.ts, storage.ts, symbol.ts)
```
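
The Header's theme switcher (Sun/Moon/Monitor cycle) comes down to stepping through three modes. A minimal sketch follows; the names `ThemeMode`, `nextThemeMode`, and `resolveTheme` are hypothetical, and the order light → dark → system is an assumption, since the prompt only says the three icons cycle.

```typescript
// The three modes behind the Sun / Moon / Monitor icons.
type ThemeMode = "light" | "dark" | "system";

// Assumption: one click advances light → dark → system → light.
const THEME_CYCLE: readonly ThemeMode[] = ["light", "dark", "system"];

// Returns the mode that the next click on the switcher would select.
function nextThemeMode(current: ThemeMode): ThemeMode {
  const index = THEME_CYCLE.indexOf(current);
  return THEME_CYCLE[(index + 1) % THEME_CYCLE.length] ?? "light";
}

// Resolves "system" to a concrete theme. In a browser, prefersDark would come
// from window.matchMedia("(prefers-color-scheme: dark)").matches.
function resolveTheme(mode: ThemeMode, prefersDark: boolean): "light" | "dark" {
  if (mode === "system") return prefersDark ? "dark" : "light";
  return mode;
}
```

Keeping the resolved theme separate from the stored mode lets the `.dark` class follow the OS setting while the user's choice stays "system".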

#### F-Phase 2: HomePage & Generation Flow

```
Me: Build the HomePage and GenerationPreviewPage with all interactions.

Do:
1. HomePage (890 lines): gradient bg, fixed toolbar pill, centered logo animation, main card with Textarea + InteractiveMode toggle (Atom breathing) + GenerationToolbar + gradient Generate button
2. GenerationToolbar: inline model selector, PDF upload (Paperclip + badge), PDF provider Select, web search Globe toggle, thinking Brain popover, MediaPopover
3. Recent Classrooms: collapsible chevron, responsive grid, ClassroomCard (ThumbnailSlide, metadata badge, name tooltip+copy, rename, delete overlay confirmation)
4. Search: InputGroup with Search icon, real-time filter, AnimatePresence
5. GenerationPreviewPage (900 lines): vertical step list, 6 StepVisualizers (PdfScan laser, WebSearch globe+cards, StreamingOutlines stagger, AgentGeneration, Content, Actions)
6. AgentRevealModal: full-screen overlay, staggered flip-in (rotateY), role icons (👨‍🏫/📚/🎓), color borders, auto-continue
7. OutlinesEditor: drag-reorder, delete/rename, type badges
8. Wire flow: HomePage form → sessionStorage → GenerationPreviewPage SSE → IndexedDB → navigate /classroom/:id
```
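
Step 8's handoff (HomePage form → sessionStorage → GenerationPreviewPage) can be sketched as a pair of save/take helpers. Everything named here is hypothetical: the `GenerationRequest` fields, the `pendingGeneration` key, and the choice to clear the entry after reading are assumptions for illustration. The `KeyValueStore` interface is a subset of the Web Storage API, so `window.sessionStorage` can be passed in directly in a browser.

```typescript
// Minimal subset of the Web Storage API, so the sketch also runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Hypothetical shape of the HomePage form; the real fields are not specified here.
interface GenerationRequest {
  requirement: string;
  interactiveMode: boolean;
  webSearch: boolean;
}

// Hypothetical storage key.
const PENDING_GENERATION_KEY = "pendingGeneration";

// HomePage side: persist the form before navigating to the preview page.
function savePendingGeneration(store: KeyValueStore, request: GenerationRequest): void {
  store.setItem(PENDING_GENERATION_KEY, JSON.stringify(request));
}

// GenerationPreviewPage side: read the request once, then clear it.
// Assumption: clearing avoids re-running generation from a stale entry.
function takePendingGeneration(store: KeyValueStore): GenerationRequest | null {
  const raw = store.getItem(PENDING_GENERATION_KEY);
  if (raw === null) return null;
  store.removeItem(PENDING_GENERATION_KEY);
  try {
    return JSON.parse(raw) as GenerationRequest;
  } catch {
    return null; // Corrupted entry: treat it as missing.
  }
}

// In-memory stand-in for window.sessionStorage, useful in tests.
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => { data.set(key, value); },
    removeItem: (key) => { data.delete(key); },
  };
}
```

Passing the store as a parameter keeps both helpers free of browser globals, so the flow can be exercised without a DOM.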

#### F-Phase 3: Slide Renderer & Canvas System

```
Me: Build the full slide renderer with all 10 element types and interactive canvas.

Do:
1. Editor/Canvas (415 lines): viewport scaling, element rendering loop, mouse events
2. 11 canvas hooks: useViewportSize, useSelectElement, useDragElement, useScaleElement (8 resize handles), useRotateElement, useMouseSelection, useDragLineElement, useMoveShapeKeypoint, useInsertFromCreateSelection, useDrop, useCommonOperate
3. 10 element renderers: TextElement (ProseMirror with paragraph/heading/bulletList/orderedList + 10 marks: bold/italic/underline/strikethrough/forecolor/backcolor/fontsize/fontname/textAlign/lineHeight/subscript/superscript/link), ImageElement (clip-path + CSS filters + flip), ShapeElement (SVG path formulas + gradient/pattern fills), LineElement (cubic bezier + arrow markers), ChartElement (ECharts), TableElement, LatexElement (KaTeX + Temml), VideoElement, CodeElement (Shiki 50+ grammars)
4. 7 Operate overlays per element type
5. ThumbnailSlide, ScreenElement/ScreenCanvas, ViewportBackground
6. HighlightOverlay, LaserOverlay, SpotlightOverlay
```
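
The viewport scaling in step 1 is, at its core, a "contain" fit plus a coordinate conversion that the drag, scale, and selection hooks depend on. The sketch below illustrates that idea only: the base slide size of 1000 × 562.5 and the names `ViewportFit`, `fitSlideToContainer`, and `toSlidePoint` are assumptions, not values taken from the codebase.

```typescript
interface ViewportFit {
  scale: number;   // factor applied to slide coordinates
  offsetX: number; // left margin that centers the slide horizontally
  offsetY: number; // top margin that centers the slide vertically
}

// Assumption: slides are authored at 1000 × 562.5 (16:9); the real base size may differ.
const SLIDE_WIDTH = 1000;
const SLIDE_HEIGHT = 562.5;

// Fits the slide inside the container while preserving its aspect ratio.
function fitSlideToContainer(containerWidth: number, containerHeight: number): ViewportFit {
  const scale = Math.min(containerWidth / SLIDE_WIDTH, containerHeight / SLIDE_HEIGHT);
  return {
    scale,
    offsetX: (containerWidth - SLIDE_WIDTH * scale) / 2,
    offsetY: (containerHeight - SLIDE_HEIGHT * scale) / 2,
  };
}

// Converts a pointer position in container pixels to slide coordinates.
function toSlidePoint(fit: ViewportFit, x: number, y: number): { x: number; y: number } {
  return { x: (x - fit.offsetX) / fit.scale, y: (y - fit.offsetY) / fit.scale };
}
```

For a 2000 × 2000 container the slide is width-limited, so the scale is 2 and the slide is centered vertically with a 437.5px margin above and below.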

#### F-Phase 4: Classroom Layout

```
Me: Build the classroom layout — Stage orchestrator, sidebar, canvas area, chat.

Do:
1. Stage (1271 lines): PlaybackEngine lifecycle, discussion flow, useDiscussionTTS, sidebar/chat/whiteboard state
2. SceneSidebar (559 lines): home button, ThumbnailSlide list, active highlight, generation progress, failed retry, collapse animation
3. CanvasArea (274 lines): SceneRenderer routing, Whiteboard overlay, play hint, CanvasToolbar
4. CanvasToolbar (440 lines): sidebar toggle, slide nav, play/pause, speed dropdown, volume slider+mute, whiteboard toggle, chat toggle, presentation mode, stop discussion
5. ChatArea (340 lines): resizable right panel (drag handle min 280px max 500px), session tabs, message list, lecture notes
6. ChatSession (367 lines): agent avatar+color bubbles, markdown, inline action tags, input+send/stop
7. SessionList, ProactiveCard (gradient border animation), InlineActionTag, LectureNotesView
```
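
The ChatArea resize rule (drag handle, min 280px, max 500px) reduces to a clamp applied while dragging. A minimal sketch, assuming the handle sits on the left edge of a right-docked panel so that dragging left widens it; `clampChatWidth` and `widthWhileDragging` are hypothetical names.

```typescript
// Bounds from the ChatArea spec: drag handle min 280px, max 500px.
const CHAT_MIN_WIDTH = 280;
const CHAT_MAX_WIDTH = 500;

// Keeps any requested width inside the allowed range.
function clampChatWidth(width: number): number {
  return Math.min(CHAT_MAX_WIDTH, Math.max(CHAT_MIN_WIDTH, width));
}

// Assumption: the panel is docked right, so a smaller clientX means a wider panel.
// startWidth and startX are captured on pointerdown; currentX comes from pointermove.
function widthWhileDragging(startWidth: number, startX: number, currentX: number): number {
  return clampChatWidth(startWidth + (startX - currentX));
}
```

Computing the width from the values captured at pointerdown, rather than accumulating per-event deltas, avoids drift when the pointer travels past a bound and comes back.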

#### F-Phase 5: Roundtable, Whiteboard & Interactive

```
Me: Build Roundtable, Whiteboard, and interactive renderers.

Do:
1. Roundtable (2094 lines): 12-bar waveform, agent avatars with speaking pulse, chat input+Send+Mic, ProactiveCard, slide nav, playback controls, volume, speed, whiteboard toggle, presentation mode, thinking state
2. PresentationSpeechOverlay (498 lines): fullscreen speech word-by-word reveal, agent avatar, breathing bars
3. Whiteboard container: AnimatePresence slide-up, toolbar (Eraser/History/Close), element count badge
4. WhiteboardCanvas (445 lines): pan+zoom, AnimatedElement staggered entrance (scale 0→1, delay index*0.06s), cascade exit
5. WhiteboardHistory: snapshot timeline with thumbnails and restore
6. InteractiveRenderer (iframe sandbox + postMessage) and QuizRenderer → QuizView (985 lines, 5 question types, 4 phases)
7. ClassroomComplete: confetti (7 colors), scene type breakdown, score summary, encouragement
8. PBL components: PBLRenderer, RoleSelection, ChatPanel, IssueboardPanel, Workspace, Guide
```
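
The WhiteboardCanvas timing in step 4 (staggered entrance with delay index*0.06s, cascade exit in reverse order) can be expressed as two small delay functions. This is a sketch with hypothetical names; reusing the same 0.06s step for the exit is an assumption, since the prompt gives a step only for the entrance.

```typescript
// Per-element delay from the WhiteboardCanvas spec: index * 0.06s.
const STAGGER_SECONDS = 0.06;

// Entrance: element 0 animates first, each later element 0.06s after the previous one.
function entranceDelay(index: number): number {
  return index * STAGGER_SECONDS;
}

// Exit cascade in reverse order: the last element leaves first.
// Assumption: the exit reuses the same 0.06s step as the entrance.
function exitDelay(index: number, total: number): number {
  return (total - 1 - index) * STAGGER_SECONDS;
}
```

In a motion/react component these values would typically be passed through the `transition` prop, for example `transition={{ delay: entranceDelay(index) }}`.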

#### F-Phase 6: Settings, Agents & Export

```
Me: Build Settings dialog, Agent system UI, and Export UI.

Do:
1. SettingsDialog (1143 lines): two-column (left nav 10 tabs + right content), ProviderListColumn pattern
2. ProviderConfigPanel (438 lines): API key, base URL, model list, test connection
3. ModelSelector (423 lines): Combobox search, provider grouping, capability badges (Eye/Wrench/Brain)
4. ModelEditDialog, AddProviderDialog, AddAudioProviderDialog
5. All settings sub-pages: GeneralSettings, ImageSettings, VideoSettings, TTSSettings (1264 lines), ASRSettings (559 lines), AudioSettings (799 lines), PDFSettings (303 lines), WebSearchSettings
6. AgentBar (997 lines): scrollable cards, voice config popover with provider→model→voice hierarchy + search
7. AgentAvatar, AgentConfigPanel (persona editor, color picker, priority slider, action checkboxes)
8. Export UI in Header: PPTX (useExportPPTX), ZIP (useExportClassroom), Import (file input), loading toasts
9. MediaPopover (460 lines): image/video provider+model+key, enable toggles, aspect ratio
10. GeneratingProgress: step indicators with elapsed time
```
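
The ModelSelector behavior in step 3 (Combobox search plus provider grouping) can be sketched as one pure function that a Combobox would render from. The `ModelOption` shape, the capability names standing in for the Eye/Wrench/Brain badges, and the substring-match rule are all assumptions for illustration.

```typescript
// Hypothetical model record; capability names mirror the Eye/Wrench/Brain badges.
interface ModelOption {
  id: string;
  provider: string;
  capabilities: ReadonlyArray<"vision" | "tools" | "thinking">;
}

// Filters by a case-insensitive substring of the provider or model id,
// then groups the matches by provider in first-seen order.
function groupModelsByProvider(
  models: readonly ModelOption[],
  query: string,
): Map<string, ModelOption[]> {
  const needle = query.trim().toLowerCase();
  const groups = new Map<string, ModelOption[]>();
  for (const model of models) {
    const haystack = `${model.provider} ${model.id}`.toLowerCase();
    if (needle !== "" && !haystack.includes(needle)) continue;
    const group = groups.get(model.provider) ?? [];
    group.push(model);
    groups.set(model.provider, group);
  }
  return groups;
}
```

Because a `Map` preserves insertion order, provider headings appear in the order their first matching model occurs, and an empty query returns every model.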