# Pitch Deck Outline

Use this as the slide plan for the required presentation deck.

## Slide 1 - Title

ElevenClip.AI

AI clip studio for turning long-form videos into personalized short-form clips.

Include:

- AMD Developer Hackathon
- Track 3 - Vision & Multimodal AI
- GitHub URL
- Hugging Face Space URL

## Slide 2 - Problem

Long-form creators need short-form distribution, but editing clips manually is slow.

Key points:

- Two-hour videos can take hours to review.
- Good clips depend on audience, niche, tone, and platform.
- Subtitles and vertical export add repetitive work.

## Slide 3 - Solution

ElevenClip.AI automates the first editing pass.

Workflow:

Video input -> Whisper transcript -> Qwen highlight scoring -> ffmpeg clip rendering -> human review/editor -> downloads
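
Speaker note: the hand-off between the transcript, scoring, and clip-selection stages can be sketched roughly as below. This is a minimal illustration, not the project's code. Every name here (`Segment`, `Highlight`, `transcribe`, `score_segments`, `pick_clips`) is hypothetical, and the Whisper and Qwen calls are replaced with stubs (canned transcript, keyword overlap) so the sketch runs without any model.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str


@dataclass
class Highlight:
    segment: Segment
    score: float  # 0.0 to 1.0, higher means more clip-worthy


def transcribe(video_path: str) -> list[Segment]:
    """Stub for the Whisper step; returns a canned transcript."""
    return [
        Segment(0.0, 30.0, "intro and greetings"),
        Segment(30.0, 75.0, "the big reveal everyone waited for"),
        Segment(75.0, 120.0, "sponsor read"),
    ]


def score_segments(segments: list[Segment], keywords: list[str]) -> list[Highlight]:
    """Stub for the Qwen step; keyword overlap stands in for LLM scoring."""
    highlights = []
    for seg in segments:
        words = seg.text.lower().split()
        hits = sum(1 for kw in keywords if kw in words)
        highlights.append(Highlight(seg, hits / max(len(keywords), 1)))
    return highlights


def pick_clips(highlights: list[Highlight], top_k: int) -> list[Highlight]:
    """Keep the top_k highest-scoring segments for rendering and review."""
    ranked = sorted(highlights, key=lambda h: h.score, reverse=True)
    return ranked[:top_k]


segments = transcribe("episode.mp4")  # hypothetical file name
clips = pick_clips(score_segments(segments, ["big", "reveal"]), top_k=1)
for clip in clips:
    print(f"{clip.segment.start:.0f}s-{clip.segment.end:.0f}s score={clip.score:.2f}")
```

The selected segments would then go to ffmpeg rendering and the human review step; those stages are left out of the sketch.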

## Slide 4 - Product Demo

Show screenshots or short GIFs of:

- Channel profile
- Pipeline progress
- Transcript/highlights
- Clip editor
- Approved/downloaded clips

## Slide 5 - AI Architecture

Model roles:

- Whisper Large V3: multilingual transcription, including Thai.
- Qwen2.5-7B-Instruct: profile-aware highlight detection.
- Qwen2-VL-7B-Instruct: visual reactions, scene changes, and on-screen text.
- ffmpeg: subtitle burn-in and platform export.
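
Speaker note: for the ffmpeg role, one plausible shape of the render step is sketched below. The helper name `build_ffmpeg_args` and the filter chain (centre crop to 9:16, scale to 1080x1920, burn in an SRT file) are assumptions for illustration; the project's actual invocation may differ. The sketch also assumes the subtitle file is timed relative to the clip start, that its path needs no filter escaping, and that the local ffmpeg build includes libass for the `subtitles` filter.

```python
def build_ffmpeg_args(src: str, dst: str, start: float, end: float,
                      subtitle_path: str) -> list[str]:
    """Build an ffmpeg command that cuts one vertical, subtitled clip."""
    duration = end - start
    # Crop the centre to a 9:16 window, scale to 1080x1920, burn in subtitles.
    video_filter = f"crop=ih*9/16:ih,scale=1080:1920,subtitles={subtitle_path}"
    return [
        "ffmpeg", "-y",
        "-ss", f"{start:.3f}",   # seek in the input before decoding
        "-i", src,
        "-t", f"{duration:.3f}",  # keep only the clip's duration
        "-vf", video_filter,
        "-c:v", "libx264",
        "-c:a", "aac",
        dst,
    ]


# Hypothetical file names, used only to show the resulting command.
args = build_ffmpeg_args("episode.mp4", "clip_01.mp4", 30.0, 75.0, "clip_01.srt")
print(" ".join(args))
```

The list can be passed to `subprocess.run(args, check=True)` on a machine with ffmpeg installed.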

## Slide 6 - AMD + ROCm

Why AMD matters:

- Long videos need high-throughput inference.
- MI300X memory helps with large models and long transcripts.
- ROCm + PyTorch enable Whisper inference on AMD GPUs.
- vLLM on ROCm enables faster Qwen serving.

## Slide 7 - Benchmark

Replace placeholders after cloud credits arrive.

| Run | Hardware | Total Time | Clips |
| --- | --- | ---: | ---: |
| CPU baseline | CPU | TBD | 10 |
| AMD GPU | MI300X + ROCm | TBD | 10 |

Goal: 2-hour video -> 10 subtitled clips in under 10 minutes on MI300X.
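
Speaker note: the goal translates into a concrete throughput target. The arithmetic below uses only the numbers in the goal; the function name `required_speedup` is illustrative, and the per-clip figure assumes clips are processed one after another, which the real pipeline may not do.

```python
def required_speedup(video_seconds: float, budget_seconds: float) -> float:
    """How many times faster than real time the whole pipeline must run."""
    return video_seconds / budget_seconds


video_seconds = 2 * 60 * 60   # 2-hour source video
budget_seconds = 10 * 60      # 10-minute end-to-end target
clip_count = 10

speedup = required_speedup(video_seconds, budget_seconds)
per_clip_budget = budget_seconds / clip_count

print(f"Required speed: {speedup:.0f}x real time")            # 12x
print(f"Average budget per clip: {per_clip_budget:.0f} s")    # 60 s
```

In other words, transcription, scoring, and rendering together must average at least 12x real time, or about one finished clip per minute.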

## Slide 8 - Business Value

Target users:

- YouTubers
- Podcasters
- Educators
- Streamers
- Agencies
- Brand marketing teams

Value:

- Save editing time.
- Increase short-form output.
- Keep creator control.
- Support multilingual creators.

## Slide 9 - What We Built

Current MVP:

- FastAPI backend
- React editor
- YouTube/upload input
- Demo pipeline
- Clip rendering and subtitles
- Hugging Face Space
- AMD deployment plan

Next:

- Real Whisper + Qwen on MI300X
- Qwen2-VL frame analysis
- Benchmark table
- Better subtitle styling presets