Commit Β·
549efd4
1
Parent(s): 277d6c0
docs: pitch deck + demo video script + lablab submission form content
Browse filesThree pure-content deliverables that don't depend on AMD Dev Cloud
credits being live β Lucas can paste these directly when ready.
- docs/pitch-deck.md β 8-slide deck, slide-by-slide content + visual
notes. Built around the four judging criteria. Closes on the
substrate-not-product framing.
- docs/demo-video-script.md β 2:30 shot list, voice-over script,
recording order, editing checklist, export checklist.
- docs/lablab-submission-form.md β copy-paste content for every
field on lablab.ai's submission form, with character counts
pre-validated and tags pre-selected.
- docs/demo-video-script.md +162 -0
- docs/lablab-submission-form.md +139 -0
- docs/pitch-deck.md +166 -0
docs/demo-video-script.md
ADDED
|
@@ -0,0 +1,162 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# SignBridge β Demo Video Script
|
| 2 |
+
|
| 3 |
+
> Target length: **2:30 (β€ 3 min)**. Format: 1080p MP4, MP3 audio. Aspect ratio 16:9.
|
| 4 |
+
> Tools: QuickTime Player (Mac) for screen + camera capture, iMovie or CapCut for editing.
|
| 5 |
+
|
| 6 |
+
---
|
| 7 |
+
|
| 8 |
+
## Story arc (3 acts)
|
| 9 |
+
|
| 10 |
+
| Time | Act | Beat |
|
| 11 |
+
|---|---|---|
|
| 12 |
+
| 0:00β0:20 | **Hook** | Open with the human problem; viewer must feel the gap. |
|
| 13 |
+
| 0:20β1:30 | **Demo** | Live SignBridge in action β both fingerspelling AND a motion sign. |
|
| 14 |
+
| 1:30β2:30 | **Why AMD + close** | Architecture diagram + concrete MI300X comparison + open-source ethics + URL. |
|
| 15 |
+
|
| 16 |
+
Hard rule: **no slide-by-slide voice-over reading**. The demo should *play live*; voice-over should narrate what we're seeing, not summarise text on screen.
|
| 17 |
+
|
| 18 |
+
---
|
| 19 |
+
|
| 20 |
+
## Shot list
|
| 21 |
+
|
| 22 |
+
### Act 1 β Hook (0:00 β 0:20)
|
| 23 |
+
|
| 24 |
+
**Visual A (5 s):** Plain background, bold text card fades in:
|
| 25 |
+
> 70 million deaf people. Interpreters cost $50β200 / hour. They're scarce.
|
| 26 |
+
|
| 27 |
+
**Visual B (5 s):** Text card β "What if your phone could just translate?"
|
| 28 |
+
|
| 29 |
+
**Visual C (10 s):** Camera shot of you (Lucas) in a quiet room, signing HELLO at the camera silently. No voice-over yet. Hold the silence β let the viewer feel that the sign means nothing to them.
|
| 30 |
+
|
| 31 |
+
**Voice-over:** *(starts at 0:15)*
|
| 32 |
+
> "Most of us can't read this. SignBridge can."
|
| 33 |
+
|
| 34 |
+
---
|
| 35 |
+
|
| 36 |
+
### Act 2 β Live demo (0:20 β 1:30)
|
| 37 |
+
|
| 38 |
+
**Setup (0:20 β 0:25):** 5-second screen-recording of the live HF Space loading at `huggingface.co/spaces/lablab-ai-amd-developer-hackathon/signbridge`. URL bar visible. Tabs visible: "Snapshot" and "Record sign". This proves it's a live deployed product, not a slide deck.
|
| 39 |
+
|
| 40 |
+
**Beat 2A β Fingerspelling (0:25 β 0:55):**
|
| 41 |
+
|
| 42 |
+
**Visual (split screen recommended):** Left = your face/hand on webcam, right = the Gradio app receiving frames.
|
| 43 |
+
- Sign **L** clearly. Click **Capture sign**. App shows "detected: L (85%)".
|
| 44 |
+
- Sign **U**. Capture.
|
| 45 |
+
- Sign **C**. Capture.
|
| 46 |
+
- Sign **A**. Capture.
|
| 47 |
+
- Sign **S**. Capture.
|
| 48 |
+
- Click **π Speak**. App composes β speaks: **"Lucas."**
|
| 49 |
+
|
| 50 |
+
**Voice-over during this beat:**
|
| 51 |
+
> "First, fingerspelling. I sign each letter, the app captures it, andβ" *(pause for the speak)* β *"composed in natural English."*
|
| 52 |
+
|
| 53 |
+
**Beat 2B β Motion sign (0:55 β 1:25):**
|
| 54 |
+
|
| 55 |
+
**Visual:** Switch tabs to **Record sign**. Hit Record, sign **HELLO** (the wave-from-forehead motion), stop, click Submit.
|
| 56 |
+
- Detected: **hello (85%)**. Click Speak.
|
| 57 |
+
- App says: **"Hello."**
|
| 58 |
+
|
| 59 |
+
Repeat one more sign for variety: **THANK_YOU**.
|
| 60 |
+
|
| 61 |
+
**Voice-over:**
|
| 62 |
+
> "But fingerspelling alone isn't real ASL β most signs are *motion*. Hold-to-record captures the whole gesture, not just one frame. The system detects the motion across frames and..." *(pause for the speak)*
|
| 63 |
+
|
| 64 |
+
**Beat 2C β Two-person scene (1:25 β 1:30):** *(optional but high-impact)*
|
| 65 |
+
|
| 66 |
+
**Visual:** You sign something to a hearing person; they hear the AI say it; they react. Hold the human reaction for 2 seconds.
|
| 67 |
+
|
| 68 |
+
**No voice-over** during this beat β let the moment land.
|
| 69 |
+
|
| 70 |
+
---
|
| 71 |
+
|
| 72 |
+
### Act 3 β Architecture + AMD pitch (1:30 β 2:30)
|
| 73 |
+
|
| 74 |
+
**Beat 3A β Architecture diagram (1:30 β 1:55):**
|
| 75 |
+
|
| 76 |
+
**Visual:** Static slide showing the pipeline:
|
| 77 |
+
```
|
| 78 |
+
Webcam frames β Qwen3-VL-8B (vision) β Llama-3.1-8B (composer) β XTTS-v2 (speech)
|
| 79 |
+
All on a single AMD Instinct MI300X
|
| 80 |
+
```
|
| 81 |
+
|
| 82 |
+
**Voice-over:**
|
| 83 |
+
> "Under the hood: a multi-modal pipeline running on a single AMD Instinct MI300X. Vision, reasoning, and voice β all concurrent on one GPU."
|
| 84 |
+
|
| 85 |
+
**Beat 3B β The MI300X comparison (1:55 β 2:15):**
|
| 86 |
+
|
| 87 |
+
**Visual:** The comparison table from the walkthrough:
|
| 88 |
+
|
| 89 |
+
| | MI300X 1Γ | H100 80 GB |
|
| 90 |
+
|---|---|---|
|
| 91 |
+
| V1 pipeline (~34 GB) | β
comfortable | β tight |
|
| 92 |
+
| V2 with Llama-3.1-70B FP8 (~70 GB extra) | β
still fits | β doesn't fit |
|
| 93 |
+
|
| 94 |
+
**Voice-over:**
|
| 95 |
+
> "192 GB of HBM3. Same workload on NVIDIA H100 needs three GPUs. Practical accessibility tools running globally need the cost-and-availability profile that AMD enables."
|
| 96 |
+
|
| 97 |
+
**Beat 3C β Substrate + close (2:15 β 2:30):**
|
| 98 |
+
|
| 99 |
+
**Visual:** Final slide:
|
| 100 |
+
- "Open source, MIT β github.com/seekerPrice/signbridge"
|
| 101 |
+
- "Hugging Face Space β huggingface.co/spaces/lablab-ai-amd-developer-hackathon/signbridge"
|
| 102 |
+
- "ASL V1. Deaf-led teams own the rest."
|
| 103 |
+
- π€ SignBridge
|
| 104 |
+
|
| 105 |
+
**Voice-over:**
|
| 106 |
+
> "SignBridge is open source under MIT. It's a substrate β Deaf-led organisations deploy it for their own languages. The hardest part of accessibility isn't building. It's deploying. AMD makes the deploying possible. Thanks for watching."
|
| 107 |
+
|
| 108 |
+
---
|
| 109 |
+
|
| 110 |
+
## Voice-over recording tips
|
| 111 |
+
|
| 112 |
+
- Record voice **separately** from screen capture (better audio quality). Use QuickTime "New Audio Recording" with a mic 6β12 inches away.
|
| 113 |
+
- One take, then cut. Don't try to dub multiple takes line-by-line.
|
| 114 |
+
- Cadence: ~140 words/min. Pause for 0.5 s after each section.
|
| 115 |
+
- If you have a good pop filter / lavalier, use it. AirPods Pro built-in mic is workable but compresses dynamics.
|
| 116 |
+
|
| 117 |
+
---
|
| 118 |
+
|
| 119 |
+
## Editing notes
|
| 120 |
+
|
| 121 |
+
- **Captions/subtitles required.** Burn in the spoken English text below the speaker's face throughout β both for accessibility and so judges can follow with sound off.
|
| 122 |
+
- **Highlight the recognized token visually.** When the app shows "detected: hello (85%)", zoom in or add a brief highlight box on that text β judges' eyes need to find it fast.
|
| 123 |
+
- **Music: skip.** The demo is loud enough on its own; background music distracts from the speech-output beats.
|
| 124 |
+
- **Smooth transitions only** β don't use fancy wipes; cut on action.
|
| 125 |
+
- **Final cut export:** 1080p, H.264, MP4, β€100 MB if possible (lablab uploader has size limits).
|
| 126 |
+
|
| 127 |
+
---
|
| 128 |
+
|
| 129 |
+
## Prep before recording
|
| 130 |
+
|
| 131 |
+
- [ ] AMD Dev Cloud credit landed (so the live demo uses MI300X β *this is the hackathon talk-track*); fall back to HF Inference if not.
|
| 132 |
+
- [ ] Lighting: front-facing soft light. No back-window glare.
|
| 133 |
+
- [ ] Plain background (white wall ideal).
|
| 134 |
+
- [ ] Wear a contrasting solid colour (not patterns) β VLM accuracy improves.
|
| 135 |
+
- [ ] Webcam height: at eye level. Hands need to be in frame for signs.
|
| 136 |
+
- [ ] Test the live HF Space URL once before recording. If it errors, fix before pressing record.
|
| 137 |
+
- [ ] One dry run end-to-end with a stopwatch. Trim if over 2:45.
|
| 138 |
+
|
| 139 |
+
---
|
| 140 |
+
|
| 141 |
+
## Recording order (don't shoot in story order)
|
| 142 |
+
|
| 143 |
+
1. **Live demo screen recording first** β 3 takes of the full demo flow, pick the cleanest.
|
| 144 |
+
2. **Voice-over second** β record continuous narration over the picked demo take.
|
| 145 |
+
3. **B-roll of you signing alone** (Act 1 silent shot, Act 2C two-person reaction) β last, since they're easier to re-shoot.
|
| 146 |
+
4. Edit it together in iMovie / CapCut.
|
| 147 |
+
5. Export.
|
| 148 |
+
6. Upload to YouTube as **Unlisted**, copy URL.
|
| 149 |
+
7. Paste URL into lablab.ai submission form's "Video Presentation" field.
|
| 150 |
+
|
| 151 |
+
---
|
| 152 |
+
|
| 153 |
+
## Export checklist
|
| 154 |
+
|
| 155 |
+
- [ ] Length 2:00β3:00
|
| 156 |
+
- [ ] Captions visible throughout
|
| 157 |
+
- [ ] AMD Dev Cloud / MI300X mentioned by name β₯3 times
|
| 158 |
+
- [ ] HF Space URL shown on screen at least once
|
| 159 |
+
- [ ] GitHub URL shown on screen at least once
|
| 160 |
+
- [ ] No copyrighted music / footage
|
| 161 |
+
- [ ] Speaker face visible (judges remember faces)
|
| 162 |
+
- [ ] Final shot: SignBridge logo + URLs
|
docs/lablab-submission-form.md
ADDED
|
@@ -0,0 +1,139 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# SignBridge β lablab.ai Submission Form Content
|
| 2 |
+
|
| 3 |
+
> Open https://lablab.ai/ai-hackathons/amd-developer β scroll to bottom β click **Submit project**. Paste each field below into the matching input.
|
| 4 |
+
|
| 5 |
+
---
|
| 6 |
+
|
| 7 |
+
## Project Title (β€ ~70 chars)
|
| 8 |
+
|
| 9 |
+
```
|
| 10 |
+
SignBridge β Real-time ASL β English speech on AMD Instinct MI300X
|
| 11 |
+
```
|
| 12 |
+
|
| 13 |
+
(63 characters; safe under platform limit.)
|
| 14 |
+
|
| 15 |
+
---
|
| 16 |
+
|
| 17 |
+
## Short Description (β€ 150 chars typical)
|
| 18 |
+
|
| 19 |
+
```
|
| 20 |
+
Two people who couldn't communicate, now can. Real-time ASL β English speech via Qwen3-VL + Llama-3.1 + XTTS, on a single AMD MI300X.
|
| 21 |
+
```
|
| 22 |
+
|
| 23 |
+
(132 characters.)
|
| 24 |
+
|
| 25 |
+
---
|
| 26 |
+
|
| 27 |
+
## Long Description (no hard limit, ~300 words is the sweet spot)
|
| 28 |
+
|
| 29 |
+
```
|
| 30 |
+
SignBridge is a real-time American Sign Language to English speech translator built for the AMD Developer Hackathon, Track 3 (Vision & Multimodal AI).
|
| 31 |
+
|
| 32 |
+
The user signs at the webcam β either fingerspelled letters (Snapshot tab) or full motion words (Record sign tab) β and SignBridge replies in spoken English. Two people who couldn't communicate, now can.
|
| 33 |
+
|
| 34 |
+
Architecture: a multi-stage pipeline (Qwen3-VL-8B for sign recognition, Llama-3.1-8B for sentence composition, Coqui XTTS-v2 for speech synthesis), running concurrently on a single AMD Instinct MI300X via vLLM. The 192 GB HBM3 of one MI300X holds the entire pipeline with margin β the same workload on NVIDIA H100 needs three GPUs.
|
| 35 |
+
|
| 36 |
+
For motion-dependent signs (HELLO, THANK_YOU, PLEASE, EAT) the Record-sign tab captures 1.5 s of webcam, samples 4 evenly-spaced frames, and sends them as a multi-image VLM call with NVIDIA-style sequential frame markers in the prompt β most ASL signs are motion, not held poses, so single-frame approaches fundamentally cannot translate them.
|
| 37 |
+
|
| 38 |
+
Why this matters: sign-language interpreters cost $50β200 per hour and are scarce. Courts, hospitals, schools, and public services must by law (ADA, EAA 2025) provide interpretation. Sorenson VRS β the dominant relay-services provider β books $4B+ in annual revenue filling this gap. SignBridge is an open-source MIT-licensed substrate that any Deaf-led NGO, school, ministry, or enterprise can deploy on their own AMD compute.
|
| 39 |
+
|
| 40 |
+
V1 is ASL-only, deliberately. Sign languages aren't interchangeable β BSL, MSL, CSL, ISL, and 200+ others each deserve their own teams, training data, and Deaf community leadership. (See Bragg et al., "Systemic Biases in Sign Language AI Research", arXiv 2403.02563.)
|
| 41 |
+
|
| 42 |
+
Built solo by Lucas Loo Tan Yu Heng, May 5β11, 2026.
|
| 43 |
+
```
|
| 44 |
+
|
| 45 |
+
---
|
| 46 |
+
|
| 47 |
+
## Technology & Category Tags
|
| 48 |
+
|
| 49 |
+
Pick from lablab's tag dropdown β these are the tags that match SignBridge:
|
| 50 |
+
|
| 51 |
+
**Primary (must-haves):**
|
| 52 |
+
- `AMD Developer Cloud`
|
| 53 |
+
- `AMD ROCm`
|
| 54 |
+
- `HuggingFace Spaces`
|
| 55 |
+
|
| 56 |
+
**Secondary (relevant):**
|
| 57 |
+
- `LLaMA` (Llama-3.1-8B composer)
|
| 58 |
+
- `Qwen` (Qwen3-VL-8B vision)
|
| 59 |
+
- `Gradio`
|
| 60 |
+
- `FastAPI`
|
| 61 |
+
- `Vision`
|
| 62 |
+
- `Multimodal`
|
| 63 |
+
- `Accessibility`
|
| 64 |
+
- `Open Source`
|
| 65 |
+
|
| 66 |
+
**Track:** Track 3 β Vision & Multimodal AI
|
| 67 |
+
|
| 68 |
+
---
|
| 69 |
+
|
| 70 |
+
## Cover Image
|
| 71 |
+
|
| 72 |
+
Upload `assets/cover.png` from the repo (1280Γ640 PNG, ~60 KB).
|
| 73 |
+
|
| 74 |
+
If lablab requires a different aspect ratio (e.g. square 1:1), regenerate with `python -m signbridge.scripts.make_cover` after editing the `WIDTH, HEIGHT` constants in `signbridge/scripts/make_cover.py`.
|
| 75 |
+
|
| 76 |
+
---
|
| 77 |
+
|
| 78 |
+
## Video Presentation
|
| 79 |
+
|
| 80 |
+
Paste the YouTube URL of the demo video (uploaded as **Unlisted**).
|
| 81 |
+
|
| 82 |
+
Reference content: `docs/demo-video-script.md`.
|
| 83 |
+
|
| 84 |
+
---
|
| 85 |
+
|
| 86 |
+
## Slide Presentation
|
| 87 |
+
|
| 88 |
+
Upload the deck PDF.
|
| 89 |
+
|
| 90 |
+
Reference content: `docs/pitch-deck.md`. Build in Google Slides, File β Download β PDF, upload here.
|
| 91 |
+
|
| 92 |
+
---
|
| 93 |
+
|
| 94 |
+
## Public GitHub Repository
|
| 95 |
+
|
| 96 |
+
```
|
| 97 |
+
https://github.com/seekerPrice/signbridge
|
| 98 |
+
```
|
| 99 |
+
|
| 100 |
+
---
|
| 101 |
+
|
| 102 |
+
## Demo Application Platform
|
| 103 |
+
|
| 104 |
+
```
|
| 105 |
+
Hugging Face Space
|
| 106 |
+
```
|
| 107 |
+
|
| 108 |
+
---
|
| 109 |
+
|
| 110 |
+
## Application URL
|
| 111 |
+
|
| 112 |
+
```
|
| 113 |
+
https://huggingface.co/spaces/lablab-ai-amd-developer-hackathon/signbridge
|
| 114 |
+
```
|
| 115 |
+
|
| 116 |
+
---
|
| 117 |
+
|
| 118 |
+
## Final pre-submit checklist
|
| 119 |
+
|
| 120 |
+
Before clicking Submit on lablab:
|
| 121 |
+
|
| 122 |
+
- [ ] Title pasted (63 chars)
|
| 123 |
+
- [ ] Short description pasted (132 chars)
|
| 124 |
+
- [ ] Long description pasted (~300 words)
|
| 125 |
+
- [ ] Tags selected (Track 3 + at minimum: AMD Developer Cloud, AMD ROCm, HuggingFace Spaces, Qwen, LLaMA)
|
| 126 |
+
- [ ] Cover image uploaded (assets/cover.png)
|
| 127 |
+
- [ ] Video URL pasted (YouTube unlisted)
|
| 128 |
+
- [ ] Pitch deck PDF uploaded
|
| 129 |
+
- [ ] GitHub URL pasted
|
| 130 |
+
- [ ] HF Space URL pasted
|
| 131 |
+
- [ ] **Track selection: Track 3 β Vision & Multimodal AI**
|
| 132 |
+
- [ ] HF Space loads from a fresh browser (incognito test)
|
| 133 |
+
- [ ] GitHub repo has a clean README
|
| 134 |
+
- [ ] LICENSE file is MIT
|
| 135 |
+
- [ ] All commits pushed to both remotes
|
| 136 |
+
|
| 137 |
+
When all boxes are ticked β click Submit β wait for confirmation email β done.
|
| 138 |
+
|
| 139 |
+
Time-target: submit by **2026-05-11 02:00 MYT** (1-hour buffer before the 03:00 cutoff).
|
docs/pitch-deck.md
ADDED
|
@@ -0,0 +1,166 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# SignBridge β Pitch Deck (8 slides)
|
| 2 |
+
|
| 3 |
+
> Open a Google Slides deck (or Pitch). Paste each slide's content into the matching blank slide. Visuals are described in italics β replace with actual screenshots / diagrams / table renders.
|
| 4 |
+
> Aspect ratio: 16:9. Theme: indigoβpink gradient (matches HF Space card).
|
| 5 |
+
|
| 6 |
+
---
|
| 7 |
+
|
| 8 |
+
## Slide 1 β Title
|
| 9 |
+
|
| 10 |
+
**Title (huge):**
|
| 11 |
+
SignBridge
|
| 12 |
+
|
| 13 |
+
**Subtitle:**
|
| 14 |
+
Real-time ASL β English speech, on a single AMD Instinct MI300X.
|
| 15 |
+
|
| 16 |
+
**Footer (small):**
|
| 17 |
+
Track 3 Β· Vision & Multimodal AI Β· AMD Developer Hackathon 2026 Β· Lucas Loo Tan Yu Heng
|
| 18 |
+
|
| 19 |
+
*Visual: the cover.png we already shipped (1280Γ640 indigoβpink gradient with π€ + project name).*
|
| 20 |
+
|
| 21 |
+
---
|
| 22 |
+
|
| 23 |
+
## Slide 2 β The problem
|
| 24 |
+
|
| 25 |
+
**Headline:**
|
| 26 |
+
70 million deaf people. Sign-language interpreters cost $50β200 per hour. They're scarce.
|
| 27 |
+
|
| 28 |
+
**Body bullets:**
|
| 29 |
+
- Courts, hospitals, schools, public services **must by law** provide interpretation (ADA Title II/III in the US; European Accessibility Act 2025 in the EU).
|
| 30 |
+
- **Sorenson VRS**, the dominant sign-language relay-services provider, books **$4B+ in annual revenue** filling this gap β proof the demand is enormous and budgeted-for.
|
| 31 |
+
- Existing AI alternatives (Be My Eyes, Microsoft Seeing AI) are turn-based, photo-only, English-default, and closed-source. Real ASL is *motion* β they fundamentally can't translate "HELLO" or "THANK YOU".
|
| 32 |
+
|
| 33 |
+
*Visual: a row of three context icons β courthouse / hospital / classroom β labeled with the mandates.*
|
| 34 |
+
|
| 35 |
+
---
|
| 36 |
+
|
| 37 |
+
## Slide 3 β The solution
|
| 38 |
+
|
| 39 |
+
**Headline:**
|
| 40 |
+
Hold to record. Sign. Speak.
|
| 41 |
+
|
| 42 |
+
**Body (3-step arc):**
|
| 43 |
+
1. **Hold-to-record button** captures 1.5 seconds of your sign.
|
| 44 |
+
2. A multi-stage pipeline (vision β reasoning β speech) translates it.
|
| 45 |
+
3. The other person hears natural English.
|
| 46 |
+
|
| 47 |
+
**Tag line under the arc:**
|
| 48 |
+
Two people who couldn't communicate, now can.
|
| 49 |
+
|
| 50 |
+
*Visual: 3 screenshots of the live Gradio Space β (a) user signing into webcam; (b) "detected: HELLO (85%)"; (c) audio waveform playing "Hello.".*
|
| 51 |
+
*If single screenshot: just the Gradio "Record sign" tab mid-demo.*
|
| 52 |
+
|
| 53 |
+
---
|
| 54 |
+
|
| 55 |
+
## Slide 4 β Architecture (the AMD pitch)
|
| 56 |
+
|
| 57 |
+
**Headline:**
|
| 58 |
+
The whole pipeline fits on a single MI300X. NVIDIA H100 doesn't.
|
| 59 |
+
|
| 60 |
+
**Diagram (build in Slides; described as bullets):**
|
| 61 |
+
```
|
| 62 |
+
[ Webcam frame burst (4 frames, 1.5 s) ]
|
| 63 |
+
β
|
| 64 |
+
βΌ
|
| 65 |
+
[ Qwen3-VL-8B ββ frame summariser, multi-image VLM call ]
|
| 66 |
+
β
|
| 67 |
+
βΌ
|
| 68 |
+
[ Llama-3.1-8B ββ sentence composer (sign tokens β English) ]
|
| 69 |
+
β
|
| 70 |
+
βΌ
|
| 71 |
+
[ Coqui XTTS-v2 ββ multilingual streaming TTS ]
|
| 72 |
+
β
|
| 73 |
+
βΌ
|
| 74 |
+
[ Audio out ββ speaker / Gradio audio component ]
|
| 75 |
+
```
|
| 76 |
+
|
| 77 |
+
**Comparison table (small print under diagram):**
|
| 78 |
+
|
| 79 |
+
| Component | Weights (FP16) | MI300X 1Γ (192 GB) | H100 80 GB |
|
| 80 |
+
|---|---|---|---|
|
| 81 |
+
| Qwen3-VL-8B | ~16 GB | β
fits | β
|
|
| 82 |
+
| Llama-3.1-8B | ~16 GB | β
fits | β
|
|
| 83 |
+
| XTTS-v2 + Whisper (V2) | ~5 GB | β
fits | β tight |
|
| 84 |
+
| (V2) **Llama-3.1-70B FP8 reasoner** | ~70 GB | **β
still fits** | **β doesn't fit at all** |
|
| 85 |
+
|
| 86 |
+
**Closer:** The single-GPU concurrency story is the AMD pitch.
|
| 87 |
+
|
| 88 |
+
*Visual: the diagram + table as a single composite slide. Use a brand colour for the AMD column to highlight.*
|
| 89 |
+
|
| 90 |
+
---
|
| 91 |
+
|
| 92 |
+
## Slide 5 β Live demo
|
| 93 |
+
|
| 94 |
+
**Headline:**
|
| 95 |
+
*(blank β this slide is the live demo)*
|
| 96 |
+
|
| 97 |
+
**Speaker note:**
|
| 98 |
+
Switch to the live HF Space at huggingface.co/spaces/lablab-ai-amd-developer-hackathon/signbridge. 30 seconds:
|
| 99 |
+
1. **Snapshot tab** β fingerspell L-U-C-A-S β click Speak β AI says "Lucas."
|
| 100 |
+
2. **Record sign tab** β record HELLO β click Submit β "hello" detected β click Speak β AI says "Hello."
|
| 101 |
+
|
| 102 |
+
If demo fails / network down β fall back to the pre-recorded 2-min video on slide 6.
|
| 103 |
+
|
| 104 |
+
*Visual: leave the slide blank or use a single QR code linking to the Space URL for the audience to scan and try themselves.*
|
| 105 |
+
|
| 106 |
+
---
|
| 107 |
+
|
| 108 |
+
## Slide 6 β Demo video (fallback)
|
| 109 |
+
|
| 110 |
+
**Headline:**
|
| 111 |
+
*(blank β this slide embeds the demo video)*
|
| 112 |
+
|
| 113 |
+
**Embed:**
|
| 114 |
+
The 2β3 minute demo video, looping, autoplay-on-slide-show.
|
| 115 |
+
|
| 116 |
+
*Visual: video player.*
|
| 117 |
+
|
| 118 |
+
---
|
| 119 |
+
|
| 120 |
+
## Slide 7 β Why this is the right submission for Track 3
|
| 121 |
+
|
| 122 |
+
**Headline:**
|
| 123 |
+
Four judging criteria, four deliberate choices.
|
| 124 |
+
|
| 125 |
+
**Two-column layout:**
|
| 126 |
+
|
| 127 |
+
| Judging criterion | Our choice |
|
| 128 |
+
|---|---|
|
| 129 |
+
| **Application of Technology** | Multi-modal pipeline (vision + reasoning + voice) running concurrently on a single MI300X β exactly what Track 3's "massive memory bandwidth of AMD GPUs" was for. |
|
| 130 |
+
| **Presentation** | Demo is *experienced*: judge holds phone, signs HELLO, hears "Hello." 30 seconds, no explanation needed. |
|
| 131 |
+
| **Business Value** | $4B+ existing market (Sorenson VRS comparable), legally-mandated interpretation budgets, open-source so any Deaf-led NGO / ministry / school can self-host on their own AMD compute. |
|
| 132 |
+
| **Originality** | Streaming continuous multi-frame VLM agent for sign language β no peer-reviewed benchmark exists for this approach yet (we checked the literature). Real ASL motion-words, not just fingerspelling. |
|
| 133 |
+
|
| 134 |
+
*Visual: 2Γ2 grid of icons, one per criterion.*
|
| 135 |
+
|
| 136 |
+
---
|
| 137 |
+
|
| 138 |
+
## Slide 8 β Substrate, not product Β· Open Β· Deaf-led future
|
| 139 |
+
|
| 140 |
+
**Headline:**
|
| 141 |
+
SignBridge is a substrate. Deaf-led teams are the deployers.
|
| 142 |
+
|
| 143 |
+
**Body:**
|
| 144 |
+
- **MIT-licensed**, code at github.com/seekerPrice/signbridge β anyone can self-host.
|
| 145 |
+
- **ASL only V1 is a scope decision.** BSL, MSL, CSL, ISL, +200 sign languages each deserve their own teams, training data, and Deaf community leadership. (Citing Bragg et al., *"Systemic Biases in Sign Language AI Research"*, arXiv 2403.02563.)
|
| 146 |
+
- **Privacy by default** β frames and audio are processed in-memory and not persisted server-side beyond the request lifetime.
|
| 147 |
+
|
| 148 |
+
**Closing line (large):**
|
| 149 |
+
The hardest part of accessibility isn't building. It's deploying. AMD makes the deploying possible.
|
| 150 |
+
|
| 151 |
+
*Visual: world map outline with sign-language regional dots; or just the SignBridge logo with the closing tagline.*
|
| 152 |
+
|
| 153 |
+
---
|
| 154 |
+
|
| 155 |
+
## Speaker-note tips (read these before recording)
|
| 156 |
+
|
| 157 |
+
1. **Lead with the human problem (Slide 2), not the architecture.** Architecture is for criterion 1; emotion is what closes criteria 2β4.
|
| 158 |
+
2. **Time the live demo** β 30 seconds max. If it fails, switch to fallback video without comment.
|
| 159 |
+
3. **Always say "AMD MI300X" by name** at least 3 times in the talk track. Sponsors notice.
|
| 160 |
+
4. **End on the substrate framing** β pre-empts the "savior tech" critique that Deaf-AI judges look out for.
|
| 161 |
+
|
| 162 |
+
---
|
| 163 |
+
|
| 164 |
+
## Export
|
| 165 |
+
|
| 166 |
+
Once filled in: File β Download β PDF document β upload to lablab.ai submission form's "Slide Presentation" field.
|