Commit 8b64ea8 · Parent(s): 18d028b
chore: drop Build-in-Public extra challenge
User direction 2026-05-07: focus on Track 3 (Vision & Multimodal AI)
only. Removes the 2 social-post obligation and the external-facing
walkthrough framing. Walkthrough kept as an internal technical record.
Frees ~2 hours of Day 1 + Day 3 effort that was earmarked for tweets
and walkthrough finalisation.
Updated:
- CLAUDE.md: Status / Goal / Success criteria / References / Progress log
- README.md: track line
- docs/walkthrough.md: re-framed as internal-only
- CLAUDE.md +8 -6
- README.md +1 -1
- docs/walkthrough.md +4 -3
CLAUDE.md
CHANGED
@@ -14,19 +14,19 @@ Loaded when the working directory is inside `/Users/lucaslt/Documents/side-gig/a

 ## Status

-Day 1 / ~4 – pivoted from Iris to SignBridge on 2026-05-07. **Submission deadline: 2026-05-11 03:00 MYT.** ~3.5 days remaining. AMD Developer Hackathon, Track 3
+Day 1 / ~4 – pivoted from Iris to SignBridge on 2026-05-07. **Submission deadline: 2026-05-11 03:00 MYT.** ~3.5 days remaining. AMD Developer Hackathon, **Track 3 – Vision & Multimodal AI** (only – Build-in-Public dropped 2026-05-07). Currently scaffolding + Day 1 hello-world.

 ## Goal

-Win the AMD Developer Hackathon (LabLab.ai, May 2026), Track 3
+Win the AMD Developer Hackathon (LabLab.ai, May 2026), Track 3, with a real-time webcam-based ASL → English speech translator. A deaf person signs → AI speaks. The demo IS the project: judges literally see two people who couldn't communicate, now do.

 ### Success criteria

-- Submission accepted by 2026-05-11 03:00 MYT – live HF Space (Gradio) URL +
-- End-to-end working flow: webcam frame
+- Submission accepted by 2026-05-11 03:00 MYT – live HF Space (Gradio) URL + 2–3 min demo video + lablab.ai submission form complete.
+- End-to-end working flow: webcam frame → VLM recognizer → Llama-3.1-8B sentence composer → Coqui XTTS-v2 → speech output. **≤ 2 s** from capture to start of speech.
 - V1 use cases: (1) ASL fingerspelling alphabet A–Z + 0–9, (2) Top-50 WLASL signs (hello, thank you, name, please, …). Target ≥ 75% accuracy on a 30-sample gold set.
 - Reverse direction (speech → on-screen text for the deaf user) is a **stretch** for the buffer day only.
-- Track 3
+- Track 3: top-3 finish at minimum; gold target.

 ---

@@ -142,7 +142,7 @@ git push huggingface main

 - **Working dir:** `/Users/lucaslt/Documents/side-gig/amd-hackathon/`
 - **Hackathon page:** https://lablab.ai/ai-hackathons/amd-developer
 - **AMD article:** https://www.amd.com/en/developer/resources/technical-articles/2026/build-across-the-ai-stack--join-the-amd-x-lablab-ai-hackathon-.html
-- **Track:** 3 (Vision & Multimodal AI)
+- **Track:** 3 (Vision & Multimodal AI). Extra Challenge (Build in Public) intentionally skipped 2026-05-07.
 - **WLASL dataset:** https://github.com/dxli94/WLASL
 - **MediaPipe Holistic:** https://developers.google.com/mediapipe/solutions/vision/holistic_landmarker
 - **HF Hub org / Space:** *fill in once created*

@@ -154,6 +154,8 @@ git push huggingface main

 ## Progress log (newest first)

+**2026-05-07 – Dropped Build-in-Public extra challenge.** Track 3 only. Frees ~2 hours that were earmarked for the 2 social posts + the external-facing walkthrough framing. Walkthrough doc kept as an internal technical record but no longer a submission deliverable.
+
 **2026-05-07 – Pivoted to SignBridge.** Re-scored against the four judging criteria: SignBridge wins on Originality (10) and Presentation (10) thanks to the live deaf-person-to-hearing-person demo. Business value also stronger (Sorenson VRS comparable, mandated interpreter budgets). Replaced Iris scaffold (`iris/` package, README, requirements deps) with `signbridge/` package. CLAUDE.md, plan file, README rewritten. Day 1 hello-world starts: MediaPipe Holistic on webcam, WLASL data download, Plan-B VLM test.

 **2026-05-07 – Initial Iris scaffold (deprecated).** Bootstrapped repo with Iris (visually-impaired navigation) plan, requirements.txt, .gitignore, .env.example, README. Replaced same-day after re-evaluation; kept reusable pieces (.gitignore, structural choices).
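The success criteria above pin accuracy to a concrete number (≥ 75% on a 30-sample gold set). A stdlib-only sketch of how that check could look; the labels and the two simulated mistakes below are synthetic stand-ins, not real recognizer output:

```python
# Minimal gold-set scorer for the V1 target: >= 75% accuracy on a 30-sample set.
# Labels and predictions here are synthetic stand-ins for real recognizer output.

def accuracy(gold: list[str], predicted: list[str]) -> float:
    assert len(gold) == len(predicted)
    hits = sum(g == p for g, p in zip(gold, predicted))
    return hits / len(gold)

gold = ["hello", "thank you", "name", "please", "yes", "no"] * 5  # 30 samples
pred = list(gold)
pred[3] = "sorry"   # simulate a recognizer mistake
pred[17] = "hello"  # and another
acc = accuracy(gold, pred)
print(f"accuracy: {acc:.1%} -> {'PASS' if acc >= 0.75 else 'FAIL'} (target >= 75%)")
```

Keeping the scorer this dumb (exact string match, no partial credit) matches the pass/fail framing of the success criterion.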
README.md
CHANGED
@@ -17,7 +17,7 @@ Two people who couldn't communicate, now can.

 A deaf person signs into the webcam. SignBridge – a multi-stage vision + reasoning + voice pipeline running on a single AMD Instinct MI300X – translates the signs into spoken English in under 2 seconds.

-Submission for the **AMD Developer Hackathon** (LabLab.ai, May 2026)
+Submission for the **AMD Developer Hackathon** (LabLab.ai, May 2026) – **Track 3: Vision & Multimodal AI**.

 ## How it works

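The "under 2 seconds" claim implies a per-stage latency budget across the recognize → compose → speak stages. A runnable sketch of one way to track that budget; the stage bodies are sleep stubs standing in for the real VLM, LLM, and TTS calls, and every name here is illustrative:

```python
import time

# Illustrative capture-to-speech latency budget (<= 2 s total).
# Each stage is a stub: the sleeps stand in for real model inference.
BUDGET_S = 2.0

def timed(stage_fn, *args):
    """Run a stage and return (output, elapsed seconds)."""
    t0 = time.perf_counter()
    out = stage_fn(*args)
    return out, time.perf_counter() - t0

def recognize(frame):  time.sleep(0.01); return "HELLO"         # VLM recognizer stub
def compose(gloss):    time.sleep(0.01); return "Hello there!"  # LLM composer stub
def speak(sentence):   time.sleep(0.01); return b"\x00" * 16    # TTS stub

def pipeline(frame):
    timings = {}
    gloss, timings["recognize"] = timed(recognize, frame)
    sentence, timings["compose"] = timed(compose, gloss)
    audio, timings["tts"] = timed(speak, sentence)
    total = sum(timings.values())
    assert total <= BUDGET_S, f"over budget: {total:.2f}s"
    return audio, timings

audio, timings = pipeline(frame=None)
print({stage: round(dt, 3) for stage, dt in timings.items()})
```

Per-stage timings like these make it obvious which stage to cut (smaller model, lower frame rate) if the end-to-end number creeps past the budget.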
docs/walkthrough.md
CHANGED
@@ -1,8 +1,9 @@
 # SignBridge – technical walkthrough

->
->
->
+> Internal technical record of the build. Not a submission deliverable
+> (Build-in-Public extra challenge was dropped on 2026-05-07).
+> Kept around because it documents the AMD-specific engineering thinking
+> and is useful if anyone later asks "why these design choices?".

 ## What we built
