Commit 8b64ea8 · Parent(s): 18d028b
chore: drop Build-in-Public extra challenge
User direction 2026-05-07: focus on Track 3 (Vision & Multimodal AI)
only. Removes the 2 social-post obligation and the external-facing
walkthrough framing. Walkthrough kept as an internal technical record.
Frees ~2 hours of Day 1 + Day 3 effort that was earmarked for tweets
and walkthrough finalisation.
Updated:
- CLAUDE.md: Status / Goal / Success criteria / References / Progress log
- README.md: track line
- docs/walkthrough.md: re-framed as internal-only
- CLAUDE.md +8 -6
- README.md +1 -1
- docs/walkthrough.md +4 -3
CLAUDE.md
CHANGED
@@ -14,19 +14,19 @@ Loaded when the working directory is inside `/Users/lucaslt/Documents/side-gig/a

 ## Status

-Day 1 / ~4 – pivoted from Iris to SignBridge on 2026-05-07. **Submission deadline: 2026-05-11 03:00 MYT.** ~3.5 days remaining. AMD Developer Hackathon, Track 3
+Day 1 / ~4 – pivoted from Iris to SignBridge on 2026-05-07. **Submission deadline: 2026-05-11 03:00 MYT.** ~3.5 days remaining. AMD Developer Hackathon, **Track 3 – Vision & Multimodal AI** (only – Build-in-Public dropped 2026-05-07). Currently scaffolding + Day 1 hello-world.

 ## Goal

-Win the AMD Developer Hackathon (LabLab.ai, May 2026), Track 3
+Win the AMD Developer Hackathon (LabLab.ai, May 2026), Track 3, with a real-time webcam-based ASL → English speech translator. A deaf person signs → AI speaks. The demo IS the project: judges literally see two people who couldn't communicate, now do.

 ### Success criteria

-- Submission accepted by 2026-05-11 03:00 MYT – live HF Space (Gradio) URL +
-- End-to-end working flow: webcam frame
+- Submission accepted by 2026-05-11 03:00 MYT – live HF Space (Gradio) URL + 2–3 min demo video + lablab.ai submission form complete.
+- End-to-end working flow: webcam frame → VLM recognizer → Llama-3.1-8B sentence composer → Coqui XTTS-v2 → speech output. **≤ 2 s** from capture to start of speech.
 - V1 use cases: (1) ASL fingerspelling alphabet A–Z + 0–9, (2) Top-50 WLASL signs (hello, thank you, name, please, …). Target ≥ 75% accuracy on a 30-sample gold set.
 - Reverse direction (speech → on-screen text for the deaf user) is a **stretch** for the buffer day only.
-- Track 3
+- Track 3: top-3 finish at minimum; gold target.

 ---

@@ -142,7 +142,7 @@ git push huggingface main

 - **Working dir:** `/Users/lucaslt/Documents/side-gig/amd-hackathon/`
 - **Hackathon page:** https://lablab.ai/ai-hackathons/amd-developer
 - **AMD article:** https://www.amd.com/en/developer/resources/technical-articles/2026/build-across-the-ai-stack--join-the-amd-x-lablab-ai-hackathon-.html
-- **Track:** 3 (Vision & Multimodal AI)
+- **Track:** 3 (Vision & Multimodal AI). Extra Challenge (Build in Public) intentionally skipped 2026-05-07.
 - **WLASL dataset:** https://github.com/dxli94/WLASL
 - **MediaPipe Holistic:** https://developers.google.com/mediapipe/solutions/vision/holistic_landmarker
 - **HF Hub org / Space:** *fill in once created*

@@ -154,6 +154,8 @@ git push huggingface main

 ## Progress log (newest first)

+**2026-05-07 – Dropped Build-in-Public extra challenge.** Track 3 only. Frees ~2 hours that were earmarked for the 2 social posts + the external-facing walkthrough framing. Walkthrough doc kept as an internal technical record but no longer a submission deliverable.
+
 **2026-05-07 – Pivoted to SignBridge.** Re-scored against the four judging criteria: SignBridge wins on Originality (10) and Presentation (10) thanks to the live deaf-person-to-hearing-person demo. Business value also stronger (Sorenson VRS comparable, mandated interpreter budgets). Replaced Iris scaffold (`iris/` package, README, requirements deps) with `signbridge/` package. CLAUDE.md, plan file, README rewritten. Day 1 hello-world starts: MediaPipe Holistic on webcam, WLASL data download, Plan-B VLM test.

 **2026-05-07 – Initial Iris scaffold (deprecated).** Bootstrapped repo with Iris (visually-impaired navigation) plan, requirements.txt, .gitignore, .env.example, README. Replaced same-day after re-evaluation; kept reusable pieces (.gitignore, structural choices).
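The success criteria above pin accuracy to a concrete number (≥ 75% on a 30-sample gold set). A stdlib-only sketch of how that check could look; the labels and the two simulated mistakes below are synthetic stand-ins, not real recognizer output:

```python
# Minimal gold-set scorer for the V1 target: >= 75% accuracy on a 30-sample set.
# Labels and predictions here are synthetic stand-ins for real recognizer output.

def accuracy(gold: list[str], predicted: list[str]) -> float:
    assert len(gold) == len(predicted)
    hits = sum(g == p for g, p in zip(gold, predicted))
    return hits / len(gold)

gold = ["hello", "thank you", "name", "please", "yes", "no"] * 5  # 30 samples
pred = list(gold)
pred[3] = "sorry"   # simulate a recognizer mistake
pred[17] = "hello"  # and another
acc = accuracy(gold, pred)
print(f"accuracy: {acc:.1%} -> {'PASS' if acc >= 0.75 else 'FAIL'} (target >= 75%)")
```

Keeping the scorer this dumb (exact string match, no partial credit) matches the pass/fail framing of the success criterion.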
README.md
CHANGED
@@ -17,7 +17,7 @@ Two people who couldn't communicate, now can.

 A deaf person signs into the webcam. SignBridge – a multi-stage vision + reasoning + voice pipeline running on a single AMD Instinct MI300X – translates the signs into spoken English in under 2 seconds.

-Submission for the **AMD Developer Hackathon** (LabLab.ai, May 2026)
+Submission for the **AMD Developer Hackathon** (LabLab.ai, May 2026) – **Track 3: Vision & Multimodal AI**.

 ## How it works

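The "under 2 seconds" claim implies a per-stage latency budget across the recognize → compose → speak stages. A runnable sketch of one way to track that budget; the stage bodies are sleep stubs standing in for the real VLM, LLM, and TTS calls, and every name here is illustrative:

```python
import time

# Illustrative capture-to-speech latency budget (<= 2 s total).
# Each stage is a stub: the sleeps stand in for real model inference.
BUDGET_S = 2.0

def timed(stage_fn, *args):
    """Run a stage and return (output, elapsed seconds)."""
    t0 = time.perf_counter()
    out = stage_fn(*args)
    return out, time.perf_counter() - t0

def recognize(frame):  time.sleep(0.01); return "HELLO"         # VLM recognizer stub
def compose(gloss):    time.sleep(0.01); return "Hello there!"  # LLM composer stub
def speak(sentence):   time.sleep(0.01); return b"\x00" * 16    # TTS stub

def pipeline(frame):
    timings = {}
    gloss, timings["recognize"] = timed(recognize, frame)
    sentence, timings["compose"] = timed(compose, gloss)
    audio, timings["tts"] = timed(speak, sentence)
    total = sum(timings.values())
    assert total <= BUDGET_S, f"over budget: {total:.2f}s"
    return audio, timings

audio, timings = pipeline(frame=None)
print({stage: round(dt, 3) for stage, dt in timings.items()})
```

Per-stage timings like these make it obvious which stage to cut (smaller model, lower frame rate) if the end-to-end number creeps past the budget.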
docs/walkthrough.md
CHANGED
@@ -1,8 +1,9 @@
 # SignBridge – technical walkthrough

->
->
->
+> Internal technical record of the build. Not a submission deliverable
+> (Build-in-Public extra challenge was dropped on 2026-05-07).
+> Kept around because it documents the AMD-specific engineering thinking
+> and is useful if anyone later asks "why these design choices?".

 ## What we built
