Commit · 4a7c575
1 Parent(s): 0e19ba2
add Dockerfile + HF Space frontmatter for hosted deploy

Multi-stage Dockerfile: Node 22 + pnpm builds the frontend in stage 1,
Python 3.12 + CPU-only torch + sentence-transformers serves it via
FastAPI in stage 2. The backend serves the built dist/ as static files,
so it's one container, one process, one port.

requirements-docker.txt pins the PyTorch CPU wheel index so the build
doesn't pull ~2GB of unusable CUDA wheels on HF Spaces' free CPU instance.
The base requirements.txt stays platform-neutral for local conda dev.

README gets HF Space YAML frontmatter (sdk: docker, app_port: 7860) and
a hosting section walking through both local docker run and the HF push.

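The "one container, one process, one port" layout described above can be sketched without any framework. The commit itself uses FastAPI; the block below deliberately swaps in Python's standard-library `http.server` so it runs with no dependencies. The `/api/` prefix, the `{"status": "ok"}` payload, and the helper names `AppHandler`, `start_server`, and `fetch` are assumptions made for illustration, not names taken from the project.

```python
import http.client
import json
import threading
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class AppHandler(SimpleHTTPRequestHandler):
    """Answer /api/ requests with JSON; serve static files for everything else."""

    def do_GET(self):
        if self.path.startswith("/api/"):
            # Hypothetical API route standing in for the real backend.
            body = json.dumps({"status": "ok"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            # Fall through to the static file handler rooted at `directory`.
            super().do_GET()

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def start_server(dist_dir, port=0):
    """Start one server for both API and static files; port=0 picks a free port."""
    handler = partial(AppHandler, directory=str(dist_dir))
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(server, path):
    """GET a path from the running server and return the body as text."""
    host, port = server.server_address[:2]
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        return conn.getresponse().read().decode("utf-8")
    finally:
        conn.close()
```

Calling `start_server("frontend/dist")` and then `fetch(server, "/")` would return the built `index.html`, while `fetch(server, "/api/health")` returns the JSON payload, both from the same port.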
- .dockerignore +43 -0
- Dockerfile +72 -0
- README.md +50 -0
- requirements-docker.txt +12 -0
- requirements.txt +4 -0
.dockerignore
ADDED
@@ -0,0 +1,43 @@
+.git
+.gitignore
+.github
+.vscode
+.idea
+.claude
+.code-review-graph
+.ruff_cache
+.pre-commit-config.yaml
+ruff.toml
+
+# Keep build deterministic; rebuild indexes inside the container
+data/vector_store/
+data/pick_index/
+data/faiss_store/
+
+# Logs and ephemeral state
+logs/
+mlflow.db
+mlruns/
+*.csv
+
+# Local dev artefacts
+**/__pycache__/
+**/*.pyc
+**/*.pyo
+**/.pytest_cache
+**/.mypy_cache
+
+# Frontend build artefacts (rebuilt in Docker stage 1)
+frontend/node_modules/
+frontend/dist/
+
+# Misc
+.DS_Store
+*.swp
+.env
+ProjectDetails.pdf
+docs/
+
+README.md
+lessons.md
+references.md
Dockerfile
ADDED
@@ -0,0 +1,72 @@
+# ── Stage 1: build the React frontend ────────────────────────────────────────
+FROM node:22-slim AS frontend
+
+WORKDIR /app/frontend
+
+# pnpm via corepack (ships with Node 22)
+RUN corepack enable
+
+COPY frontend/package.json frontend/pnpm-lock.yaml ./
+RUN pnpm install --frozen-lockfile
+
+COPY frontend/ ./
+RUN pnpm build
+
+# ── Stage 2: Python runtime ──────────────────────────────────────────────────
+FROM python:3.12-slim
+
+# HF_HOME points at a writable cache dir for transformers/sentence-transformers.
+# On HF Spaces the default $HOME is read-only at runtime, so we explicitly
+# steer the model cache somewhere writable.
+ENV PYTHONDONTWRITEBYTECODE=1 \
+    PYTHONUNBUFFERED=1 \
+    PIP_NO_CACHE_DIR=1 \
+    HF_HOME=/tmp/hf_cache \
+    XDG_CACHE_HOME=/tmp/.cache
+
+WORKDIR /app
+
+# System deps for torch + sentence-transformers (most are already in slim).
+RUN apt-get update \
+    && apt-get install -y --no-install-recommends \
+        build-essential \
+        curl \
+    && rm -rf /var/lib/apt/lists/*
+
+# Install Python deps via a Docker-specific requirements file that pins torch
+# to the CPU-only wheel index. The base requirements.txt stays platform-neutral
+# so local conda dev (./setup.sh) keeps using whatever torch flavor your OS
+# wants (MPS on macOS, CUDA on Linux+GPU); this image deliberately uses CPU
+# only because HF Spaces' free CPU instance can't use CUDA anyway.
+COPY requirements.txt requirements-docker.txt ./
+RUN pip install --upgrade pip \
+    && pip install --retries 5 --timeout 120 -r requirements-docker.txt
+
+# Copy the backend + persona source data.
+COPY backend/ ./backend/
+COPY data/memories/ ./data/memories/
+COPY data/users.json ./data/users.json
+COPY data/generate_users.py ./data/generate_users.py
+
+# Build per-user vector indexes inside the image (downloads BGE on first run).
+# This bakes the indexes into the image so first-request latency is just the
+# model warm-up, not a fresh BGE encode of every persona.
+RUN python -m backend.retrieval.vector_store
+
+# Pull the built static frontend from stage 1.
+COPY --from=frontend /app/frontend/dist ./frontend/dist
+
+# Pre-create writable directories. HF Spaces filesystem is read-only outside
+# /tmp at runtime, so logs default to /tmp; locally you can override LOGS_DIR
+# via env to anything mounted/writable.
+RUN mkdir -p /tmp/logs /tmp/hf_cache /tmp/.cache /tmp/pick_index \
+    && chmod -R 777 /tmp/logs /tmp/hf_cache /tmp/.cache /tmp/pick_index
+ENV LOGS_DIR=/tmp/logs
+
+# HF Spaces expects 7860 by default; respects $PORT for local docker run.
+ENV PORT=7860
+EXPOSE 7860
+
+# sh -c expands $PORT at runtime so the same image runs both on HF (port 7860,
+# unset PORT or PORT=7860) and locally (e.g. `docker run -e PORT=8000 ...`).
+CMD sh -c "uvicorn backend.api.main:app --host 0.0.0.0 --port ${PORT:-7860}"
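The `${PORT:-7860}` expansion in the final `CMD` falls back to 7860 when `PORT` is unset or empty. As a rough illustration of that rule, here is the same logic as a small Python helper. The name `resolve_port` and the range check are assumptions added for the sketch; the Dockerfile itself leaves this to the shell.

```python
import os

DEFAULT_PORT = 7860  # the HF Spaces default named in the Dockerfile


def resolve_port(env=None):
    """Mimic the shell expansion ${PORT:-7860}.

    The default applies when PORT is unset *or* set to an empty string,
    which is what the ':-' form (as opposed to plain '-') means in sh.
    """
    env = os.environ if env is None else env
    raw = env.get("PORT") or str(DEFAULT_PORT)
    port = int(raw)
    if not 0 < port < 65536:
        # Hypothetical guard; the shell form performs no validation.
        raise ValueError(f"PORT out of range: {port}")
    return port
```

With `docker run -e PORT=8000 ...` the equivalent call `resolve_port({"PORT": "8000"})` yields 8000, while an empty or missing variable yields 7860.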
README.md
CHANGED
@@ -1,3 +1,14 @@
+---
+title: Multimodal AAC Chatbot
+emoji: πΈ
+colorFrom: pink
+colorTo: indigo
+sdk: docker
+app_port: 7860
+pinned: false
+license: other
+---
+
 # Multimodal AAC Chatbot
 
 A chatbot that **speaks as an AAC user, not to them.** You pick a persona (fourteen are shipped, anchored in real memoirs and canonical fiction) and the partner talks to them. The bot replies in that person's voice, using their memories, and adjusts what it says based on what the webcam sees: facial expression, hand gestures, where they're looking, and letters they trace in the air.
@@ -14,6 +25,7 @@ It's a training-free agentic RAG pipeline: a plain Python function chain with
 - [Setup](#setup)
 - [Configuration](#configuration)
 - [Running the Project](#running-the-project)
+- [Hosting](#hosting)
 - [Project Structure](#project-structure)
 - [Personas](#personas)
 - [Team](#team)
@@ -318,6 +330,44 @@ Output covers latency quantiles + SLO pass rate, faithfulness (groundedness / ha
 
 ---
 
+## Hosting
+
+The project ships with a single [Dockerfile](Dockerfile) that builds the React frontend in stage 1 (Node 22 + pnpm) and runs the FastAPI backend in stage 2 (Python 3.12 + torch + sentence-transformers). The backend serves the built `frontend/dist/` as static files, so it's one container, one process, one port.
+
+The same image runs identically in two places.
+
+### Locally (for development that mirrors production)
+
+```bash
+docker build -t aac-chatbot .
+docker run --rm -p 8000:8000 -e PORT=8000 --env-file .env aac-chatbot
+# → http://localhost:8000
+```
+
+The `--env-file .env` injects your Ollama Cloud key + endpoints (same `.env` you use for `./run.sh`). Conda + `./run.sh` is still the fastest dev loop because it hot-reloads; the docker path is for when you want byte-identical-to-production behaviour.
+
+### On Hugging Face Spaces (public URL for graders)
+
+The repo doubles as an HF Space: `README.md` carries the YAML frontmatter HF needs (`sdk: docker`, `app_port: 7860`).
+
+1. Create a new Space on huggingface.co (Docker SDK, public).
+2. Add this repo as a remote:
+   ```bash
+   git remote add space https://huggingface.co/spaces/<your-username>/aac-chatbot
+   git push space main
+   ```
+3. In the Space's *Settings → Variables and secrets*, add the LLM-tier secrets (don't commit them):
+   - `PRIMARY_API_KEY`, `PRIMARY_BASE_URL`, `PRIMARY_MODEL`
+   - `FALLBACK_API_KEY`, `FALLBACK_BASE_URL`, `FALLBACK_MODEL`
+   - `INK_VISION_API_KEY`, `INK_VISION_BASE_URL`, `INK_VISION_MODEL`
+4. The Space rebuilds the Dockerfile on every push. First build takes ~5-8 min (downloads BGE + builds vector indexes for all personas); subsequent builds reuse Docker layer cache and finish in 2-3 min.
+
+The deployed instance won't persist `logs/` or `data/pick_index/` across container restarts (HF Spaces filesystem is read-only outside `/tmp`). For the writeup, your local logs are the source of truth; the Space is just a click-around demo for graders.
+
+**Webcam note.** `getUserMedia` requires a secure context. HF Spaces serves over HTTPS, and browsers treat `http://localhost` as secure even without TLS. A plain-HTTP LAN IP is not a secure context, so don't try to demo from one without an HTTPS tunnel.
+
+---
+
 ## Project Structure
 
 ```
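The webcam note in the Hosting section comes down to whether the browser considers the page's origin trustworthy enough to expose `getUserMedia`. The helper below is a simplified approximation of that check, written for illustration only: real browsers apply more rules than this, and the name `is_secure_origin` and the example URLs are assumptions.

```python
import ipaddress
from urllib.parse import urlsplit


def is_secure_origin(url):
    """Approximate the browser's 'potentially trustworthy origin' test.

    Simplified on purpose: it accepts TLS schemes, localhost names, and
    loopback IP literals, and rejects everything else.
    """
    parts = urlsplit(url)
    if parts.scheme in ("https", "wss"):
        return True
    host = parts.hostname or ""
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        # Loopback literals such as 127.0.0.1 or ::1 count as secure.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, and not HTTPS or localhost.
        return False
```

Under this sketch `http://localhost:8000` and an HTTPS Space URL pass, while a LAN address such as `http://192.168.1.20:8000` does not, which matches the advice to use a tunnel for LAN demos.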
requirements-docker.txt
ADDED
@@ -0,0 +1,12 @@
+# Docker / HF Space install: uses CPU-only torch to avoid pulling ~2GB of
+# CUDA wheels we can't use on HF Spaces' free CPU instance.
+#
+# The PyTorch CPU index serves wheels with a `+cpu` local-version tag.
+# pip does not rank indexes by order; for the same release, `X.Y.Z+cpu` sorts
+# above plain `X.Y.Z`, so torch resolves to the CPU wheel. PyPI covers the rest.
+--index-url https://download.pytorch.org/whl/cpu
+--extra-index-url https://pypi.org/simple
+
+# Everything from the standard requirements file. Torch will resolve to the
+# CPU wheel from the index above; the rest from PyPI.
+-r requirements.txt
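Why the `+cpu` wheel wins can be checked with PEP 440 version ordering, under which a local-version label sorts above the same public version. The sketch below uses the third-party `packaging` library and hypothetical version numbers. It models only the version comparison, not pip's full candidate selection (wheel tags, hashes, yanked releases), and `pick_best` is an illustrative name rather than a pip API.

```python
from packaging.version import Version  # third-party "packaging" library


def pick_best(candidates):
    """Return the highest version string under PEP 440 ordering."""
    return max(candidates, key=Version)


# Hypothetical candidates: the same release seen on both indexes.
print(pick_best(["2.3.0", "2.3.0+cpu"]))  # → 2.3.0+cpu

# If PyPI carried a newer release than the CPU index, the newer one would win.
print(pick_best(["2.4.0", "2.3.0+cpu"]))  # → 2.4.0
```

The second case suggests the setup relies on both indexes carrying the same torch releases; pinning an exact `+cpu` version would be one way to remove that dependency.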
requirements.txt
CHANGED
@@ -3,6 +3,10 @@ openai>=1.0 # talks to Ollama Cloud over OpenAI-compatible HTTP
 
 # ── Retrieval ─────────────────────────────────────────────────────────────────
 sentence-transformers>=3.0
+# torch is installed separately in the Dockerfile from the CPU-only wheel
+# index. For local conda dev (./setup.sh) we use the default platform torch
+# (which on macOS gives MPS acceleration). The constraint is satisfied either
+# way; the comment is here so a fresh reader doesn't try to pin it.
 torch>=2.0
 transformers>=4.40
 numpy>=1.24