# Hugging Face Spaces Deployment
How we deployed the PaperHawk Streamlit application as a public Hugging Face Space, with the AMD MI300X vLLM endpoint as its inference backend.
---
## What you get
- **Public Space URL**: a Streamlit app anyone can use in a browser, no signup
- **Free CPU Basic tier**: 16 GB RAM, 2 vCPU. The app runs here; the LLM runs on AMD MI300X via vLLM (on a separate cloud droplet).
- **Two paths**: under the `lablab-ai-amd-developer-hackathon` org (Plan A, qualifies for the HF Special Prize), or under your personal account (Plan B, a fallback if the org has hardware-quota issues)
Live example: <https://huggingface.co/spaces/Vincsipe/paperhawk>
---
## Prerequisites
1. Hugging Face account (free)
2. **Optional**: membership in the `lablab-ai-amd-developer-hackathon` org if submitting to the AMD Developer Hackathon (Plan A). The HF Special Prize requires the Space to live under this org.
3. A running vLLM endpoint on AMD MI300X; see [`AMD_DEPLOYMENT.md`](AMD_DEPLOYMENT.md)
4. The PaperHawk repo cloned locally with `Dockerfile`, `README.md`, and `app/main.py`
---
## Step 1 β Create the Space
Go to <https://huggingface.co/new-space> (or, if you're an org member, click `+ New` → `New Space` from the org page).
**Configuration**:
| Field | Value |
|---|---|
| Owner | `lablab-ai-amd-developer-hackathon` (Plan A) or your personal handle (Plan B) |
| Space name | `paperhawk` |
| Short description | `Real-DI-Audit/14 rules/6 anti-halluc/LangGraph/Qwen/MI300X` |
| License | `mit` |
| **Space SDK** | **Docker** (not Streamlit, not Gradio; see Step 2) |
| **Template** | **Blank** (we ship our own Dockerfile) |
| Hardware | CPU Basic (free, 16 GB RAM) |
| Visibility | Public (required for the HF Special Prize) |
Click **Create Space**. You'll get an empty repo at:
```
https://huggingface.co/spaces/<owner>/paperhawk
```
**Why Docker SDK and not Streamlit-template?** As of 2026, the HF Spaces "Streamlit" SDK lives under the Docker tab as a managed template. We bypass the template because PaperHawk needs custom OS dependencies (Tesseract OCR for EN/HU/DE, poppler-utils for table extraction, libmupdf for PDFs) that the templated builder doesn't include. Our own Dockerfile is faster to debug and gives us a deterministic base image.
---
## Step 2 β Configure the Dockerfile for HF Spaces
The PaperHawk Dockerfile is HF-Spaces-ready out of the box, with one critical detail: **port 7860**.
```dockerfile
# syntax=docker/dockerfile:1.6
FROM python:3.12-slim AS base
ENV PYTHONUNBUFFERED=1 PYTHONDONTWRITEBYTECODE=1
# OS deps
RUN apt-get update && apt-get install -y --no-install-recommends \
tesseract-ocr tesseract-ocr-eng tesseract-ocr-hun tesseract-ocr-deu \
poppler-utils libmupdf-dev curl \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY requirements.txt .
RUN pip install --upgrade pip \
&& pip install --index-url https://download.pytorch.org/whl/cpu torch \
&& pip install -r requirements.txt
# Pre-download the embedding model so the first user request isn't slow
RUN python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('BAAI/bge-m3')"
COPY . .
# HF Spaces expects port 7860 (NOT Streamlit's default 8501)
EXPOSE 7860
CMD ["streamlit", "run", "app/main.py", \
"--server.address=0.0.0.0", \
"--server.port=7860", \
"--server.headless=true"]
```
**Why 7860?** HF Spaces' Docker hosting only routes traffic to port 7860; Streamlit's default 8501 is invisible at the public URL. This is a one-line fix that's easy to miss.
---
## Step 3 β Configure the README YAML front-matter
HF Spaces reads the YAML block at the top of `README.md` to configure the Space card and build behavior. PaperHawk's:
```yaml
---
title: PaperHawk
emoji: 🦅
colorFrom: red
colorTo: yellow
sdk: docker
pinned: false
license: mit
short_description: Real-DI-Audit/14 rules/6 anti-halluc/LangGraph/Qwen/MI300X
---
```
**Critical**: `colorTo` must be one of `[red, yellow, green, blue, indigo, purple, pink, gray]`. We initially used `orange` (because the AMD brand color is orange); HF rejected the YAML as invalid, and the Space card fell back to a generic theme **with the YAML rendered as a Markdown table at the top of the page**. Fixed by changing to `yellow`.
If the Space's main page shows a `title | PaperHawk` table at the top, the YAML is invalid and HF can't parse it β check the `colorTo` value first.
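You can lint the front-matter locally before pushing. A minimal sketch using a naive `key: value` parse (not a full YAML parser); the allowed-value sets mirror the lists in this guide plus common HF `sdk` values, so treat them as assumptions rather than the authoritative HF schema:

```python
# Minimal front-matter lint for an HF Spaces README.md.
# Catches the colorTo/sdk/missing-"---" mistakes described above.
ALLOWED_COLORS = {"red", "yellow", "green", "blue", "indigo", "purple", "pink", "gray"}
ALLOWED_SDKS = {"docker", "gradio", "streamlit", "static"}  # assumed set

def lint_front_matter(readme_text: str) -> list[str]:
    errors = []
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ["README must start with '---' on the very first line (no BOM/whitespace)"]
    try:
        end = lines[1:].index("---") + 1  # index of the closing '---'
    except ValueError:
        return ["front-matter block is never closed with '---'"]
    meta = {}
    for line in lines[1:end]:
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    if meta.get("sdk") not in ALLOWED_SDKS:
        errors.append(f"invalid sdk: {meta.get('sdk')!r}")
    for key in ("colorFrom", "colorTo"):
        if meta.get(key) not in ALLOWED_COLORS:
            errors.append(f"invalid {key}: {meta.get(key)!r} (note: orange is not allowed)")
    return errors

bad = "---\ntitle: PaperHawk\nsdk: docker\ncolorFrom: red\ncolorTo: orange\n---\n"
good = bad.replace("orange", "yellow")
print(lint_front_matter(bad))   # flags colorTo: orange
print(lint_front_matter(good))  # []
```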
---
## Step 4 β Set up Git LFS for binary assets
HF Spaces has a strict rule: every binary file (`*.png`, `*.pdf`, `*.pptx`, `*.docx`, `*.jpg`, `*.mp4`) must live in **Xet storage** via Git LFS, not as a regular Git blob. The cover PNG, the slide PDF, the demo packages: all of these get rejected without LFS.
On your local machine:
```bash
# One-time, in any repo with binary files
sudo apt install git-lfs # or `brew install git-lfs` on macOS
git lfs install
```
In the PaperHawk repo:
```bash
git lfs track "*.png" "*.pdf" "*.pptx" "*.docx" "*.jpeg" "*.jpg" "*.mp4"
git add .gitattributes
git commit -m "Track binary files via LFS"
```
**Important**: `git lfs track` only updates `.gitattributes`. Existing commits with binaries-as-Git-blob are still rejected by HF. Migrate the entire history:
```bash
git lfs migrate import --include="*.png,*.pdf,*.pptx,*.docx,*.jpeg,*.jpg,*.mp4"
```
This rewrites the branch history so the binaries become LFS pointers. The next `git push` uploads them via Xet.
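To check locally whether a given path would be covered by those patterns, a small stdlib sketch (git itself does the real matching via `.gitattributes`; this only mirrors the pattern list above):

```python
from fnmatch import fnmatch

# The patterns passed to `git lfs track` above
TRACKED = ["*.png", "*.pdf", "*.pptx", "*.docx", "*.jpeg", "*.jpg", "*.mp4"]

def is_lfs_tracked(path: str) -> bool:
    """True if the file's basename matches any tracked LFS pattern."""
    name = path.rsplit("/", 1)[-1]
    return any(fnmatch(name, pat) for pat in TRACKED)

print(is_lfs_tracked("docs/cover.png"))  # True
print(is_lfs_tracked("app/main.py"))     # False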
**Files over 10 MB**: HF Spaces also enforces a 10 MB hard limit per file even via LFS for the free Spaces tier. Any single video over 10 MB will be rejected. If you have demo videos, keep them as separate uploads on YouTube/Vimeo and link from the Space description.
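The limit only surfaces when the push fails, so it helps to scan for oversized files first. A stdlib sketch (the threshold matches the 10 MB limit described above):

```python
import os

LIMIT = 10 * 1024 * 1024  # the per-file hard limit described above

def files_over_limit(root: str) -> list[tuple[str, int]]:
    """Return (path, size) for every regular file under root larger than LIMIT."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d != ".git"]  # skip git internals
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # broken symlink etc.
            if size > LIMIT:
                hits.append((path, size))
    return hits

for path, size in files_over_limit("."):
    print(f"{path}: {size / 2**20:.1f} MiB -- too big for the Space repo")
```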
---
## Step 5 β Add the Space as a git remote and push
```bash
# Add a remote for the Space (token embedded in URL avoids dual auth-prompts)
HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxxxxxx # generate at https://huggingface.co/settings/tokens (Write scope, fine-grained, with org access if Plan A)
git remote add space https://<your-hf-username>:${HF_TOKEN}@huggingface.co/spaces/<owner>/paperhawk
# Push to the Space
git push --force space main
```
**Why token in URL?** Git LFS uses a separate authentication channel from the regular Git push. Without the token in the URL, Git prompts for credentials twice and one of them silently times out. Putting the token in the URL handles both.
The first push uploads ~9 MB of LFS objects (the cover image, slide PDF, sample PDFs, sample DOCX). Subsequent pushes are fast (cached on HF's side).
---
## Step 6 β Add Space secrets
The app reads its LLM provider config from environment variables. In the Space:
**Settings** (top-right on the Space page) → **Variables and secrets** → **+ New variable** for each:
| Key | Value | Type |
|---|---|---|
| `LLM_PROFILE` | `vllm` | Variable |
| `VLLM_BASE_URL` | `http://<MI300X_DROPLET_IP>:8000/v1` | Variable |
| `VLLM_MODEL` | `Qwen/Qwen2.5-14B-Instruct` | Variable |
| `EMBEDDING_MODEL` | `BAAI/bge-m3` | Variable |
| `VLLM_API_KEY` | `sk-paperhawk-2026` (the same token you passed to vLLM `--api-key`) | **Secret** |
The `VLLM_API_KEY` must be a **Secret**, not a Variable: Secrets are masked in the UI and are not exposed via the public Space metadata.
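For reference, this is roughly how an app reads these values at startup. The variable names come from the table above; the function name and fallback defaults are illustrative assumptions, not PaperHawk's actual code:

```python
import os

def load_llm_config() -> dict:
    """Read the Space's variables/secrets from the environment."""
    return {
        "profile": os.environ.get("LLM_PROFILE", "vllm"),
        "base_url": os.environ.get("VLLM_BASE_URL", "http://localhost:8000/v1"),
        "model": os.environ.get("VLLM_MODEL", "Qwen/Qwen2.5-14B-Instruct"),
        "embedding_model": os.environ.get("EMBEDDING_MODEL", "BAAI/bge-m3"),
        "api_key": os.environ.get("VLLM_API_KEY", ""),  # Secret: never log this
    }

cfg = load_llm_config()
assert cfg["base_url"].endswith("/v1"), "vLLM base URL must include the /v1 suffix"
```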
After saving, the Space rebuilds automatically (~5 minutes for first build, faster for subsequent).
---
## Step 7 β Wait for the build, then verify
The first build pulls and installs everything: Python 3.12-slim, OS deps, the PyTorch CPU wheel, the BAAI/bge-m3 model (~2.3 GB pre-download), and the rest of `requirements.txt`. Expect 8–15 minutes for the cold build.
Watch the build logs in the Space's **Logs** tab. When you see `streamlit run app/main.py` and `You can now view your Streamlit app in your browser`, the Space is up.
Open the Space URL in a browser and click **Audit Demo**. If the vLLM endpoint is reachable, you'll see results in 20–25 seconds.
If you get an error like `Connection refused` or a long hang, check:
1. The MI300X droplet is running and `vllm serve` is up (SSH in and check the serve session from `AMD_DEPLOYMENT.md` Step 6)
2. The droplet's UFW has port 8000 open (run `ufw status | grep 8000` on the droplet)
3. The `VLLM_BASE_URL` variable in the Space settings matches the droplet's current public IP (which changes on every recreate-from-snapshot)
---
## Step 8 β Hide the YAML from the GitHub display (optional)
The YAML front-matter is needed for HF Spaces but **looks ugly on GitHub**: the renderer shows it as a `key | value` table at the top of the README, with no formatting.
Workaround: GitHub honors `.github/README.md` over the root `README.md` for the public repo display. We commit a copy of the README **without** the YAML block as `.github/README.md`:
```bash
mkdir -p .github
tail -n +12 README.md > .github/README.md # skip the first 11 lines (the YAML + blank line)
# (optionally edit .github/README.md to use absolute raw-image URLs for paperhawk.jpeg)
git add .github/README.md
git commit -m "Add .github/README.md to hide HF YAML on GitHub display"
git push origin main
```
Now GitHub shows `.github/README.md` (clean), and HF Spaces still reads the root `README.md` (with YAML). One file, two faces.
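Note that `tail -n +12` silently breaks if the YAML block ever changes length. A sketch that strips the block by its `---` delimiters instead (the `strip_front_matter` helper is hypothetical, not part of the repo):

```python
def strip_front_matter(text: str) -> str:
    """Drop a leading '---' ... '---' YAML block, if present."""
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return text  # no front-matter: nothing to strip
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return "".join(lines[i + 1:]).lstrip("\n")
    return text  # unterminated block: leave the file untouched

demo = "---\ntitle: PaperHawk\nsdk: docker\n---\n\n# PaperHawk\nbody\n"
print(strip_front_matter(demo))  # -> "# PaperHawk\nbody\n"
```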
---
## Plan A vs Plan B
| Aspect | Plan A (org Space) | Plan B (personal Space) |
|---|---|---|
| Owner | `lablab-ai-amd-developer-hackathon/paperhawk` | `<your-handle>/paperhawk` |
| HF Special Prize | ✅ Qualifies | ❌ Disqualifies |
| Org-quota dependency | ⚠️ Yes (shared with other org Spaces) | ✅ Independent |
| Visibility | Public, on the org page | Public, on your profile |
| Setup steps | Same as above | Same as above |
If the org quota is exhausted (we hit `null quota limit` 403 errors), the same code, Dockerfile, YAML, and env-var setup can be pushed to a personal Space and run immediately. This was our Plan B safety net during the hackathon.
---
## Common pitfalls
- **"Build failed: app port 7860 not reachable"**: Your Dockerfile is binding to a different port (probably Streamlit's default 8501). Change `EXPOSE` and `CMD` to use 7860.
- **YAML rendered as a Markdown table on the Space main page**: The YAML is invalid. Most likely culprits: invalid `colorTo` (allowed: red/yellow/green/blue/indigo/purple/pink/gray, **not** orange), invalid `sdk`, missing `---` opening line, BOM/whitespace before the first `---`.
- **"binary files require Xet"**: You haven't run `git lfs track` + `git lfs migrate import` yet. The HF push rejects committed binaries that aren't LFS-blobs.
- **"Files larger than 10 MiB are not allowed"**: A single file is over 10 MB even after LFS. Move it out of the repo and link from the README.
- **"null quota limit" 403 error**: Org-level hardware quota is exhausted. Wait for capacity, ping a lablab admin in Discord, or push to a personal Space (Plan B).
- **App loads but "Connection refused" on Audit Demo**: The vLLM endpoint is down or the IP changed. SSH into the droplet and confirm `vllm serve` is running. Update `VLLM_BASE_URL` Secret if the IP rotated.
- **App loads but "401 Unauthorized" on every LLM call**: The `VLLM_API_KEY` Secret doesn't match the `--api-key` you passed to `vllm serve`. They have to be byte-for-byte identical.
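For the 401 case, one way to verify the byte-for-byte match without printing either secret is to compare hashes. A small stdlib sketch (the `fingerprint` helper is illustrative, not part of PaperHawk):

```python
import hashlib

def fingerprint(secret: str) -> str:
    """12-hex-char SHA-256 prefix: comparable without revealing the key."""
    return hashlib.sha256(secret.encode()).hexdigest()[:12]

# Run this on both sides -- in the Space (reading VLLM_API_KEY) and on the
# droplet (with the value passed to `vllm serve --api-key`). Equal prefixes
# mean the keys are byte-for-byte identical.
print(fingerprint("sk-paperhawk-2026"))
```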
---
## Cross-references
- [`docs/AMD_DEPLOYMENT.md`](AMD_DEPLOYMENT.md) – provisioning the AMD MI300X vLLM endpoint that this Space depends on
- [`docs/ARCHITECTURE.md`](ARCHITECTURE.md) – how the Streamlit app, the LangGraph multi-graph orchestrator, and the vLLM endpoint fit together
- [`docs/HF_SPACE_DEFAULT_GETTING_STARTED.md`](HF_SPACE_DEFAULT_GETTING_STARTED.md) – the canonical HF Spaces Quick Start that this guide builds on
- [`docs/SUBMISSION.md`](SUBMISSION.md) – full hackathon submission brief