# Feature Demo: F007 – HuggingFace Deployment & Submission
Generated: 2026-03-29T07:33:23Z
Context source: spec + discovery only (implementation not read)
Feature entry: `FEATURES.json` #F007
## What This Feature Does
F007 packages SQLEnv so a judge can actually consume it end-to-end: discover the project from README, run or visit the deployed Hugging Face Space, and use the training notebook workflow.
From a user perspective, the core value is trust and usability: deployment assets validate/build/push cleanly, and the submission package is runnable by someone outside the team.
## What Is Already Proven

### Verified in This Demo Run
- Ran deployment validation locally with `uv run openenv validate --verbose`.
- Built the deployment image locally with `uv run openenv build -t openenv-sql-env-f007-hf-submission`.
- Ran an authenticated deployment push with `uv run openenv push` to https://huggingface.co/spaces/hjerpe/sql_env.
- Ran notebook/training E2E checks (`tests/e2e/test_training_e2e.py`): 5 passed.
- Ran the full regression suite: 250 passed, 1 skipped.
### Previously Verified Evidence

- `specs/FEATURES.json` `verification_evidence` for F007: 250/250 tests passed, verifier approved.
- `specs/F007-IMPLEMENTATION_SPEC.md` (Section 1a) records authenticated build + push completion evidence.
## What Still Needs User Verification
- Open the live Space in a browser and manually run a reset/step/answer episode flow.
- Open `notebooks/train_grpo.ipynb` in Colab and execute cells in order on a clean runtime.
## Quickstart / Verification Steps
Run these commands to see the feature in action:
```shell
uv run openenv validate --verbose
uv run openenv build -t openenv-sql-env-f007-hf-submission
uv run openenv push
```
**Prereq:** an authenticated Hugging Face CLI/account with write access to the target Space.
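Before running the quickstart, it can help to confirm the required CLI tools are actually on `PATH`. A minimal sketch; the tool list here is an assumption for this workflow, not an authoritative requirement from the `openenv` docs:

```python
import shutil

def missing_prereqs(tools, which=shutil.which):
    """Return the subset of `tools` not found on PATH.

    The `which` lookup is injectable so the check can be exercised
    without depending on the real environment.
    """
    return [t for t in tools if which(t) is None]

# Hypothetical prerequisite list for this deployment workflow.
REQUIRED = ["docker", "uv", "huggingface-cli"]

if __name__ == "__main__":
    missing = missing_prereqs(REQUIRED)
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```

Because the lookup is injectable, the same function doubles as a CI guard that fails fast before any build or push is attempted.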
## Live Local Proof

### Validate Deployment Configuration
This confirms deployment mode support and flags non-Docker modes clearly.
```
$ uv run openenv validate --verbose
[OK] sql-env-F007-huggingface-deployment-submission: Ready for multi-mode deployment
Supported deployment modes:
  [YES] docker
  [YES] openenv_serve
  [YES] uv_run
  [YES] python_module
```
**What to notice:** all four deployment modes are supported.
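If this check runs in CI, the mode list can be asserted on rather than eyeballed. A sketch that parses the validator output shown above (the `[YES]`/`[NO]` format is assumed from this run, not a documented contract):

```python
import re

def supported_modes(validate_output: str) -> dict:
    """Map each listed deployment mode to True ([YES]) or False ([NO])."""
    modes = {}
    for line in validate_output.splitlines():
        m = re.match(r"\s*\[(YES|NO)\]\s+(\S+)", line)
        if m:
            modes[m.group(2)] = m.group(1) == "YES"
    return modes

output = """\
Supported deployment modes:
  [YES] docker
  [YES] openenv_serve
  [YES] uv_run
  [YES] python_module
"""
# Fail fast if any mode ever regresses to [NO].
assert all(supported_modes(output).values())
```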
### Build the Hugging Face Deployment Image
```
$ uv run openenv build -t openenv-sql-env-f007-hf-submission
Building Docker image for: sql-env-F007-huggingface-deployment-submission
...
#18 naming to docker.io/library/openenv-sql-env-f007-hf-submission done
✓ Docker build successful
Done!
```
**What to notice:** the image build completed successfully with the expected tag.
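A script that generates tags per feature (as `-f007-` suggests) can sanity-check them before invoking the build. The regex below is a deliberately simplified subset of Docker's repository-name grammar, not the authoritative rule:

```python
import re

# Simplified check: lowercase alphanumeric segments joined by single
# '.', '_', '-' or '/' separators. Docker's full grammar is looser in
# places; this is a conservative pre-flight guard.
REPO_RE = re.compile(r"^[a-z0-9]+(?:[._/-][a-z0-9]+)*$")

def is_plausible_image_name(name: str) -> bool:
    return bool(REPO_RE.match(name))

assert is_plausible_image_name("openenv-sql-env-f007-hf-submission")
assert not is_plausible_image_name("OpenEnv-SQL")  # uppercase is rejected
```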
### Push to Hugging Face Space
```
$ uv run openenv push
✓ Authenticated as: hjerpe
Creating/verifying space: hjerpe/sql_env
✓ Space hjerpe/sql_env is ready
Uploading files to hjerpe/sql_env...
✓ Upload completed successfully
Space URL: https://huggingface.co/spaces/hjerpe/sql_env
✓ Deployment complete!
Visit your space at: https://huggingface.co/spaces/hjerpe/sql_env
```
**What to notice:** the authenticated push succeeded and produced a live Space URL.
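To wire this step into automation (e.g. posting the link in a release note), the Space URL can be extracted from the push log. A sketch assuming the log format shown above:

```python
import re

def space_url(push_log: str):
    """Return the first Hugging Face Space URL found in the push output,
    or None if the log contains no Space link."""
    m = re.search(r"https://huggingface\.co/spaces/\S+", push_log)
    return m.group(0) if m else None

log = (
    "✓ Upload completed successfully\n"
    "Space URL: https://huggingface.co/spaces/hjerpe/sql_env\n"
)
assert space_url(log) == "https://huggingface.co/spaces/hjerpe/sql_env"
```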
## Existing Evidence
- The verification-spec target command (`uv run --with pytest pytest tests/ -v`) was re-run in this demo and passed.
- The F007 entry in `specs/FEATURES.json` already recorded verifier approval before this refresh.
## Manual Verification Checklist
- Open https://huggingface.co/spaces/hjerpe/sql_env.
- Confirm the app loads without startup errors.
- Start an episode (reset), then run at least one exploration step.
- Submit an answer action and confirm terminal response/reward appears.
- Open `notebooks/train_grpo.ipynb` in Colab and run setup + connect + one training/eval pass.
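For reviewers who prefer scripting the episode flow over clicking through the UI, the reset → step → answer sequence can be described as data before sending it. This is a sketch only: the base URL, endpoint paths, and payload shapes below are assumptions, not the environment's documented API.

```python
def episode_plan(base_url: str):
    """Describe a minimal reset -> step -> answer episode as
    (method, path, payload) tuples. Paths and payloads are
    hypothetical placeholders for the real SQLEnv API."""
    return [
        ("POST", f"{base_url}/reset", {}),
        ("POST", f"{base_url}/step", {"action": {"kind": "sql", "query": "SELECT 1"}}),
        ("POST", f"{base_url}/step", {"action": {"kind": "answer", "value": "42"}}),
    ]

# Hypothetical Space endpoint; substitute the real one.
plan = episode_plan("https://example.hf.space")
assert [p.rsplit("/", 1)[-1] for _, p, _ in plan] == ["reset", "step", "step"]
```

Separating the plan from the transport keeps the flow testable offline; an HTTP client can then replay the tuples against the live Space.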
## Edge Cases Exercised

### All deployment modes pass validation
```
$ uv run openenv validate --verbose
Supported deployment modes:
  [YES] docker
  [YES] openenv_serve
  [YES] uv_run
  [YES] python_module
```
This matters because all four modes pass cleanly, with no warnings or caveats for the submission reviewer.
### Verification-spec command drift (error case)
```
$ uv run --with pytest pytest tests/e2e/test_readme_completeness.py -v
ERROR: file or directory not found: tests/e2e/test_readme_completeness.py
collected 0 items
============================ no tests ran in 0.00s ============================
```
This matters because it reveals a spec-to-repo mismatch that should be corrected in verification artifacts.
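Drift like this can be caught automatically by asserting that every test path a verification spec references still exists in the repo. A minimal sketch:

```python
from pathlib import Path

def missing_paths(paths, root="."):
    """Return the spec-referenced paths that do not exist under `root`."""
    base = Path(root)
    return [p for p in paths if not (base / p).exists()]

# Example: the paths a verification spec names; stale entries are flagged.
referenced = [
    "tests/e2e/test_training_e2e.py",
    "tests/e2e/test_readme_completeness.py",
]
# In this repo, the second entry would be reported as missing.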
### Notebook pipeline smoke validation still passes
```
$ uv run --with pytest pytest tests/e2e/test_training_e2e.py -v
collected 5 items
...
============================== 5 passed in 11.33s ==============================
```
This confirms the training notebook path still has executable smoke coverage.
## Test Evidence (Optional)
Supplementary proof that the feature works correctly across all scenarios.
| Test Suite | Tests | Status |
|---|---|---|
| Full regression (`uv run --with pytest pytest tests/ -v`) | 251 collected | 250 passed, 1 skipped |
| Training E2E (`tests/e2e/test_training_e2e.py`) | 5 | All passed |
## Feature Links
- Implementation spec: `specs/F007-IMPLEMENTATION_SPEC.md`
- Verification spec: `specs/F007-VERIFICATION_SPEC.md`
*Demo generated by the feature-demo agent. Re-run with `/feature-demo F007` to refresh.*