Feature Demo: F007 — HuggingFace Deployment & Submission

Generated: 2026-03-29T07:33:23Z
Context source: spec + discovery only (implementation not read)
Feature entry: FEATURES.json #F007

What This Feature Does

F007 packages SQLEnv so that a judge can use it end to end: discover the project from the README, run or visit the deployed Hugging Face Space, and work through the training notebook.

From a user perspective, the core value is trust and usability: the deployment assets validate, build, and push cleanly, and someone outside the team can run the submission package.

What Is Already Proven

Verified in This Demo Run

Ran deployment validation locally with uv run openenv validate --verbose.
Built deployment image locally with uv run openenv build -t openenv-sql-env-f007-hf-submission.
Ran authenticated deployment push with uv run openenv push to https://huggingface.co/spaces/hjerpe/sql_env.
Ran notebook/training E2E checks (tests/e2e/test_training_e2e.py): 5 passed.
Ran full regression suite: 250 passed, 1 skipped.

Previously Verified Evidence

specs/FEATURES.json → verification_evidence for F007: 250/250 tests passed, verifier approved.
specs/F007-IMPLEMENTATION_SPEC.md (Section 1a) records authenticated build + push completion evidence.

What Still Needs User Verification

Open the live Space in a browser and manually run a reset/step/answer episode flow.
Open notebooks/train_grpo.ipynb in Colab and execute cells in order on a clean runtime.

Quickstart / Verification Steps

Run these commands to see the feature in action:

uv run openenv validate --verbose
uv run openenv build -t openenv-sql-env-f007-hf-submission
uv run openenv push

Prerequisite: a Hugging Face CLI session authenticated with an account that has write access to the target Space.
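The three quickstart commands can also be chained so that a failed validation or build stops the run before anything is pushed. The following is a minimal sketch, assuming `uv` and `openenv` are available on PATH; `run_pipeline` and its injectable `runner` parameter are hypothetical names introduced for illustration and are not part of the project.

```python
import subprocess
from typing import Callable, Optional, Sequence

# Commands copied from the quickstart, in the order they must run.
STEPS = [
    ["uv", "run", "openenv", "validate", "--verbose"],
    ["uv", "run", "openenv", "build", "-t", "openenv-sql-env-f007-hf-submission"],
    ["uv", "run", "openenv", "push"],
]


def run_pipeline(
    steps: Sequence[Sequence[str]] = STEPS,
    runner: Optional[Callable[[Sequence[str]], int]] = None,
) -> list:
    """Run each step in order and stop at the first non-zero exit code.

    Returns the commands that completed successfully, as display strings.
    `runner` is a hypothetical hook: it receives a command and returns its
    exit code, which lets the sequencing be exercised without running uv.
    """
    if runner is None:
        # Default: actually execute the command and report its exit code.
        def runner(cmd):
            return subprocess.run(list(cmd)).returncode

    completed = []
    for cmd in steps:
        if runner(cmd) != 0:
            break  # do not continue to later steps (for example, push)
        completed.append(" ".join(cmd))
    return completed
```

Calling `run_pipeline()` with no arguments executes the real commands; comparing the length of the returned list with `len(STEPS)` shows whether the push was reached.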

Live Local Proof

Validate Deployment Configuration

This confirms which deployment modes are supported and reports each mode, including the non-Docker ones, explicitly.

uv run openenv validate --verbose

[OK] sql-env-F007-huggingface-deployment-submission: Ready for multi-mode deployment

Supported deployment modes:
  [YES] docker
  [YES] openenv_serve
  [YES] uv_run
  [YES] python_module

What to notice: All four deployment modes are supported.
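The mode listing can be checked mechanically instead of by eye. The sketch below parses the validation output shown above into a mapping of mode name to supported flag. The sample text is copied from this demo run; the function name `parse_modes` is hypothetical, and because only `[YES]` appears in the captured output, treating every other marker as "not supported" is an assumption.

```python
import re

# Sample copied from the validation output shown above.
VALIDATE_OUTPUT = """\
[OK] sql-env-F007-huggingface-deployment-submission: Ready for multi-mode deployment

Supported deployment modes:
  [YES] docker
  [YES] openenv_serve
  [YES] uv_run
  [YES] python_module
"""

# A mode line is an indented "[MARKER] name" pair with nothing after the name.
MODE_LINE = re.compile(r"^\s*\[(\w+)\]\s+(\S+)\s*$")


def parse_modes(output: str) -> dict:
    """Map each listed deployment mode to True when its marker is YES.

    Assumption: any marker other than YES means the mode is unsupported;
    the exact spelling of a negative marker is not shown in this demo.
    """
    modes = {}
    for line in output.splitlines():
        match = MODE_LINE.match(line)
        if match:
            marker, name = match.groups()
            modes[name] = marker == "YES"
    return modes


def all_modes_supported(output: str) -> bool:
    """True only when at least one mode is listed and none is unsupported."""
    modes = parse_modes(output)
    return bool(modes) and all(modes.values())
```

The `[OK]` status line is skipped because it carries more than one token after the marker.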

Build the Hugging Face Deployment Image

uv run openenv build -t openenv-sql-env-f007-hf-submission

Building Docker image for: sql-env-F007-huggingface-deployment-submission
...
#18 naming to docker.io/library/openenv-sql-env-f007-hf-submission done
✓ Docker build successful

Done!

What to notice: image build completed successfully with the expected tag.

Push to Hugging Face Space

uv run openenv push

✓ Authenticated as: hjerpe
Creating/verifying space: hjerpe/sql_env
✓ Space hjerpe/sql_env is ready
Uploading files to hjerpe/sql_env...
✓ Upload completed successfully
Space URL: https://huggingface.co/spaces/hjerpe/sql_env

✓ Deployment complete!
Visit your space at: https://huggingface.co/spaces/hjerpe/sql_env

What to notice: authenticated push succeeded and produced a live Space URL.
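For scripted runs, the Space URL can be pulled out of the push log and passed to a later step. This is a minimal sketch that works only on the log format captured above; `extract_space_url` and `push_succeeded` are hypothetical helper names, and the format of the log may differ in other `openenv` versions.

```python
import re
from typing import Optional

# Sample copied from the push output shown above.
PUSH_OUTPUT = """\
✓ Authenticated as: hjerpe
Creating/verifying space: hjerpe/sql_env
✓ Space hjerpe/sql_env is ready
Uploading files to hjerpe/sql_env...
✓ Upload completed successfully
Space URL: https://huggingface.co/spaces/hjerpe/sql_env

✓ Deployment complete!
Visit your space at: https://huggingface.co/spaces/hjerpe/sql_env
"""

# Matches the "Space URL: <url>" line and captures the URL.
SPACE_URL = re.compile(
    r"^Space URL:\s*(https://huggingface\.co/spaces/\S+)\s*$", re.MULTILINE
)


def extract_space_url(output: str) -> Optional[str]:
    """Return the URL from the 'Space URL:' line, or None if it is absent."""
    match = SPACE_URL.search(output)
    return match.group(1) if match else None


def push_succeeded(output: str) -> bool:
    """Treat the push as successful only if the log reports completion and a URL."""
    return "Deployment complete!" in output and extract_space_url(output) is not None
```

A successful log line does not show that the app itself starts, so the manual checks on the live Space still apply.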

Existing Evidence

Verification spec target command (uv run --with pytest pytest tests/ -v) was re-run in this demo and passed.
F007 entry in specs/FEATURES.json already recorded verifier approval before this refresh.

Manual Verification Checklist

Open https://huggingface.co/spaces/hjerpe/sql_env.
Confirm the app loads without startup errors.
Start an episode (reset), then run at least one exploration step.
Submit an answer action and confirm terminal response/reward appears.
Open notebooks/train_grpo.ipynb in Colab and run setup + connect + one training/eval pass.

Edge Cases Exercised

All deployment modes pass validation

uv run openenv validate --verbose

Supported deployment modes:
  [YES] docker
  [YES] openenv_serve
  [YES] uv_run
  [YES] python_module

This matters because all four modes pass cleanly, leaving no warnings or caveats for the submission reviewer to interpret.

Verification-spec command drift (error case)

uv run --with pytest pytest tests/e2e/test_readme_completeness.py -v

ERROR: file or directory not found: tests/e2e/test_readme_completeness.py
collected 0 items
============================ no tests ran in 0.00s ============================

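This matters because it reveals a spec-to-repo mismatch: the verification command targets tests/e2e/test_readme_completeness.py, which does not exist in the repository. The verification artifacts should be corrected to match.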
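Drift of this kind can be caught before a demo run by scanning the spec text for test paths and checking each one against the repository. The sketch below is one way to do that under stated assumptions: the sample text reuses the two pytest commands shown in this document rather than the real contents of specs/F007-VERIFICATION_SPEC.md, and `referenced_test_paths` and `missing_test_paths` are hypothetical helper names.

```python
import re
from pathlib import Path

# Illustrative spec text built from the two commands shown in this document.
# The real verification spec is not reproduced here.
SAMPLE_SPEC = (
    "uv run --with pytest pytest tests/e2e/test_readme_completeness.py -v\n"
    "uv run --with pytest pytest tests/e2e/test_training_e2e.py -v\n"
)

# Matches repo-relative paths such as tests/e2e/test_x.py.
TEST_PATH = re.compile(r"tests/[\w./-]+\.py\b")


def referenced_test_paths(spec_text: str) -> list:
    """Return the unique test file paths mentioned in the spec, sorted."""
    return sorted(set(TEST_PATH.findall(spec_text)))


def missing_test_paths(spec_text: str, repo_root: Path) -> list:
    """Return referenced test paths that are not files under repo_root."""
    return [
        path
        for path in referenced_test_paths(spec_text)
        if not (repo_root / path).is_file()
    ]
```

Run from the repository root, `missing_test_paths(Path("specs/F007-VERIFICATION_SPEC.md").read_text(), Path("."))` would list any stale references; an empty list means every referenced test file exists.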

Notebook pipeline smoke validation still passes

uv run --with pytest pytest tests/e2e/test_training_e2e.py -v

collected 5 items
...
============================== 5 passed in 11.33s ==============================

This confirms the training notebook path still has executable smoke coverage.
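The pytest summary lines quoted in this document can be turned into counts, which makes it easy to tell a passing run from one that collected nothing. A minimal sketch follows; it handles only the outcome words listed in the pattern, and `parse_summary` and `is_green` are hypothetical helper names.

```python
import re

# Matches "<number> <outcome>" pairs in a pytest summary line.
COUNT = re.compile(r"(\d+) (passed|failed|skipped|errors?)")


def parse_summary(line: str) -> dict:
    """Extract outcome counts, for example {'passed': 250, 'skipped': 1}."""
    return {kind: int(number) for number, kind in COUNT.findall(line)}


def is_green(line: str) -> bool:
    """True when at least one test passed and none failed or errored.

    A run that collected nothing ("no tests ran") is not green, which
    covers the missing-file case from the command-drift example.
    """
    counts = parse_summary(line)
    bad = counts.get("failed", 0) + counts.get("error", 0) + counts.get("errors", 0)
    return counts.get("passed", 0) > 0 and bad == 0
```

Requiring a non-zero passed count is a deliberate choice: an empty run would otherwise look the same as a clean one.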

Test Evidence (Optional)

Supplementary test results from this demo run.

| Test Suite | Tests | Status |
|---|---|---|
| Full regression (`uv run --with pytest pytest tests/ -v`) | 251 collected | 250 passed, 1 skipped |
| Training E2E (`tests/e2e/test_training_e2e.py`) | 5 | All passed |

Feature Links

Implementation spec: specs/F007-IMPLEMENTATION_SPEC.md
Verification spec: specs/F007-VERIFICATION_SPEC.md

Demo generated by feature-demo agent. Re-run with /feature-demo F007 to refresh.