---
title: action
emoji: 🦞
colorFrom: blue
colorTo: indigo
sdk: docker
sdk_version: 29.0.4
python_version: 3.14.4
app_port: 7860
app_file: mian.py
pinned: false
---

# OpenClaw on Hugging Face Space (Docker)

Languages: English · 简体中文. Deployment guide: DEPLOY_GUIDE.md | 中文部署指南 (Chinese deployment guide).

This setup is designed to:

  • Build the OpenClaw container on top of ubuntu:24.04
  • Serve the OpenClaw dashboard directly on port 7860 (default Space access port)
  • Use third-party OpenAI-compatible base_url + api_key by default (injected via environment variables)
  • Store OpenClaw config/workspace under /root/.openclaw
  • Restore state automatically from a Hugging Face Dataset on startup
  • Run scheduled backups of OpenClaw data to a Hugging Face Dataset via cron (as root user)
  • Preinstall python3, uv, vim, neovim, chromium (via Chrome for Testing archive), gh, hf, opencode, codex, claude (Claude Code CLI), @larksuite/cli (with npx skills add larksuite/cli -y -g), and sshx in the image for interactive terminal use

## Repository Layout

  • Dockerfile: Runtime image for the Space
  • scripts/openclaw-entrypoint.sh: Main startup flow (restore, config generation, cron setup, gateway start)
  • openclaw_hf/backup.py: Backup/restore implementation
  • scripts/openclaw-backup-cron.sh: Cron entrypoint for backup jobs
  • scripts/openclaw-restore.sh: Startup restore entrypoint
  • scripts/openclaw-gateway-restart: Kill running openclaw-gateway processes
  • scripts/bootstrap-hf.sh: Interactive bootstrap for Space/Dataset creation, upload, and Space variables/secrets setup (macOS/Linux)
  • scripts/bootstrap-hf.ps1: Interactive bootstrap for Space/Dataset creation, upload, and Space variables/secrets setup (Windows PowerShell)
  • tests/test_backup.py: Unit tests for the backup module
  • tests/test_entrypoint_config.py: Unit tests for gateway config generation behavior

## Required Variables (Space Settings)

In your Hugging Face Space (Settings -> Variables and secrets), configure at least:

  • Variable: OPENCLAW_BACKUP_DATASET_REPO: Backup target Dataset in username/dataset-name format
  • Secret: HF_TOKEN: Used to write backups to the Dataset (must have write permission to that Dataset)
  • Secret: OPENCLAW_GATEWAY_TOKEN: Gateway token (recommended; if omitted, the deployment workflow generates a random 32-character value)
  • Secret: OPENCLAW_GATEWAY_PASSWORD: Gateway password (optional; if omitted, the deployment workflow generates a random 16-character value)

When using ./scripts/bootstrap-hf.sh (macOS/Linux) or ./scripts/bootstrap-hf.ps1 (Windows PowerShell), these values are configured automatically on the target Space.
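If you prefer not to use the bootstrap scripts or the web UI, the same variables and secrets can be set programmatically. A minimal sketch using huggingface_hub (the repo ID and all values below are placeholders, not real credentials):

```python
# Placeholder Space repo ID; replace with your own.
SPACE_ID = "your-username/your-space"

# Variables are visible in Space settings; secrets are write-only.
variables = {
    "OPENCLAW_BACKUP_DATASET_REPO": "your-username/your-dataset",
}
secrets = {
    "HF_TOKEN": "hf_xxx",
    "OPENCLAW_GATEWAY_TOKEN": "<32-char token>",
    "OPENCLAW_GATEWAY_PASSWORD": "<16-char password>",
}

if __name__ == "__main__":
    # Requires `pip install huggingface_hub` and a token with write access.
    from huggingface_hub import HfApi

    api = HfApi(token="hf_xxx")
    for key, value in variables.items():
        api.add_space_variable(repo_id=SPACE_ID, key=key, value=value)
    for key, value in secrets.items():
        api.add_space_secret(repo_id=SPACE_ID, key=key, value=value)
```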

## Optional LLM Variables (All-or-None)

Set all of these together only when you want OpenClaw to preconfigure a custom third-party model:

  • Variable: OPENCLAW_LLM_BASE_URL: Third-party base URL (for example OpenAI-compatible /v1)
  • Variable: OPENCLAW_LLM_MODEL: Third-party model ID
  • Secret: OPENCLAW_LLM_API_KEY: Third-party API key

If any of the three is missing, the entrypoint skips custom model generation. In that case, you can still configure a model from inside the container (for example via sshx).
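The all-or-none rule amounts to a simple presence check on the three variables. A hedged sketch of that check (the function name is illustrative, not taken from the entrypoint source):

```shell
# True only when all three LLM variables are non-empty.
llm_config_complete() {
  [ -n "${OPENCLAW_LLM_BASE_URL:-}" ] &&
    [ -n "${OPENCLAW_LLM_MODEL:-}" ] &&
    [ -n "${OPENCLAW_LLM_API_KEY:-}" ]
}

if llm_config_complete; then
  echo "custom model config: generate"
else
  echo "custom model config: skip"
fi
```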

## Common Optional Variables

  • OPENCLAW_LLM_MODEL (unset by default; used only when custom model preconfiguration is enabled)
  • OPENCLAW_LLM_PROVIDER (default: thirdparty)
  • OPENCLAW_LLM_API (default: openai-completions)
  • OPENCLAW_VERSION (used by the Docker install step; the bootstrap script prompts for it, defaulting to the latest version detected from the npm registry, with latest as a fallback)
  • OPENCLAW_STATE_DIR (default: /root/.openclaw)
  • OPENCLAW_USER (default: root, runtime user for gateway and cron jobs)
  • OPENCLAW_GROUP (default: root, runtime group for gateway and cron jobs)
  • OPENCLAW_CONFIG_PATH (default: /root/.openclaw/openclaw.json)
  • OPENCLAW_WORKSPACE_DIR (default: /root/.openclaw/workspace)
  • OPENCLAW_BACKUP_CRON (default: */30 * * * *, backup every 30 minutes)
  • OPENCLAW_BACKUP_SOURCE_DIR (default: /root/.openclaw, backup/restore base directory for openclaw-state)
  • OPENCLAW_BACKUP_ROOT_CONFIG_DIR (default: /root/.config, additional backup/restore directory for root-config)
  • OPENCLAW_BACKUP_ROOT_CODEX_DIR (default: /root/.codex, additional backup/restore directory for root-codex)
  • OPENCLAW_BACKUP_ROOT_CLAUDE_DIR (default: /root/.claude, additional backup/restore directory for root-claude)
  • OPENCLAW_BACKUP_ROOT_AGENTS_DIR (default: /root/.agents, additional backup/restore directory for root-agents)
  • OPENCLAW_BACKUP_ROOT_SSH_DIR (default: /root/.ssh, additional backup/restore directory for root-ssh)
  • OPENCLAW_BACKUP_ROOT_ENV_DIR (default: /root/.env.d, additional backup/restore directory for root-env)
  • OPENCLAW_BACKUP_ROOT_NPM_DIR (default: /root/.npm, additional backup/restore directory for root-npm)
  • OPENCLAW_BACKUP_ROOT_LARK_CLI_DIR (default: /root/.lark-cli, additional backup/restore directory for root-lark-cli)
  • OPENCLAW_BACKUP_PATH_PREFIX (default: backups)
  • OPENCLAW_BACKUP_KEEP_COUNT (default: 24, keep newest N timestamped backup archives; older ones are auto-deleted)
  • OPENCLAW_SSHX_AUTO_START (default: false; set true to auto-run sshx in background on startup)
  • OPENCLAW_GATEWAY_AUTH_MODE (default: token, optional: password)
  • OPENCLAW_GATEWAY_CONTROLUI_ALLOW_INSECURE_AUTH (default: false)
  • OPENCLAW_GATEWAY_CONTROLUI_DANGEROUSLY_DISABLE_DEVICE_AUTH (default: false; set true to bypass pairing, strongly discouraged on public networks)
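For a quick local try-out of these variables outside of Spaces, the image can be run directly. A hedged sketch (the image tag and all values are placeholders; only the required variables from this README are shown, and the build uses this repo's Dockerfile):

```shell
docker build -t openclaw-hf .
docker run -p 7860:7860 \
  -e OPENCLAW_BACKUP_DATASET_REPO=your-username/your-dataset \
  -e HF_TOKEN=hf_xxx \
  -e OPENCLAW_GATEWAY_TOKEN=<32-char-token> \
  openclaw-hf
```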

## Quick Deployment

Run the interactive bootstrap script from the repo root:

```shell
# macOS/Linux
./scripts/bootstrap-hf.sh

# Windows PowerShell
powershell -ExecutionPolicy Bypass -File .\scripts\bootstrap-hf.ps1
```

bootstrap-hf.sh / bootstrap-hf.ps1 will:

  • Check/install hf CLI:
    • macOS/Linux: curl -LsSf https://hf.co/cli/install.sh | bash
    • Windows PowerShell: powershell -ExecutionPolicy Bypass -c "irm https://hf.co/cli/install.ps1 | iex"
  • Resolve HF auth first (before all other variables):
    • if hf auth whoami is not logged in: prompt HF_TOKEN and run hf auth login --token <HF_TOKEN>
    • if already logged in: ask whether to use current user
      • choose yes: continue
      • choose no: backup current token, prompt new HF_TOKEN, run hf auth login --token <HF_TOKEN>, and restore the previous token at the end
  • Ask for space_name, dataset_name, OPENCLAW_VERSION, gateway token/password, and optional LLM settings
  • Default OPENCLAW_VERSION to latest detected from npm registry (openclaw), fallback latest when detection fails
  • Auto-generate OPENCLAW_GATEWAY_TOKEN (32 chars) and OPENCLAW_GATEWAY_PASSWORD (16 chars) if left empty
  • Create private Space + Dataset and upload this repository
  • Configure Space Variables and secrets automatically, including:
    • OPENCLAW_BACKUP_DATASET_REPO
    • OPENCLAW_VERSION
    • HF_TOKEN
    • OPENCLAW_GATEWAY_TOKEN
    • OPENCLAW_GATEWAY_PASSWORD
    • OPENCLAW_GATEWAY_CONTROLUI_ALLOW_INSECURE_AUTH=false
    • OPENCLAW_GATEWAY_CONTROLUI_DANGEROUSLY_DISABLE_DEVICE_AUTH=false
  • Optionally configure LLM triplet and set OPENCLAW_SSHX_AUTO_START from prompt choice (true/false)
  • Print planned deployment settings and require a final confirmation before creating/updating Space/Dataset resources
  • Print Hugging Face Space page URL, app URL, and /healthz

If gateway token/password were auto-generated, the script prints them at the end.
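The auto-generated values are simply random strings of the stated lengths. A minimal sketch of equivalent generation in Python (the hex alphabet is an assumption; this README does not specify which character set the bootstrap scripts use):

```python
import secrets


def generate_gateway_token() -> str:
    # 16 random bytes -> 32 hex characters, matching the 32-char token length.
    return secrets.token_hex(16)


def generate_gateway_password() -> str:
    # 8 random bytes -> 16 hex characters, matching the 16-char password length.
    return secrets.token_hex(8)
```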

## Agent Hand-off Prompt

Copy and send to your agent:

```text
Please deploy OpenClaw to Hugging Face by strictly following the deployment skill in https://github.com/tenfyzhong/openclaw-hf/blob/main/SKILL.md
```

## Hugging Face Keep-Alive

How to keep a Space available depends on hardware tier:

  • Free cpu-basic: the Space sleeps after inactivity (currently around 48h). It cannot be configured to run forever on free hardware.
  • Paid hardware: the Space runs continuously by default. In Settings -> Hardware, set Sleep time to Never (or use API with sleep_time=-1) for true 24/7 availability.
  • Cost-saving mode on paid hardware: set a custom Sleep time (for example 3600 seconds) so it auto-sleeps and auto-wakes on the next visit.

Space URL composition:

  • Space repo ID format: <owner>/<space_name> (example: tenfyzhong/openclaw-hf)
  • Public runtime host format: https://<owner>-<space_name>.hf.space
  • OpenClaw health check URL: https://<owner>-<space_name>.hf.space/healthz
  • Inside the Space runtime, Hugging Face also provides SPACE_HOST, so health URL can be built as https://${SPACE_HOST}/healthz.

Example:

```shell
OPENCLAW_HF_SPACE_ID="tenfyzhong/openclaw-hf"
SPACE_HOST="${OPENCLAW_HF_SPACE_ID/\//-}.hf.space"
HEALTH_URL="https://${SPACE_HOST}/healthz"
echo "$HEALTH_URL"
```

Keep-alive by periodic health checks:

```shell
*/12 * * * * HF_TOKEN=hf_xxx /path/to/repo/scripts/check-space-health.sh tenfyzhong/openclaw-hf >/dev/null || true
```

Notes:

  • For private Spaces, unauthenticated calls to https://<owner>-<space_name>.hf.space/healthz return a Hub 404 page. This is expected access control behavior.
  • For private Spaces, include Authorization: Bearer <HF_TOKEN> (the helper script above does this automatically via HF_TOKEN or HUGGINGFACE_HUB_TOKEN).
  • This ping strategy is a practical workaround for reducing idle sleep on free hardware, but it is not a guaranteed always-on method.
  • If you need strict 24/7 uptime, use paid hardware and set sleep time to Never.

Programmatic options (owner token required):

```python
from huggingface_hub import HfApi

api = HfApi(token="hf_xxx")
repo_id = "your-username/your-space"

# Keep running (paid hardware)
api.set_space_sleep_time(repo_id=repo_id, sleep_time=-1)

# Or sleep after 1 hour of inactivity
api.set_space_sleep_time(repo_id=repo_id, sleep_time=3600)

# Manual control
api.pause_space(repo_id=repo_id)
api.restart_space(repo_id=repo_id)
```

For this project, if you need stable dashboard access without cold starts, use paid hardware and set sleep time to Never.

## Backup/Restore Flow

  • Startup restore: on container startup, the entrypoint fetches latest-backup.json from the Dataset to locate the latest backup and restores:
    • openclaw-state -> OPENCLAW_BACKUP_SOURCE_DIR (default /root/.openclaw)
    • root-config -> OPENCLAW_BACKUP_ROOT_CONFIG_DIR (default /root/.config, restored only when present in archive)
    • root-codex -> OPENCLAW_BACKUP_ROOT_CODEX_DIR (default /root/.codex, restored only when present in archive)
    • root-claude -> OPENCLAW_BACKUP_ROOT_CLAUDE_DIR (default /root/.claude, restored only when present in archive)
    • root-agents -> OPENCLAW_BACKUP_ROOT_AGENTS_DIR (default /root/.agents, restored only when present in archive)
    • root-ssh -> OPENCLAW_BACKUP_ROOT_SSH_DIR (default /root/.ssh, restored only when present in archive)
    • root-env -> OPENCLAW_BACKUP_ROOT_ENV_DIR (default /root/.env.d, restored only when present in archive)
    • root-npm -> OPENCLAW_BACKUP_ROOT_NPM_DIR (default /root/.npm, restored only when present in archive)
    • root-lark-cli -> OPENCLAW_BACKUP_ROOT_LARK_CLI_DIR (default /root/.lark-cli, restored only when present in archive)
  • Scheduled backup: cron runs based on OPENCLAW_BACKUP_CRON
  • Shutdown backup: when the container receives a stop signal, one final backup is uploaded before exit
  • Each backup upload includes:
    • backups/openclaw-backup-<timestamp>.tar.gz
      • archive root openclaw-state/
      • archive root root-config/ (included when OPENCLAW_BACKUP_ROOT_CONFIG_DIR exists)
      • archive root root-codex/ (included when OPENCLAW_BACKUP_ROOT_CODEX_DIR exists)
      • archive root root-claude/ (included when OPENCLAW_BACKUP_ROOT_CLAUDE_DIR exists)
      • archive root root-agents/ (included when OPENCLAW_BACKUP_ROOT_AGENTS_DIR exists)
      • archive root root-ssh/ (included when OPENCLAW_BACKUP_ROOT_SSH_DIR exists)
      • archive root root-env/ (included when OPENCLAW_BACKUP_ROOT_ENV_DIR exists)
      • archive root root-npm/ (included when OPENCLAW_BACKUP_ROOT_NPM_DIR exists)
      • archive root root-lark-cli/ (included when OPENCLAW_BACKUP_ROOT_LARK_CLI_DIR exists)
    • latest-backup.json
  • Retention: after upload, only the newest OPENCLAW_BACKUP_KEEP_COUNT timestamped archives are kept (default 24); older timestamped archives are deleted automatically
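Because the archive names embed a timestamp, the retention rule reduces to sorting names and dropping everything past the newest N. A hedged sketch of that selection (the function name is illustrative; backup.py may implement this differently):

```python
def archives_to_delete(archives: list[str], keep_count: int = 24) -> list[str]:
    """Return the timestamped archives that fall outside the retention window.

    Assumes names like backups/openclaw-backup-<timestamp>.tar.gz, where
    zero-padded timestamps sort lexicographically in chronological order.
    """
    # Reverse sort puts the newest archives first; keep the first N.
    ordered = sorted(archives, reverse=True)
    return ordered[keep_count:]
```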

## Use sshx Inside the Container

sshx is preinstalled in the image.

  1. Auto-start sshx in the background via an environment variable:

     ```shell
     OPENCLAW_SSHX_AUTO_START=true
     ```

     When enabled, the entrypoint starts sshx in the background and sends its output directly to the container's stdout/stderr logs (no file logging).

  2. Manual start inside the container:

     ```shell
     sshx
     ```

  3. Let OpenClaw start the process itself (run in an OpenClaw terminal/tool):

     ```shell
     nohup sshx >/proc/1/fd/1 2>/proc/1/fd/2 &
     ```

  4. After use, stop the sshx process promptly:

     ```shell
     pgrep -fa sshx
     pkill -TERM -f '(^|/)sshx($| )'
     ```

## Local Test

```shell
python3 -m unittest discover -s tests -p 'test_*.py'
```

Pull requests targeting main run GitHub Actions CI automatically (.github/workflows/pr-ci.yml):

  • Unit tests: python3 -m unittest discover -s tests -p 'test_*.py'
  • Docker image build: docker build (via Buildx) with OPENCLAW_VERSION=latest

## License

MIT. See LICENSE.