File size: 13,799 Bytes
32a012c
08c964e
 
 
 
32a012c
08c964e
 
 
 
32a012c
 
 
08c964e
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
---
title: test
emoji: 🦞
colorFrom: blue
colorTo: indigo
sdk: docker
sdk_version: 29.0.4
python_version: 3.14.4
app_port: 7860
app_file: mian.py
pinned: false
---

# OpenClaw on Hugging Face Space (Docker)

> **Languages:** [English](./README.md) · [简体中文](./README_zh.md)
> **Deployment Guide:** [DEPLOY_GUIDE.md](./DEPLOY_GUIDE.md) | [中文部署指南](./DEPLOY_GUIDE_zh.md)

This setup is designed to provide the following:

- Build the OpenClaw container on top of `ubuntu:24.04`
- Serve the OpenClaw dashboard directly on port `7860` (default Space access port)
- Use third-party OpenAI-compatible `base_url + api_key` by default (injected via environment variables)
- Store OpenClaw config/workspace under `/root/.openclaw`
- Restore state automatically from a Hugging Face Dataset on startup
- Run scheduled backups of OpenClaw data to a Hugging Face Dataset via `cron` (as `root` user)
- Preinstall `python3`, `uv`, `vim`, `neovim`, `chromium` (via Chrome for Testing archive), `gh`, `hf`, `opencode`, `codex`, `claude` (Claude Code CLI), `@larksuite/cli` (with `npx skills add larksuite/cli -y -g`), and `sshx` in the image for interactive terminal use

## Repository Layout

- `Dockerfile`: Runtime image for the Space
- `scripts/openclaw-entrypoint.sh`: Main startup flow (restore, config generation, cron setup, gateway start)
- `openclaw_hf/backup.py`: Backup/restore implementation
- `scripts/openclaw-backup-cron.sh`: Cron entrypoint for backup jobs
- `scripts/openclaw-restore.sh`: Startup restore entrypoint
- `scripts/openclaw-gateway-restart`: Kill running `openclaw-gateway` processes
- `scripts/bootstrap-hf.sh`: Interactive bootstrap for Space/Dataset creation, upload, and Space variables/secrets setup (macOS/Linux)
- `scripts/bootstrap-hf.ps1`: Interactive bootstrap for Space/Dataset creation, upload, and Space variables/secrets setup (Windows PowerShell)
- `tests/test_backup.py`: Unit tests for the backup module
- `tests/test_entrypoint_config.py`: Unit tests for gateway config generation behavior

## Required Variables (Space Settings)

In your Hugging Face Space (`Settings -> Variables and secrets`), configure at least:

- Variable: `OPENCLAW_BACKUP_DATASET_REPO`: Backup target Dataset in `username/dataset-name` format
- Secret: `HF_TOKEN`: Used to write backups to the Dataset (must have write permission to that Dataset)
- Secret: `OPENCLAW_GATEWAY_TOKEN`: Gateway token (recommended; if omitted in deployment workflow, generate a random 32-character value)
- Secret: `OPENCLAW_GATEWAY_PASSWORD`: Gateway password (optional; if omitted in deployment workflow, generate a random 16-character value)

When using `./scripts/bootstrap-hf.sh` (macOS/Linux) or `./scripts/bootstrap-hf.ps1` (Windows PowerShell), these values are configured automatically on the target Space.

## Optional LLM Variables (All-Or-None)

Set all of these together only when you want OpenClaw to preconfigure a custom third-party model:

- Variable: `OPENCLAW_LLM_BASE_URL`: Third-party base URL (for example OpenAI-compatible `/v1`)
- Variable: `OPENCLAW_LLM_MODEL`: Third-party model ID
- Secret: `OPENCLAW_LLM_API_KEY`: Third-party API key

If any of the three is missing, entrypoint skips custom model generation.
In that case, you can still configure from inside the container (for example via `sshx`).

## Common Optional Variables

- `OPENCLAW_LLM_MODEL` (unset by default; used only when custom model preconfiguration is enabled)
- `OPENCLAW_LLM_PROVIDER` (default: `thirdparty`)
- `OPENCLAW_LLM_API` (default: `openai-completions`)
- `OPENCLAW_VERSION` (used by Docker install step; bootstrap prompts for it and defaults to the latest version detected from npm registry, fallback `latest`)
- `OPENCLAW_STATE_DIR` (default: `/root/.openclaw`)
- `OPENCLAW_USER` (default: `root`, runtime user for gateway and cron jobs)
- `OPENCLAW_GROUP` (default: `root`, runtime group for gateway and cron jobs)
- `OPENCLAW_CONFIG_PATH` (default: `/root/.openclaw/openclaw.json`)
- `OPENCLAW_WORKSPACE_DIR` (default: `/root/.openclaw/workspace`)
- `OPENCLAW_BACKUP_CRON` (default: `*/30 * * * *`, backup every 30 minutes)
- `OPENCLAW_BACKUP_SOURCE_DIR` (default: `/root/.openclaw`, backup/restore base directory for `openclaw-state`)
- `OPENCLAW_BACKUP_ROOT_CONFIG_DIR` (default: `/root/.config`, additional backup/restore directory for `root-config`)
- `OPENCLAW_BACKUP_ROOT_CODEX_DIR` (default: `/root/.codex`, additional backup/restore directory for `root-codex`)
- `OPENCLAW_BACKUP_ROOT_CLAUDE_DIR` (default: `/root/.claude`, additional backup/restore directory for `root-claude`)
- `OPENCLAW_BACKUP_ROOT_AGENTS_DIR` (default: `/root/.agents`, additional backup/restore directory for `root-agents`)
- `OPENCLAW_BACKUP_ROOT_SSH_DIR` (default: `/root/.ssh`, additional backup/restore directory for `root-ssh`)
- `OPENCLAW_BACKUP_ROOT_ENV_DIR` (default: `/root/.env.d`, additional backup/restore directory for `root-env`)
- `OPENCLAW_BACKUP_ROOT_NPM_DIR` (default: `/root/.npm`, additional backup/restore directory for `root-npm`)
- `OPENCLAW_BACKUP_ROOT_LARK_CLI_DIR` (default: `/root/.lark-cli`, additional backup/restore directory for `root-lark-cli`)
- `OPENCLAW_BACKUP_PATH_PREFIX` (default: `backups`)
- `OPENCLAW_BACKUP_KEEP_COUNT` (default: `48`, keep newest N timestamped backup archives; older ones are auto-deleted)
- `OPENCLAW_SSHX_AUTO_START` (default: `false`; set `true` to auto-run `sshx` in background on startup)
- `OPENCLAW_GATEWAY_AUTH_MODE` (default: `token`, optional: `password`)
- `OPENCLAW_GATEWAY_CONTROLUI_ALLOW_INSECURE_AUTH` (default: `false`)
- `OPENCLAW_GATEWAY_CONTROLUI_DANGEROUSLY_DISABLE_DEVICE_AUTH` (default: `false`; set `true` to bypass pairing, strongly discouraged on public networks)

## Quick Deployment

Run the interactive bootstrap script from repo root:

```bash
./scripts/bootstrap-hf.sh
```

```powershell
powershell -ExecutionPolicy ByPass -File .\scripts\bootstrap-hf.ps1
```

`bootstrap-hf.sh` / `bootstrap-hf.ps1` will:

- Check/install `hf` CLI:
  - macOS/Linux: `curl -LsSf https://hf.co/cli/install.sh | bash`
  - Windows PowerShell: `powershell -ExecutionPolicy ByPass -c "irm https://hf.co/cli/install.ps1 | iex"`
- Resolve HF auth first (before all other variables):
  - if `hf auth whoami` is not logged in: prompt `HF_TOKEN` and run `hf auth login --token <HF_TOKEN>`
  - if already logged in: ask whether to use current user
    - choose `yes`: continue
    - choose `no`: backup current token, prompt new `HF_TOKEN`, run `hf auth login --token <HF_TOKEN>`, and restore the previous token at the end
- Ask for `space_name`, `dataset_name`, `OPENCLAW_VERSION`, gateway token/password, and optional LLM settings
- Default `OPENCLAW_VERSION` to latest detected from npm registry (`openclaw`), fallback `latest` when detection fails
- Auto-generate `OPENCLAW_GATEWAY_TOKEN` (32 chars) and `OPENCLAW_GATEWAY_PASSWORD` (16 chars) if left empty
- Create private Space + Dataset and upload this repository
- Configure Space `Variables and secrets` automatically, including:
  - `OPENCLAW_BACKUP_DATASET_REPO`
  - `OPENCLAW_VERSION`
  - `HF_TOKEN`
  - `OPENCLAW_GATEWAY_TOKEN`
  - `OPENCLAW_GATEWAY_PASSWORD`
  - `OPENCLAW_GATEWAY_CONTROLUI_ALLOW_INSECURE_AUTH=false`
  - `OPENCLAW_GATEWAY_CONTROLUI_DANGEROUSLY_DISABLE_DEVICE_AUTH=false`
- Optionally configure LLM triplet and set `OPENCLAW_SSHX_AUTO_START` from prompt choice (`true`/`false`)
- Print planned deployment settings and require a final confirmation before creating/updating Space/Dataset resources
- Print Hugging Face Space page URL, app URL, and `/healthz`

If gateway token/password were auto-generated, the script prints them at the end.

## Agent Hand-off Prompt

Copy and send to your agent:

```
Please deploy OpenClaw to Hugging Face by strictly following the deployment skill in https://github.com/tenfyzhong/openclaw-hf/blob/main/SKILL.md
```

## Hugging Face Keep-Alive

How to keep a Space available depends on hardware tier:

- Free `cpu-basic`: the Space sleeps after inactivity (currently around 48h). It cannot be configured to run forever on free hardware.
- Paid hardware: the Space runs continuously by default. In `Settings -> Hardware`, set `Sleep time` to `Never` (or use API with `sleep_time=-1`) for true 24/7 availability.
- Cost-saving mode on paid hardware: set a custom `Sleep time` (for example `3600` seconds) so it auto-sleeps and auto-wakes on the next visit.

Space URL composition:

- Space repo ID format: `<owner>/<space_name>` (example: `tenfyzhong/openclaw-hf`)
- Public runtime host format: `https://<owner>-<space_name>.hf.space`
- OpenClaw health check URL: `https://<owner>-<space_name>.hf.space/healthz`
- Inside the Space runtime, Hugging Face also provides `SPACE_HOST`, so health URL can be built as `https://${SPACE_HOST}/healthz`.

Example:

```bash
OPENCLAW_HF_SPACE_ID="tenfyzhong/openclaw-hf"
SPACE_HOST="${OPENCLAW_HF_SPACE_ID/\//-}.hf.space"
HEALTH_URL="https://${SPACE_HOST}/healthz"
echo "$HEALTH_URL"
```

Keep-alive by periodic health checks:

```bash
*/12 * * * * HF_TOKEN=hf_xxx /path/to/repo/scripts/check-space-health.sh tenfyzhong/openclaw-hf >/dev/null || true
```

Notes:

- For private Spaces, unauthenticated calls to `https://<owner>-<space_name>.hf.space/healthz` return a Hub 404 page. This is expected access control behavior.
- For private Spaces, include `Authorization: Bearer <HF_TOKEN>` (the helper script above does this automatically via `HF_TOKEN` or `HUGGINGFACE_HUB_TOKEN`).
- This ping strategy is a practical workaround for reducing idle sleep on free hardware, but it is not a guaranteed always-on method.
- If you need strict 24/7 uptime, use paid hardware and set sleep time to `Never`.

References:

- <https://huggingface.co/docs/hub/spaces-gpus#sleep-time>
- <https://huggingface.co/docs/huggingface_hub/package_reference/space_runtime>
- <https://huggingface.co/docs/hub/spaces-overview>

Programmatic options (owner token required):

```python
from huggingface_hub import HfApi

api = HfApi(token="hf_xxx")
repo_id = "your-username/your-space"

# Keep running (paid hardware)
api.set_space_sleep_time(repo_id=repo_id, sleep_time=-1)

# Or sleep after 1 hour of inactivity
api.set_space_sleep_time(repo_id=repo_id, sleep_time=3600)

# Manual control
api.pause_space(repo_id=repo_id)
api.restart_space(repo_id=repo_id)
```

For this project, if you need stable dashboard access without cold starts, use paid hardware and set sleep time to `Never`.

## Backup/Restore Flow

- Startup restore: on container startup, it fetches `latest-backup.json` from the Dataset to locate the latest backup and restore:
  - `openclaw-state` -> `OPENCLAW_BACKUP_SOURCE_DIR` (default `/root/.openclaw`)
  - `root-config` -> `OPENCLAW_BACKUP_ROOT_CONFIG_DIR` (default `/root/.config`, restored only when present in archive)
  - `root-codex` -> `OPENCLAW_BACKUP_ROOT_CODEX_DIR` (default `/root/.codex`, restored only when present in archive)
  - `root-claude` -> `OPENCLAW_BACKUP_ROOT_CLAUDE_DIR` (default `/root/.claude`, restored only when present in archive)
  - `root-agents` -> `OPENCLAW_BACKUP_ROOT_AGENTS_DIR` (default `/root/.agents`, restored only when present in archive)
  - `root-ssh` -> `OPENCLAW_BACKUP_ROOT_SSH_DIR` (default `/root/.ssh`, restored only when present in archive)
  - `root-env` -> `OPENCLAW_BACKUP_ROOT_ENV_DIR` (default `/root/.env.d`, restored only when present in archive)
  - `root-npm` -> `OPENCLAW_BACKUP_ROOT_NPM_DIR` (default `/root/.npm`, restored only when present in archive)
  - `root-lark-cli` -> `OPENCLAW_BACKUP_ROOT_LARK_CLI_DIR` (default `/root/.lark-cli`, restored only when present in archive)
- Scheduled backup: cron runs based on `OPENCLAW_BACKUP_CRON`
- Shutdown backup: when the container receives a stop signal, one final backup is uploaded before exit
- Each backup upload includes:
  - `backups/openclaw-backup-<timestamp>.tar.gz`
    - archive root `openclaw-state/`
    - archive root `root-config/` (included when `OPENCLAW_BACKUP_ROOT_CONFIG_DIR` exists)
    - archive root `root-codex/` (included when `OPENCLAW_BACKUP_ROOT_CODEX_DIR` exists)
    - archive root `root-claude/` (included when `OPENCLAW_BACKUP_ROOT_CLAUDE_DIR` exists)
    - archive root `root-agents/` (included when `OPENCLAW_BACKUP_ROOT_AGENTS_DIR` exists)
    - archive root `root-ssh/` (included when `OPENCLAW_BACKUP_ROOT_SSH_DIR` exists)
    - archive root `root-env/` (included when `OPENCLAW_BACKUP_ROOT_ENV_DIR` exists)
    - archive root `root-npm/` (included when `OPENCLAW_BACKUP_ROOT_NPM_DIR` exists)
    - archive root `root-lark-cli/` (included when `OPENCLAW_BACKUP_ROOT_LARK_CLI_DIR` exists)
  - `latest-backup.json`
- Retention: after upload, only the newest `OPENCLAW_BACKUP_KEEP_COUNT` timestamped archives are kept (default `48`); older timestamped archives are deleted automatically

## Use sshx Inside the Container

`sshx` is preinstalled in the image.

1. Auto-start `sshx` in background via environment variables:

```bash
OPENCLAW_SSHX_AUTO_START=true
```

When enabled, entrypoint starts `sshx` in background and sends `sshx` output directly to container stdout/stderr logs (no file logging).

2. Manual start inside container:

```bash
sshx
```

3. Let OpenClaw start a process itself (run in OpenClaw terminal/tool):

```bash
nohup sshx >/proc/1/fd/1 2>/proc/1/fd/2 &
```

4. After use, close `sshx` process promptly:

```bash
pgrep -fa sshx
pkill -TERM -f '(^|/)sshx($| )'
```

## Local Test

```bash
python3 -m unittest discover -s tests -p 'test_*.py'
```

Pull Requests to `main` run GitHub Actions CI automatically (`.github/workflows/pr-ci.yml`):
- Unit tests: `python3 -m unittest discover -s tests -p 'test_*.py'`
- Docker image build: `docker build` (via Buildx) with `OPENCLAW_VERSION=latest`

## License

MIT. See `LICENSE`.