| # Backup and Restore |
|
|
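| ## Overview |
|
|
| OpenClaw uses a Hugging Face Dataset as its backup storage backend and supports incremental backups, split (multi-volume) archives, optional encryption, and a dynamic backup strategy. |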
|
|
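| ## Core Concepts |
|
|
| ### Backup Types |
|
|
| | Type | Description | |
| |------|------| |
| | **Full** | Backs up all configured files and directories | |
| | **Incremental** | Backs up only the files that changed since the previous backup | |
| | **Split** | Splits a large archive into smaller volumes (500 MB by default); an optional post-processing step for full and incremental backups | |
| | **Encrypted** | Encrypts the archive with AES-256-CBC; an optional post-processing step for full and incremental backups | |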
|
|
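| ### Two-Layer Metadata Design |
|
|
| The system keeps two layers of metadata files in the remote Dataset: a per-backup index file (`{archive name}.meta.json`) and a pointer to the newest backup (`latest-backup.json`). A copy of the metadata is also packed inside each archive. The subsections below describe the archive naming rules first, then each metadata file. |
|
|
| #### 1. Backup File Naming Rules |
|
|
| Backup file name format: `openclaw-backup-{timestamp}-{type}[-split].tar.gz` |
|
|
| **Components**: |
| - `{timestamp}`: format `YYYYMMDD-HHmmss`, for example `20260413-120000` |
| - `{type}`: |
|   - `full`: full backup |
|   - `inc`: incremental backup |
| - `[-split]`: optional suffix that marks a split backup |
|
|
| **Type suffix rules**: |
| | Type | Suffix | Example | |
| |------|------|------| |
| | Single-file full backup | `full` | `openclaw-backup-20260413-120000-full.tar.gz` | |
| | Single-file incremental backup | `inc` | `openclaw-backup-20260414-060000-inc.tar.gz` | |
| | Split full backup | `full-split` | `openclaw-backup-20260414-090000-full-split.tar.gz` | |
| | Split incremental backup | `inc-split` | `openclaw-backup-20260414-150000-inc-split.tar.gz` | |
|
|
| **Notes**: |
| 1. Volume files append `.part-{aa,ab,...}` to the main file name |
| 2. The number of volumes depends on the actual archive size (the archive is split when it exceeds `OPENCLAW_BACKUP_SPLIT_SIZE`, 500 MB by default) |
| 3. The `-split` suffix only marks a backup as split and applies to both full and incremental backups |
| 4. Encrypted archives append `.enc` to the original file name (for example `openclaw-backup-20260413-120000-full.tar.gz.enc`) |
| 5. Encryption and splitting can be combined; an archive that is encrypted and then split is named `{original name}.tar.gz.enc.part-{aa,ab,...}` |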
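
The naming rules above can be sketched in a few lines of Python. This is only an illustration: `build_archive_name` and `parse_archive_name` are hypothetical helper names rather than functions from the OpenClaw scripts, and the sketch covers the plain `.tar.gz` name only (no `.enc` or `.part-*` suffixes).

```python
import re
from datetime import datetime

# Pattern for openclaw-backup-{timestamp}-{type}[-split].tar.gz
_NAME_RE = re.compile(
    r"^openclaw-backup-(?P<ts>\d{8}-\d{6})-(?P<kind>full|inc)(?P<split>-split)?\.tar\.gz$"
)


def build_archive_name(when: datetime, kind: str, split: bool = False) -> str:
    """Build an archive name from a timestamp, a type and the split flag."""
    if kind not in ("full", "inc"):
        raise ValueError(f"unknown backup type: {kind!r}")
    timestamp = when.strftime("%Y%m%d-%H%M%S")
    suffix = "-split" if split else ""
    return f"openclaw-backup-{timestamp}-{kind}{suffix}.tar.gz"


def parse_archive_name(name: str) -> dict:
    """Split an archive name back into its components."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a backup archive name: {name!r}")
    return {
        "timestamp": match["ts"],
        "kind": match["kind"],
        "split": match["split"] is not None,
    }
```

For example, `build_archive_name(datetime(2026, 4, 14, 9, 0, 0), "full", split=True)` yields the `full-split` example name from the table above.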
|
|
| #### 2. Backup Index File (One per Backup) |
|
|
| After each backup is uploaded, a matching metadata index file named `{archive name}.meta.json` is created in the remote Dataset. |
|
|
| **Example remote layout** (sorted by file name, that is, by time): |
|
|
| ``` |
| Dataset root/ |
| ├── latest-backup.json  ← sibling of backups/; must be in the root directory |
| └── backups/ |
|     ├── openclaw-backup-20260413-120000-full.tar.gz.meta.json  ← single-file full backup |
|     ├── openclaw-backup-20260413-120000-full.tar.gz |
|     ├── openclaw-backup-20260414-060000-inc.tar.gz.meta.json  ← single-file incremental backup |
|     ├── openclaw-backup-20260414-060000-inc.tar.gz |
|     ├── openclaw-backup-20260414-072200-inc.tar.gz.meta.json  ← single-file incremental backup |
|     ├── openclaw-backup-20260414-072200-inc.tar.gz |
|     ├── openclaw-backup-20260414-081100-full.tar.gz.meta.json  ← single-file full backup |
|     ├── openclaw-backup-20260414-081100-full.tar.gz |
|     ├── openclaw-backup-20260414-090000-full-split.tar.gz.part-aa  ← split full backup |
|     ├── openclaw-backup-20260414-090000-full-split.tar.gz.part-ab |
|     ├── openclaw-backup-20260414-090000-full-split.tar.gz.meta.json |
|     ├── openclaw-backup-20260414-150000-inc-split.tar.gz.part-aa  ← split incremental backup |
|     ├── openclaw-backup-20260414-150000-inc-split.tar.gz.part-ab |
|     ├── openclaw-backup-20260414-150000-inc-split.tar.gz.meta.json |
|     ├── openclaw-backup-20260414-152400-inc-split.tar.gz.part-aa  ← split incremental backup |
|     ├── openclaw-backup-20260414-152400-inc-split.tar.gz.part-ab |
|     └── openclaw-backup-20260414-152400-inc-split.tar.gz.meta.json |
| ``` |
|
|
| **Note**: entries are listed in ascending file name (time) order. For a split backup, the group consists of the volume files followed by the `.meta.json`; the original `.tar.gz` is not kept after splitting. |
|
|
| **Index file contents by scenario**: |
|
|
| **Scenario A: single-file full backup** |
| ```json5 |
| // openclaw-backup-20260413-120000-full.tar.gz.meta.json |
| { |
| "volumes": ["openclaw-backup-20260413-120000-full.tar.gz"], |
| "is_split": false, |
| "backup_type": "full", |
| "chain_id": "abc123", |
| "parent": null, |
| "created_at_utc": "2026-04-13T12:00:00", |
| "file_count": 42, |
| "archive_size": 1258291200, |
| "checksum": "sha256:def456...", |
| "version": "2.1", |
| "created_by": "openclaw-backup" |
| } |
| ``` |
|
|
| ```json5 |
| // openclaw-backup-20260414-081100-full.tar.gz.meta.json |
| { |
| "volumes": ["openclaw-backup-20260414-081100-full.tar.gz"], |
| "is_split": false, |
| "backup_type": "full", |
| "chain_id": "xyz789", |
| "parent": null, |
| "created_at_utc": "2026-04-14T08:11:00", |
| "file_count": 42, |
| "archive_size": 1258291200, |
| "checksum": "sha256:def456...", |
| "version": "2.1", |
| "created_by": "openclaw-backup" |
| } |
| ``` |
|
|
| **Scenario B: single-file incremental backup** |
| ```json5 |
| // openclaw-backup-20260414-060000-inc.tar.gz.meta.json |
| { |
| "volumes": ["openclaw-backup-20260414-060000-inc.tar.gz"], |
| "is_split": false, |
| "backup_type": "incremental", |
| "chain_id": "abc123", |
| "parent": "openclaw-backup-20260413-120000-full.tar.gz.meta.json", |
| "created_at_utc": "2026-04-14T06:00:00", |
| "file_count": 5, |
| "archive_size": 52428800, |
| "checksum": "sha256:ghi789...", |
| "version": "2.1", |
| "created_by": "openclaw-backup" |
| } |
| ``` |
|
|
| ```json5 |
| // openclaw-backup-20260414-072200-inc.tar.gz.meta.json |
| { |
| "volumes": ["openclaw-backup-20260414-072200-inc.tar.gz"], |
| "is_split": false, |
| "backup_type": "incremental", |
| "chain_id": "abc123", |
| "parent": "openclaw-backup-20260414-060000-inc.tar.gz.meta.json", |
| "created_at_utc": "2026-04-14T07:22:00", |
| "file_count": 3, |
| "archive_size": 52428800, |
| "checksum": "sha256:ghi789...", |
| "version": "2.1", |
| "created_by": "openclaw-backup" |
| } |
| ``` |
|
|
| **Scenario C: split full backup** |
| ```json5 |
| // openclaw-backup-20260414-090000-full-split.tar.gz.meta.json |
| { |
| "volumes": [ |
| "openclaw-backup-20260414-090000-full-split.tar.gz.part-aa", |
| "openclaw-backup-20260414-090000-full-split.tar.gz.part-ab" |
| ], |
| "is_split": true, |
| "backup_type": "full", |
| "chain_id": "xyz789", |
| "parent": null, |
| "created_at_utc": "2026-04-14T09:00:00", |
| "file_count": 128, |
| "archive_size": 2147483648, |
| "checksum": "sha256:jkl012...", |
| "version": "2.1", |
| "created_by": "openclaw-backup" |
| } |
| ``` |
|
|
| **Scenario D: split incremental backup** |
| ```json5 |
| // openclaw-backup-20260414-150000-inc-split.tar.gz.meta.json |
| { |
| "volumes": [ |
| "openclaw-backup-20260414-150000-inc-split.tar.gz.part-aa", |
| "openclaw-backup-20260414-150000-inc-split.tar.gz.part-ab" |
| ], |
| "is_split": true, |
| "backup_type": "incremental", |
| "chain_id": "xyz789", |
| "parent": "openclaw-backup-20260414-090000-full-split.tar.gz.meta.json", |
| "created_at_utc": "2026-04-14T15:00:00", |
| "file_count": 64, |
| "archive_size": 1073741824, |
| "checksum": "sha256:mno345...", |
| "version": "2.1", |
| "created_by": "openclaw-backup" |
| } |
| ``` |
|
|
| ```json5 |
| // openclaw-backup-20260414-152400-inc-split.tar.gz.meta.json |
| { |
| "volumes": [ |
| "openclaw-backup-20260414-152400-inc-split.tar.gz.part-aa", |
| "openclaw-backup-20260414-152400-inc-split.tar.gz.part-ab" |
| ], |
| "is_split": true, |
| "backup_type": "incremental", |
| "chain_id": "xyz789", |
| "parent": "openclaw-backup-20260414-150000-inc-split.tar.gz.meta.json", |
| "created_at_utc": "2026-04-14T15:24:00", |
| "file_count": 8, |
| "archive_size": 1073741824, |
| "checksum": "sha256:mno345...", |
| "version": "2.1", |
| "created_by": "openclaw-backup" |
| } |
| ``` |
|
|
| **Scenario E: encrypted backup (single file)** |
| ```json5 |
| // openclaw-backup-20260414-160000-full.tar.gz.meta.json |
| { |
| "volumes": ["openclaw-backup-20260414-160000-full.tar.gz.enc"], |
| "is_split": false, |
| "backup_type": "full", |
| "chain_id": "enc001", |
| "parent": null, |
| "created_at_utc": "2026-04-14T16:00:00", |
| "file_count": 42, |
| "archive_size": 1300234240, |
| "checksum": "sha256:enc123...", |
| "version": "2.1", |
| "created_by": "openclaw-backup", |
| "encrypted": true, |
| "encryption_algorithm": "AES-256-CBC" |
| } |
| ``` |
|
|
| **Scenario F: encrypted and split backup** |
| ```json5 |
| // openclaw-backup-20260414-170000-full.tar.gz.meta.json |
| { |
| "volumes": [ |
| "openclaw-backup-20260414-170000-full.tar.gz.enc.part-aa", |
| "openclaw-backup-20260414-170000-full.tar.gz.enc.part-ab", |
| "openclaw-backup-20260414-170000-full.tar.gz.enc.part-ac" |
| ], |
| "is_split": true, |
| "backup_type": "full", |
| "chain_id": "enc002", |
| "parent": null, |
| "created_at_utc": "2026-04-14T17:00:00", |
| "file_count": 128, |
| "archive_size": 3221225472, |
| "checksum": "sha256:enc456...", |
| "version": "2.1", |
| "created_by": "openclaw-backup", |
| "encrypted": true, |
| "encryption_algorithm": "AES-256-CBC" |
| } |
| ``` |
|
|
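| **Field reference**: |
|
|
| | Field | Description | |
| |------|------| |
| | `volumes` | List of archive files (one element for a single-file backup, every volume for a split backup) | |
| | `is_split` | Whether the backup is split | |
| | `backup_type` | `full` or `incremental` | |
| | `chain_id` | Backup chain identifier shared by every backup in the same chain (see the generation rules below) | |
| | `parent` | Path of the parent backup's `.meta.json` file (`null` for a full backup) | |
| | `created_at_utc` | Backup creation time (UTC) | |
| | `file_count` | Number of files in the archive | |
| | `archive_size` | Total size of all archive files, in bytes | |
| | `checksum` | SHA-256 checksum of all archive files concatenated | |
| | `version` | Metadata format version | |
| | `created_by` | Identifier of the tool that created the backup | |
| | `encrypted` | Whether the archive is encrypted (optional, defaults to `false`) | |
| | `encryption_algorithm` | Encryption algorithm (optional, for example `AES-256-CBC`) | |
|
|
| **chain_id generation rules**: |
|
| 1. **First backup of a chain**: the first 8 characters of a newly generated UUID become the chain_id |
| 2. **Incremental backup**: inherits the parent backup's chain_id |
| 3. **Standalone full backup**: generates a new chain_id, even if the previous chain is unfinished |
| 4. **Split backup**: inherits the parent backup's chain_id |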
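
The `checksum` field is described as the SHA-256 of all archive files concatenated. A minimal sketch of that computation, assuming the `sha256:` prefix seen in the examples and a hypothetical helper name `combined_sha256`, could look like this:

```python
import hashlib


def combined_sha256(volume_paths, chunk_size=1 << 20):
    """Hash the volumes in order, as if they were one concatenated file."""
    digest = hashlib.sha256()
    for path in volume_paths:
        with open(path, "rb") as handle:
            # Read in chunks so a 500 MB volume never has to fit in memory.
            for chunk in iter(lambda: handle.read(chunk_size), b""):
                digest.update(chunk)
    return "sha256:" + digest.hexdigest()
```

Because the hash is fed sequentially, the result for a split backup equals the hash of the original unsplit archive, provided the volumes are passed in `part-aa`, `part-ab`, ... order.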
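
The generation rules can be condensed into one small function. This is a sketch under assumptions: `next_chain_id` is a hypothetical name, and the parent metadata is assumed to be a dict shaped like the `.meta.json` examples above.

```python
import uuid


def next_chain_id(backup_type, parent_meta=None):
    """Return the chain_id for a new backup.

    Incremental backups inherit the parent's chain_id; full backups
    start a new chain from the first 8 hex characters of a UUID.
    """
    if backup_type == "incremental":
        if not parent_meta or not parent_meta.get("chain_id"):
            raise ValueError("an incremental backup needs a parent with a chain_id")
        return parent_meta["chain_id"]
    return uuid.uuid4().hex[:8]
```

The example values in this document (`abc123`, `xyz789`) are shortened placeholders; an ID produced this way would be 8 hexadecimal characters long.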
| |
| #### 3. latest-backup.json (Soft Index Pointing to the Latest Backup) |
|
| Located in the Dataset root directory, alongside `backups/`. |
| |
| ```json5 |
| { |
| "dataset": "GGSheng/page-backup", |
| "latest": "backups/openclaw-backup-20260414-152400-inc-split.tar.gz.meta.json", |
| "is_split": true, |
| "created_at_utc": "2026-04-14T15:24:00" |
| } |
| ``` |
| |
| **Note**: this file only records which backup is the latest; it does not contain the full backup information, which lives in `{archive name}.meta.json`. |
| |
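| #### 4. Metadata Inside the Archive (last-backup-metadata.json) |
|
| During a backup, the metadata is first written to the local `work_dir`, then packed into the tar.gz archive, and finally uploaded together with the archive. The in-archive metadata carries the full backup information (the same content as `.meta.json`) and is used for archive self-verification, chain-merge decisions, and choosing the parent of the next incremental backup: |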
| |
| ```json5 |
| { |
| "version": "2.1", |
| "backup_type": "incremental", |
| "chain_id": "abc123", |
| "parent": "openclaw-backup-20260414-060000-inc.tar.gz.meta.json", |
| "volumes": ["openclaw-backup-20260414-060000-inc.tar.gz"], |
| "checksum": "sha256:abc123...", |
| "created_at_utc": "2026-04-14T15:24:00", |
| "last_backup_time": "2026-04-14T15:24:00", |
| "file_count": 10, |
| "archive_size": 5242880, |
| "is_latest": true, |
| "created_by": "openclaw-backup" |
| } |
| ``` |
| |
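| **Field notes**: |
| - `volumes`: all volume files of the current backup (a one-element list for a single-file backup, the list of `part-*` files for a split backup) |
| - `checksum`: SHA-256 checksum of the archive |
| - `file_count`: number of files in the backup |
| - `archive_size`: original archive size in bytes |
| - `is_latest`: marks whether this is the latest backup (used for chain management) |
| - `last_backup_time`: UTC timestamp of the previous backup, used to compute the time elapsed for incremental backups |
| - The full backup information is also stored in the remote `.meta.json` |
|
| **Important**: `last-backup-metadata.json` is stored in the local `work_dir` (`/tmp/openclaw-backup` by default) and may be lost when the container restarts. The system therefore treats **remote metadata as the source of truth**: every backup run first reads `latest-backup.json` and the latest `.meta.json` from the remote Dataset, so the backup chain is built correctly even when the local metadata is gone. |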
| |
| ### Incremental Backup Chain |
|
| Incremental backups form a linked list through the `parent` field: |
|
| **Backup chain diagram** (two independent chains): |
|
| ``` |
| [Chain abc123] |
| Full backup  chain_id: "abc123" |
| openclaw-backup-20260413-120000-full.tar.gz.meta.json |
| parent: null |
|     ↑ |
|     │ parent |
|     │ |
| Incremental backup #1  chain_id: "abc123" |
| openclaw-backup-20260414-060000-inc.tar.gz.meta.json |
| parent: "openclaw-backup-20260413-120000-full.tar.gz.meta.json" |
|     ↑ |
|     │ parent |
|     │ |
| Incremental backup #2  chain_id: "abc123" |
| openclaw-backup-20260414-072200-inc.tar.gz.meta.json |
| parent: "openclaw-backup-20260414-060000-inc.tar.gz.meta.json" |
| ``` |
|
| ``` |
| [Chain xyz789] (contains the latest backup) |
| Full backup #1  chain_id: "xyz789" |
| openclaw-backup-20260414-081100-full.tar.gz.meta.json |
| parent: null |
|     ↑ |
|     │ (independent, no parent/child relationship) |
|
| Full backup #2  chain_id: "xyz789" |
| openclaw-backup-20260414-090000-full-split.tar.gz.meta.json |
| parent: null |
|     ↑ |
|     │ parent |
|     │ |
| Incremental backup #1  chain_id: "xyz789" |
| openclaw-backup-20260414-150000-inc-split.tar.gz.meta.json |
| parent: "openclaw-backup-20260414-090000-full-split.tar.gz.meta.json" |
|     ↑ |
|     │ parent |
|     │ |
| Incremental backup #2 (latest)  chain_id: "xyz789" |
| openclaw-backup-20260414-152400-inc-split.tar.gz.meta.json |
| parent: "openclaw-backup-20260414-150000-inc-split.tar.gz.meta.json" |
|     ↑ |
|     │ latest |
|     │ |
| latest-backup.json → "backups/openclaw-backup-20260414-152400-inc-split.tar.gz.meta.json" |
| ``` |
|
| **Notes**: |
| - `081100-full` and `090000-full` are both full backups (`parent: null`); they carry the same chain_id but have no parent/child relationship |
| - A shared chain_id means the backups carry the same chain identifier, which is used for grouping and verification |
| - To restore, walk back along the `parent` links |
|
| **Restore flow**: read the latest backup from `latest-backup.json` → download its `.meta.json` → follow `parent` back to the full backup → merge in order, oldest first. |
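
The parent walk in the restore flow can be sketched as follows. `resolve_chain` and the `load_meta` callback are hypothetical names; in practice `load_meta` would download and parse a `.meta.json`, while here it is any function that maps a metadata file name to its parsed dict.

```python
def resolve_chain(latest_meta_name, load_meta):
    """Follow `parent` links back to the full backup.

    Returns metadata file names ordered oldest (full backup) to newest.
    """
    chain = []
    seen = set()
    name = latest_meta_name
    while name is not None:
        if name in seen:
            raise ValueError(f"cycle in backup chain at {name!r}")
        seen.add(name)
        chain.append(name)
        name = load_meta(name).get("parent")  # None (JSON null) ends the walk
    chain.reverse()  # collected newest first, merged oldest first
    return chain
```

With the metadata from chain `xyz789`, starting from `...152400-inc-split.tar.gz.meta.json` returns the `090000-full-split`, `150000-inc-split` and `152400-inc-split` entries in that order. Note that `latest-backup.json` stores the path with the `backups/` prefix while `parent` values in the examples omit it, so a real implementation would need to normalize the two forms.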
| |
| ### Example: Mixed Remote Storage |
|
| Assume the remote storage looks as follows (full backup chains, several incremental backups, split archives, and so on): |
|
| ``` |
| Dataset: GGSheng/page-backup |
| │ |
| ├── latest-backup.json  ← points to the latest backup (in the root directory) |
| │     {"latest": "backups/openclaw-backup-20260414-152400-inc-split.tar.gz.meta.json"} |
| │ |
| └── backups/ |
|     ├── openclaw-backup-20260413-120000-full.tar.gz.meta.json  ← single-file full backup 1 |
|     ├── openclaw-backup-20260413-120000-full.tar.gz |
|     ├── openclaw-backup-20260414-060000-inc.tar.gz.meta.json  ← single-file incremental backup 1 |
|     ├── openclaw-backup-20260414-060000-inc.tar.gz |
|     ├── openclaw-backup-20260414-072200-inc.tar.gz.meta.json  ← single-file incremental backup 2 |
|     ├── openclaw-backup-20260414-072200-inc.tar.gz |
|     ├── openclaw-backup-20260414-081100-full.tar.gz.meta.json  ← single-file full backup 2 |
|     ├── openclaw-backup-20260414-081100-full.tar.gz |
|     ├── openclaw-backup-20260414-090000-full-split.tar.gz.meta.json  ← split full backup |
|     ├── openclaw-backup-20260414-090000-full-split.tar.gz.part-aa |
|     ├── openclaw-backup-20260414-090000-full-split.tar.gz.part-ab |
|     ├── openclaw-backup-20260414-150000-inc-split.tar.gz.meta.json  ← split incremental backup 1 |
|     ├── openclaw-backup-20260414-150000-inc-split.tar.gz.part-aa |
|     ├── openclaw-backup-20260414-150000-inc-split.tar.gz.part-ab |
|     ├── openclaw-backup-20260414-152400-inc-split.tar.gz.meta.json  ← split incremental backup 2 |
|     ├── openclaw-backup-20260414-152400-inc-split.tar.gz.part-aa |
|     └── openclaw-backup-20260414-152400-inc-split.tar.gz.part-ab |
| ``` |
|
| **Scenario summary**: |
|
| | File name | Type | chain_id | parent | |
| |--------|------|----------|--------| |
| | 20260413-120000-full | Single-file full backup | abc123 | null | |
| | 20260414-060000-inc | Single-file incremental backup | abc123 | 20260413-120000-full.meta.json | |
| | 20260414-072200-inc | Single-file incremental backup | abc123 | 20260414-060000-inc.meta.json | |
| | 20260414-081100-full | Single-file full backup | xyz789 | null | |
| | 20260414-090000-full-split | Split full backup | xyz789 | null | |
| | 20260414-150000-inc-split | Split incremental backup | xyz789 | 20260414-090000-full-split.meta.json | |
| | 20260414-152400-inc-split | Split incremental backup | xyz789 | 20260414-150000-inc-split.meta.json | |
|
| **Steps to restore any backup**: |
|
| 1. Choose the target backup (for example `openclaw-backup-20260414-090000-full-split.tar.gz`) |
| 2. Download its `.meta.json` to get the volume list and the parent reference |
| 3. If it has a parent, download the parent's `.meta.json`, and repeat until a full backup is reached |
| 4. Merge all archives in order, oldest first |
| 5. Restore to the target directory |
| |
| --- |
| |
| ## Backup Flow |
|
| ### Overall Architecture |
|
| ``` |
| External cron schedule (every 30 minutes, defined by cron) |
| OPENCLAW_BACKUP_CRON="*/30 * * * *" |
|         │ |
|         ▼ |
| openclaw-backup-cron.sh |
|   1. Prepare environment (load environment variables) |
|   2. Decide backup type (full / incremental) |
|   3. Create archive (dynamic strategy) |
|   4. Split into volumes (if needed) |
|   5. Create meta.json, update latest-backup.json |
|   6. Upload files (volumes / archive) |
|   7. Clean up old backups |
| ``` |
| |
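| ### Detailed Steps |
|
| #### 1. Prepare the Environment |
|
| The script loads environment variables from: |
| - `/etc/profile.d/openclaw-env.sh` |
| - `/root/.env.d/openclaw-backup.env` |
|
| Key environment variables: |
| | Variable | Default | Description | |
| |------|--------|------| |
| | `OPENCLAW_BACKUP_DATASET_REPO` | (required) | Dataset repository that stores the backups | |
| | `OPENCLAW_BACKUP_SOURCE_DIR` | `/root/.openclaw` | Backup source directory | |
| | `OPENCLAW_BACKUP_WORK_DIR` | `/tmp/openclaw-backup` | Temporary working directory | |
| | `OPENCLAW_BACKUP_SPLIT_SIZE` | `500` | Volume size (MB); archives larger than this are split | |
| | `OPENCLAW_BACKUP_KEEP_COUNT` | `24` | Number of backups to keep | |
| | `OPENCLAW_INCREMENTAL_BACKUP` | `true` | Enables incremental backups (`true` = incremental mode, `false` = full backup every time) | |
| | `OPENCLAW_INCREMENTAL_INTERVAL_MINUTES` | `15` | Incremental backup interval (minutes); an incremental backup runs when at least this much time has passed since the previous backup | |
| | `OPENCLAW_BACKUP_ENCRYPTION_ENABLED` | `false` | Enables encrypted backups (`true` = enabled, `false` = disabled) | |
| | `OPENCLAW_BACKUP_ENCRYPTION_PASSWORD` | (required when encryption is enabled) | Encryption password (managing it through HF Space Secrets is recommended) | |
|
| **Note**: `OPENCLAW_INCREMENTAL_INTERVAL_MINUTES` controls the minimum time between incremental backups. The external cron fires every 30 minutes; if less than this interval has elapsed since the previous backup, no backup runs, otherwise an incremental backup runs. |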
| |
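| #### 2. Fetch Remote Chain Information (Source of Truth) |
|
| **Key design point**: the local `last-backup-metadata.json` lives in `/tmp/openclaw-backup` and may be lost when the container restarts, so the system treats **remote metadata as the source of truth**. |
|
| ``` |
| On backup start: |
| │ |
| ├─→ download latest-backup.json |
| │ |
| └─→ download the latest .meta.json |
|     │ |
|     ├─→ last_backup_time → time elapsed since the previous backup |
|     ├─→ parent_meta_path → parent reference for the incremental chain |
|     ├─→ chain_id → identifies the backup chain |
|     └─→ volumes → file list of a split backup |
| ``` |
|
| This step ensures the backup chain is built correctly even when the local metadata is lost. |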
| |
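| #### 3. Dynamic Backup Strategy |
|
| The system adjusts backup parameters automatically based on the file change rate and the estimated size: |
|
| ``` |
| Estimated size: |
| ├── < 500MB → small: fast compression (level 3), single-file backup |
| ├── 500MB-2GB → medium: balanced compression (level 6), single-file backup |
| └── > 2GB → large: maximum compression (level 9) |
|
| Change rate adjustment: |
| ├── > 10 files/minute → high change rate: lower the compression level to favor speed |
| └── < 2 files/minute → low change rate: raise the compression level; the backup may be skipped |
| ``` |
|
| **Split trigger**: once the archive is created, its actual size is compared with `OPENCLAW_BACKUP_SPLIT_SIZE` (500 MB by default), and splitting is enabled automatically if the archive is larger. An estimated size above 2 GB only tells the system to use maximum compression; whether the archive is split is decided by its actual size. |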
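
A rough sketch of this strategy is shown below. The thresholds come from the text and the `OPENCLAW_DYNAMIC_*` defaults, but the function names and the size of the change-rate adjustment (one compression level up or down) are assumptions made for illustration; the document does not say how far the level is moved.

```python
def pick_compression_level(estimated_mb, files_per_minute,
                           small_mb=500, medium_mb=2000,
                           high_rate=10, low_rate=2):
    """Choose a gzip level from the estimated size, then nudge it by change rate."""
    if estimated_mb < small_mb:
        level = 3   # small: fast compression
    elif estimated_mb <= medium_mb:
        level = 6   # medium: balanced compression
    else:
        level = 9   # large: maximum compression

    # Assumption: adjust by exactly one level in either direction.
    if files_per_minute > high_rate:
        level = max(1, level - 1)   # favor speed
    elif files_per_minute < low_rate:
        level = min(9, level + 1)   # favor size
    return level


def needs_split(archive_bytes, split_size_mb=500):
    """Splitting depends on the actual archive size, not the estimate."""
    return archive_bytes > split_size_mb * 1024 * 1024
```

Keeping `needs_split` separate from the level choice mirrors the point made above: the estimate drives compression, the real size drives splitting.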
| |
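| #### 4. Decide the Backup Type |
|
| Whether a full or an incremental backup runs is decided as follows: |
|
| ``` |
| INCREMENTAL_BACKUP = false? |
| ├── YES → full backup (every run backs up all files and starts a new chain) |
| └── NO ↓ |
| First backup, or full backup triggered manually? |
| ├── YES → full backup (all files, new chain) |
| └── NO ↓ |
|
| Time since previous backup >= INCREMENTAL_INTERVAL? |
| ├── YES → incremental backup (only changed files, inherits the chain) |
| └── NO → skip this run (wait for the next cron trigger) |
| ``` |
|
| **Notes**: |
| - Full backup: starts a new backup chain and generates a new `chain_id` |
| - Incremental backup: inherits the parent backup's `chain_id` |
| - The external cron fires every 30 minutes; an incremental backup runs only when the time condition is met |
| - With `OPENCLAW_INCREMENTAL_BACKUP=false`, every run is a full backup |
| - After the archive is created its size is checked, and it is split automatically if it exceeds `OPENCLAW_BACKUP_SPLIT_SIZE` |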
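
The decision tree above maps directly onto a small function. This is a sketch with hypothetical names (`decide_backup`, `force_full`); `last_backup_time` is assumed to be a `datetime` taken from the remote metadata, or `None` when no backup exists yet.

```python
from datetime import datetime, timedelta


def decide_backup(incremental_enabled, last_backup_time, now,
                  interval_minutes=15, force_full=False):
    """Return "full", "incremental" or "skip" for this cron run."""
    if not incremental_enabled:
        return "full"                      # every run is a full backup
    if last_backup_time is None or force_full:
        return "full"                      # first backup or manual full backup
    if now - last_backup_time >= timedelta(minutes=interval_minutes):
        return "incremental"
    return "skip"                          # wait for the next cron trigger
```

With the default 15-minute interval and a 30-minute cron, a run that finds a backup from 30 minutes ago returns `"incremental"`, while a run 5 minutes after a manual backup returns `"skip"`.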
| |
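| #### 5. Create the Archive |
|
| A tar.gz archive is created according to the chosen strategy: |
|
| **Full backup archive layout**: |
| ``` |
| openclaw-backup-20260414-120000-full.tar.gz |
| ├── openclaw-state/ # main state directory |
| │   ├── config.json |
| │   └── ... |
| ├── root-config/ # extra directory |
| ├── root-ssh/ # SSH configuration |
| └── last-backup-metadata.json # backup chain metadata |
| ``` |
|
| **Incremental backup archive layout**: |
| ``` |
| openclaw-backup-20260414-130000-inc.tar.gz |
| ├── openclaw-state/ # changed files only |
| │   └── changed-file.txt |
| ├── root-config/ # changed extra directories only |
| └── last-backup-metadata.json # contains chain_id and parent |
| ``` |
|
| **Split backup archive layout**: |
| The original tar.gz of a split backup has the same layout as a full or incremental archive; it is simply cut into several part files once the archive is complete. |
|
| ``` |
| # Original layout before splitting (split full backup as an example) |
| openclaw-backup-20260414-090000-full.tar.gz |
| ├── openclaw-state/ # main state directory |
| ├── root-config/ |
| ├── root-ssh/ |
| └── last-backup-metadata.json # records the original archive name, written before splitting |
|
| # Files after splitting (consecutive byte ranges of the archive above) |
| openclaw-backup-20260414-090000-full.tar.gz.part-aa |
| openclaw-backup-20260414-090000-full.tar.gz.part-ab |
| ``` |
|
| **Contents of `last-backup-metadata.json` in a split backup**: |
|
| The parts are produced by running `split` on the original tar.gz, so each part is only a byte range of that archive and cannot be read on its own; `last-backup-metadata.json` becomes readable once the parts are concatenated back into the original archive. Its content is fixed **before** splitting: |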
| |
| ```json5 |
| { |
| "version": "2.1", |
| "backup_type": "full", |
| "chain_id": "xyz789", |
| "parent": null, |
| "volumes": ["openclaw-backup-20260414-090000-full.tar.gz"], // original archive name, recorded before splitting |
| "checksum": "sha256:abc123...", |
| "created_at_utc": "2026-04-14T09:00:00", |
| "last_backup_time": "2026-04-14T09:00:00", |
| "file_count": 128, |
| "archive_size": 2147483648, |
| "is_latest": true, |
| "created_by": "openclaw-backup" |
| } |
| ``` |
| |
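| **Note**: the `volumes` field records the original archive name before splitting. After splitting, `_update_volumes_in_metadata()` updates the local `last-backup-metadata.json`, but **the copy inside the archive does not change**. The `.meta.json` uploaded to the remote Dataset is generated by the program, and its `volumes` field lists every volume file: |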
| ```json5 |
| { |
| "volumes": [ |
| "openclaw-backup-20260414-090000-full.tar.gz.part-aa", |
| "openclaw-backup-20260414-090000-full.tar.gz.part-ab" |
| ], |
| ... |
| } |
| ``` |
| |
| #### 6. Split into Volumes (If Needed) |
|
| When the backup exceeds `OPENCLAW_BACKUP_SPLIT_SIZE` (500 MB by default), it is cut with the `split` command: |
|
| ```bash |
| split -b 500M openclaw-backup-20260414-120000-full.tar.gz \ |
| openclaw-backup-20260414-120000-full.tar.gz.part- |
| # Produces: part-aa, part-ab, part-ac... |
| ``` |
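
For readers who want to see what the `split` call does, here is a pure-Python equivalent. It is a sketch, not the script's implementation: `split_file` is a hypothetical name, and for brevity each part is read into memory in one piece, which a production version would avoid for 500 MB parts.

```python
import itertools
import string


def split_file(path, part_bytes):
    """Cut `path` into `path.part-aa`, `path.part-ab`, ... of at most part_bytes each."""
    # Two-letter suffixes in the same order as split's defaults: aa, ab, ..., zz.
    suffixes = ("".join(pair) for pair in
                itertools.product(string.ascii_lowercase, repeat=2))
    parts = []
    with open(path, "rb") as source:
        while True:
            chunk = source.read(part_bytes)
            if not chunk:
                break
            part_path = f"{path}.part-{next(suffixes)}"
            with open(part_path, "wb") as target:
                target.write(chunk)
            parts.append(part_path)
    return parts
```

Concatenating the returned parts in order reproduces the original archive byte for byte, which is what the restore step relies on.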
| |
| #### 7. Upload to Hugging Face |
|
| **Split backup upload**: |
| 1. All volume files (for example `openclaw-backup-20260414-090000-full-split.tar.gz.part-aa`, `openclaw-backup-20260414-090000-full-split.tar.gz.part-ab`) |
| 2. The `.meta.json` metadata index file |
|
| **Single-file backup upload**: |
| 1. The `.tar.gz` archive file |
| 2. The `.meta.json` metadata index file |
|
| **Unified upload order**: |
| 1. All volume files (if any) |
| 2. The single archive file (if any) |
| 3. The `.meta.json` metadata index file |
| 4. Update `latest-backup.json` to point to the latest backup's `.meta.json` |
|
| **Note**: `latest-backup.json` stores the path of the latest backup's `.meta.json` (for example `backups/openclaw-backup-20260414-152400-inc-split.tar.gz.meta.json`); a restore downloads that file directly to get the full information. |
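
The upload order can be expressed as a pure function that returns the repository paths in the sequence they should be written. `upload_plan` is a hypothetical helper; the `backups` prefix matches the default of `OPENCLAW_BACKUP_PATH_PREFIX`, and the actual upload call is deliberately left out.

```python
def upload_plan(volumes, meta_name, prefix="backups"):
    """Return repo paths in upload order: data files, then index, then pointer."""
    plan = [f"{prefix}/{name}" for name in sorted(volumes)]  # volumes or the single archive
    plan.append(f"{prefix}/{meta_name}")                      # per-backup index
    plan.append("latest-backup.json")                         # root-level pointer, updated last
    return plan
```

Uploading the index and the pointer last presumably keeps readers from ever following `latest-backup.json` to a backup whose data files are not yet in place.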
| |
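| #### 8. Clean Up Old Backups |
|
| The most recent `OPENCLAW_BACKUP_KEEP_COUNT` (24 by default) backups are kept: |
|
| ```python |
| # Cleanup logic (simplified sketch; list_repo_files, timestamp, |
| # all_volumes_exist, archive_exists and delete are helpers defined elsewhere) |
| all_backups = list_repo_files()  # every backup file in the repository |
| valid_backups = [] |
| for backup in sorted(all_backups, key=timestamp, reverse=True):  # newest first |
|     if backup.endswith(".meta.json"): |
|         # Metadata file: a split backup keeps no plain .tar.gz, so keep it as is; |
|         # a single-file backup must still have its archive. |
|         main_archive = backup[: -len(".meta.json")] |
|         if "-split" in main_archive or archive_exists(main_archive): |
|             valid_backups.append(backup) |
|     elif "-split.tar.gz" in backup: |
|         # Split volume: count it only when every volume of the backup exists. |
|         if all_volumes_exist(backup): |
|             valid_backups.append(backup) |
|     else: |
|         valid_backups.append(backup) |
|
| # Delete everything beyond the newest keep_count entries |
| for backup in valid_backups[keep_count:]: |
|     delete(backup) |
| ``` |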
| |
| --- |
| |
| ## Restore Flow |
|
| ### Overall Flow |
|
| ``` |
| openclaw-restore.sh |
|   1. Download the index (latest-backup.json) |
|   2. Resolve the backup chain (download meta.json files, follow parent links back) |
|   3. Download archives in order (from the full backup to the newest), merging volumes if needed |
|   4. Restore to the target directory |
| ``` |
| |
| ### Detailed Steps |
|
| #### 1. Download the Backup Index |
|
| Download `latest-backup.json` from the Hugging Face Dataset to get the path of the latest backup's `.meta.json`: |
| ```json |
| { |
| "latest": "backups/openclaw-backup-20260414-152400-inc-split.tar.gz.meta.json", |
| "created_at_utc": "2026-04-14T15:24:00" |
| } |
| ``` |
|
| #### 2. Resolve the Backup Chain (Download Every meta.json) |
|
| Starting from the latest backup, follow the `parent` field back to the full backup and download every `.meta.json` on the way: |
|
| ```bash |
| # Walk the chain from newest to oldest |
| hf download GGSheng/page-backup backups/openclaw-backup-20260414-152400-inc-split.tar.gz.meta.json --repo-type dataset |
| # → parent: "openclaw-backup-20260414-150000-inc-split.tar.gz.meta.json" |
|
| hf download GGSheng/page-backup backups/openclaw-backup-20260414-150000-inc-split.tar.gz.meta.json --repo-type dataset |
| # → parent: "openclaw-backup-20260414-090000-full-split.tar.gz.meta.json" |
|
| hf download GGSheng/page-backup backups/openclaw-backup-20260414-090000-full-split.tar.gz.meta.json --repo-type dataset |
| # → parent: null (full backup) |
| ``` |
|
| Once the walk is complete, the collected list of meta.json files is reversed to obtain the backup chain from oldest to newest: |
| ``` |
| Merge order: [20260414-090000(full-split), 20260414-150000(inc-split), 20260414-152400(inc-split)] |
| ``` |
| ``` |
| |
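| #### 3. Download Archives in Order |
|
| Download and merge the archives along the backup chain **from oldest to newest**: |
|
| - **Split backup**: download every volume file (for example `part-aa`, `part-ab`), then concatenate them |
| - **Single-file backup**: download the `.tar.gz` file directly |
|
| ```python |
| import shutil |
|
| # Download and merge archives in order. |
| # download, calculate_checksum and extract are helpers defined elsewhere. |
| chain = [full_meta, inc1_meta, inc2_meta]  # from the full backup to the newest |
| merged = {} |
|
| for meta in chain: |
|     # volumes: ["backups/xxx.part-aa", "backups/xxx.part-ab"] or ["backups/xxx.tar.gz"] |
|     volumes = meta["volumes"] |
|
|     if meta["is_split"]: |
|         # Download every volume and concatenate them into one temporary file |
|         archive = "temp.tar.gz" |
|         with open(archive, "wb") as out: |
|             for vol in sorted(volumes): |
|                 download(vol) |
|                 with open(vol, "rb") as inp: |
|                     shutil.copyfileobj(inp, out)  # stream; avoids loading a whole volume into memory |
|     else: |
|         archive = volumes[0] |
|         download(archive) |
|
|     # Verify integrity before extracting |
|     if calculate_checksum(archive) != meta["checksum"]: |
|         raise ValueError(f"checksum mismatch for {archive}") |
|
|     # Extract and merge; files from later backups override earlier ones |
|     for file in extract(archive): |
|         merged[file.path] = file |
| ``` |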
| |
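| #### 4. Restore Files |
|
| The merged files are restored to the configured target directories: |
|
| | Archive content | Restore path | |
| |----------|----------| |
| | `openclaw-state/` | `OPENCLAW_BACKUP_SOURCE_DIR` | |
| | `root-config/` | `OPENCLAW_BACKUP_ROOT_CONFIG_DIR` | |
| | `root-ssh/` | `OPENCLAW_BACKUP_ROOT_SSH_DIR` | |
| | Other | Restored according to configuration | |
|
| #### 5. Restore a Specific Backup |
|
| Any backup can be selected for restore; the system resolves and merges its backup chain automatically. |
|
| **Select through an environment variable**: |
| ```bash |
| # Archive file name to restore (no path, including the .tar.gz extension) |
| # The system appends the .meta.json suffix automatically |
| export OPENCLAW_RESTORE_ARCHIVE="openclaw-backup-20260414-090000-full-split.tar.gz" |
|
| # Run the restore |
| python3 /opt/openclaw-hf/openclaw_hf/backup.py restore |
| ``` |
|
| **Restore flow**: |
| 1. The system builds the metadata file path from `OPENCLAW_BACKUP_PATH_PREFIX` |
| 2. Resolve the backup chain: download the `.meta.json` and follow `parent` back to the full backup |
| 3. Download and merge the archives in order (from the full backup to the newest) |
| 4. **If an archive is encrypted** (`encrypted: true`), decrypt it with `OPENCLAW_BACKUP_ENCRYPTION_PASSWORD` |
| 5. Restore to the target directory |
|
| **Example**: restoring `openclaw-backup-20260414-090000-full-split.tar.gz` (a split full backup): |
|
| ``` |
| Selected archive → openclaw-backup-20260414-090000-full-split.tar.gz |
| ↓ suffix appended → openclaw-backup-20260414-090000-full-split.tar.gz.meta.json |
| ↓ download .meta.json: is_split=true, volumes=["backups/...part-aa", "backups/...part-ab"] |
| ↓ parent: null (full backup, no walk needed) |
| ↓ |
| Step 3: |
|   (1) download all volumes: part-aa, part-ab |
|   (2) concatenate the volumes into the complete archive |
|   (3) if encrypted=true, decrypt the archive |
| Step 4: extract files to the target directory |
| ``` |
|
| **Notes on restoring encrypted backups**: |
| - If a backup is encrypted and `OPENCLAW_BACKUP_ENCRYPTION_PASSWORD` is not provided, the restore fails with an error |
| - Encrypted and unencrypted backups can coexist in the same Dataset; the restore detects which is which automatically |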
| |
| --- |
| |
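| ## Backup Scheduling |
|
| ### Cron Configuration |
|
| By default a backup check runs every 30 minutes: |
|
| ```bash |
| OPENCLAW_BACKUP_CRON="*/30 * * * *" |
| ``` |
|
| ### Schedule Setup |
|
| The schedule is installed by `openclaw-entrypoint.sh`: |
| ```bash |
| # Register the cron entry when the container starts |
| echo "$OPENCLAW_BACKUP_CRON root /usr/local/bin/openclaw-backup-cron.sh" >> /etc/crontab |
| ``` |
|
| ### Execution Safeguards |
|
| - **Health check**: optionally run `--check` before the backup and try `--repair` if it fails |
| - **Retries**: 3 retries by default, with increasing delays (10s, 20s, 30s) |
| - **Watchdog**: the last line of defense, making sure backups run on schedule |
|
| ### Concurrency Control |
|
| A file lock prevents two backups from running at the same time: |
|
| ```python |
| import fcntl |
|
| def acquire_backup_lock(lock_path="/tmp/openclaw-backup/openclaw-backup.lock"): |
|     """Return the open lock file, or None if another backup holds the lock.""" |
|     lock_fd = open(lock_path, "w") |
|     try: |
|         fcntl.flock(lock_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)  # non-blocking exclusive lock |
|     except OSError: |
|         lock_fd.close() |
|         print("backup skipped: another backup is already in progress") |
|         return None |
|     return lock_fd  # keep it open; closing the file releases the lock |
| ``` |
|
| **Flow**: |
| 1. Try to take an exclusive lock (`LOCK_EX | LOCK_NB`) |
| 2. If the lock is already held, skip this backup run |
| 3. Release the lock when the backup finishes |
|
| **Note**: if a backup takes longer than the cron interval (for example 30 minutes), the next run is skipped, so two backups never execute at the same time. |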
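
The retry behavior can be sketched as below. Two things here are assumptions rather than facts from the scripts: the name `run_with_retries`, and the reading of "3 retries" as one initial attempt followed by up to three more, with waits of 10, 20 and 30 seconds before each retry. The `sleep` parameter exists only so the delays can be observed without actually waiting.

```python
import time


def run_with_retries(task, max_retries=3, base_delay=10, sleep=time.sleep):
    """Run task(); on failure retry up to max_retries times with growing delays."""
    for attempt in range(max_retries + 1):
        try:
            return task()
        except Exception:
            if attempt == max_retries:
                raise                           # out of retries: surface the last error
            sleep(base_delay * (attempt + 1))   # 10s, 20s, 30s with the defaults
```

A caller would wrap the upload or the whole backup step, for example `run_with_retries(lambda: upload(archive))`, where `upload` stands in for whatever operation may fail transiently.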
| |
| --- |
| |
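| ## Environment Variable Reference |
|
| ### Required Variables |
|
| | Variable | Description | |
| |------|------| |
| | `OPENCLAW_BACKUP_DATASET_REPO` | Hugging Face Dataset repository ID (for example `GGSheng/page-backup`) | |
|
| ### Backup Configuration |
|
| | Variable | Default | Description | |
| |------|--------|------| |
| | `OPENCLAW_BACKUP_SOURCE_DIR` | `/root/.openclaw` | Source directory to back up | |
| | `OPENCLAW_BACKUP_WORK_DIR` | `/tmp/openclaw-backup` | Temporary working directory | |
| | `OPENCLAW_BACKUP_PATH_PREFIX` | `backups` | Path prefix inside the repository | |
| | `OPENCLAW_BACKUP_PRIVATE` | `true` | Whether the repository is created as private | |
|
| ### Incremental Backup |
|
| | Variable | Default | Description | |
| |------|--------|------| |
| | `OPENCLAW_INCREMENTAL_BACKUP` | `true` | Enables incremental backups | |
| | `OPENCLAW_INCREMENTAL_INTERVAL_MINUTES` | `15` | Incremental backup interval (minutes); an incremental backup runs once this much time has passed | |
|
| **Note**: `OPENCLAW_INCREMENTAL_INTERVAL_MINUTES` controls how often incremental backups run, not the interval between full backups. The external cron fires every 30 minutes; the run is skipped if less than this interval has elapsed since the previous backup, otherwise an incremental backup runs. |
|
| ### Performance Tuning |
|
| | Variable | Default | Description | |
| |------|--------|------| |
| | `OPENCLAW_BACKUP_COMPRESSION_LEVEL` | `6` | Compression level (1-9, 9 is the strongest) | |
| | `OPENCLAW_BACKUP_SPLIT_SIZE` | `500M` | Volume size; empty = no splitting | |
| | `OPENCLAW_BACKUP_SIZE_WARNING_MB` | `1500` | Backup size warning threshold (MB) | |
| | `OPENCLAW_BACKUP_KEEP_COUNT` | `24` | Number of backups to keep | |
|
| ### Dynamic Backup Strategy |
|
| | Variable | Default | Description | |
| |------|--------|------| |
| | `OPENCLAW_DYNAMIC_BACKUP` | `true` | Enables the dynamic strategy | |
| | `OPENCLAW_DYNAMIC_SMALL_THRESHOLD_MB` | `500` | Small-backup threshold (MB) | |
| | `OPENCLAW_DYNAMIC_MEDIUM_THRESHOLD_MB` | `2000` | Medium-backup threshold (MB) | |
| | `OPENCLAW_DYNAMIC_HIGH_CHANGE_RATE` | `10` | High change rate threshold (files/minute) | |
| | `OPENCLAW_DYNAMIC_LOW_CHANGE_RATE` | `2` | Low change rate threshold (files/minute) | |
|
| ### Extra Directories and Files |
|
| Additional backup content is configured through environment variables: |
|
| ```bash |
| # Extra directories, format: "archive-name:/path" |
| OPENCLAW_BACKUP_EXTRA_DIRS="root-config:/root/.config,root-ssh:/root/.ssh" |
|
| # Extra files, format: "archive-name:/path" |
| OPENCLAW_BACKUP_EXTRA_FILES="root-bashrc:/root/.bashrc" |
| ``` |
|
| ### Health Check |
|
| | Variable | Default | Description | |
| |------|--------|------| |
| | `OPENCLAW_BACKUP_HEALTH_CHECK_ENABLED` | `false` | Enables health checks | |
| | `OPENCLAW_BACKUP_HEALTH_CHECK_BEFORE` | `false` | Check before the backup | |
| | `OPENCLAW_BACKUP_HEALTH_CHECK_AFTER` | `false` | Check after the backup | |
| | `OPENCLAW_BACKUP_MAX_RETRIES` | `3` | Maximum number of retries | |
|
| ### Restore Configuration |
|
| | Variable | Default | Description | |
| |------|--------|------| |
| | `OPENCLAW_RESTORE_ARCHIVE` | (empty) | Archive file name to restore (no path), for example `openclaw-backup-20260414-090000-full-split.tar.gz`. When empty, the latest backup is restored. | |
|
| **Note**: during a restore the system looks up `{OPENCLAW_RESTORE_ARCHIVE}.meta.json` automatically to get the full backup information (the `volumes` list, the `parent` link, `chain_id`, and so on); the metadata file does not need to be specified separately. |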
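
The `"archive-name:/path"` list format shown above can be parsed as sketched below. `parse_extra_entries` is a hypothetical name, and rejecting malformed items with an error (rather than skipping them) is a choice made for this illustration.

```python
def parse_extra_entries(value):
    """Parse "name:/path,name:/path" into a {archive_name: path} dict."""
    entries = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty values and trailing commas
        name, sep, path = item.partition(":")
        if not sep or not name or not path:
            raise ValueError(f"expected 'archive-name:/path', got {item!r}")
        entries[name] = path
    return entries
```

Applied to the `OPENCLAW_BACKUP_EXTRA_DIRS` example, this gives `root-config` mapped to `/root/.config` and `root-ssh` mapped to `/root/.ssh`, matching the top-level directory names seen in the archive layout.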
| |
| --- |
| |
| ## Usage Examples |
|
| ### First Deployment |
|
| ```bash |
| # Configure in bootstrap-hf.sh |
| export OPENCLAW_BACKUP_DATASET_REPO="GGSheng/page-backup" |
| export OPENCLAW_INCREMENTAL_BACKUP="true" |
| export OPENCLAW_BACKUP_KEEP_COUNT="24" |
| ``` |
|
| ### Trigger a Backup Manually |
|
| ```bash |
| python3 /opt/openclaw-hf/openclaw_hf/backup.py backup |
| ``` |
|
| ### Trigger a Restore Manually |
|
| ```bash |
| python3 /opt/openclaw-hf/openclaw_hf/backup.py restore |
| ``` |
|
| **Note**: the `backup` and `restore` operations of `backup.py` are configured entirely through environment variables; command-line arguments for choosing a backup file or other options are not supported. Restore settings use the `OPENCLAW_RESTORE_*` variables (see the environment variable reference). |
|
| ### Check Backup Status |
|
| ```bash |
| # Show the latest backup index (stored in the Dataset root, next to backups/) |
| hf download GGSheng/page-backup latest-backup.json --repo-type dataset --local-dir . |
| cat latest-backup.json |
| ``` |
| |
| --- |
| |
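| ## Troubleshooting |
|
| ### Backup Fails |
|
| 1. Check the network connection and the permissions of `HF_TOKEN` |
| 2. Read the log: `/var/log/openclaw/backup.log` |
| 3. Confirm that the Dataset repository exists and is writable |
|
| ### Restore Fails |
|
| 1. Confirm that the backup metadata is complete |
| 2. Check that the restore target directory has enough free space |
| 3. If the metadata version is incompatible, the system attempts the restore in compatibility mode |
|
| ### Large Backups Time Out |
|
| - Enable split backups: `OPENCLAW_BACKUP_SPLIT_SIZE="500M"` |
| - Raise the Hugging Face download timeout: `HF_HUB_DOWNLOAD_TIMEOUT=300` |
|
| --- |
|
| ## Metadata Version Compatibility |
|
| | Version | Description | |
| |------|------| |
| | 1.0 | Initial version | |
| | 2.0 | Adds incremental backup chains | |
| | 2.1 | Stronger compatibility checks | |
|
| The system is backward compatible: working with metadata from an older version produces a warning but does not affect basic functionality. |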
| |