Farhan Beg committed on
Commit b047e12 · 1 Parent(s): ff3dadd

feat(sync): warn when keys live in ephemeral .env on a Space

The Hermes dashboard's Env tab writes to $HERMES_HOME/.env which is
intentionally excluded from backup. On HF Spaces the filesystem is
ephemeral, so anything saved there silently disappears on the next
restart and the gateway 500s with 'Provider X is set in config.yaml
but no API key was found'.

* hermes-sync.py: env_warning_payload() detects populated .env on a
  Space when SYNC_INCLUDE_ENV is off, surfaces it via write_status() so
  every status snapshot carries a structured warning, and prints once
  at loop start so it's grep-able in Space logs.
* health-server.js: Backup tile on the HuggingMes status page renders
  the warning as a yellow callout when present, and flips the tile
  tone to warn so it's visible at a glance.
* README + .env.example: explicit guidance to put provider keys in HF
  Space Secrets, with SYNC_INCLUDE_ENV=1 documented as the opt-in
  weaker-security alternative. Also fixed stale '10 minute' backup
  copy left over from the fixed-interval era.

False-positive guard: returns None when SPACE_ID/SPACE_HOST is unset,
so local dev installs and Docker hosts that legitimately use .env
stay quiet.
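
As a rough illustration of the rule described above, the sketch below restates the three conditions (running on a Space, a populated .env, SYNC_INCLUDE_ENV off) as pure functions. The names `count_env_entries` and `should_warn` are hypothetical and do not appear in the commit; the real logic lives in env_warning_payload() in hermes-sync.py and reads the file and environment directly.

```python
from typing import Optional


def count_env_entries(text: str) -> int:
    """Count non-empty, non-comment lines containing '=' (a proxy for user-set keys)."""
    count = 0
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if "=" in line:
            count += 1
    return count


def should_warn(env_text: Optional[str], on_space: bool, include_env: bool) -> bool:
    """Hypothetical pure-function form of the warning conditions.

    env_text is the content of .env, or None if the file does not exist.
    """
    if not on_space or include_env:
        # Local dev / Docker host, or .env is already being backed up.
        return False
    if env_text is None:
        return False
    return count_env_entries(env_text) > 0


sample = "# provider keys\n\nOPENROUTER_API_KEY=sk-xxx\nANTHROPIC_API_KEY=sk-yyy\n"
print(should_warn(sample, on_space=True, include_env=False))   # True
print(should_warn(sample, on_space=False, include_env=False))  # False
```

Keeping the check free of file and environment access, as in this sketch, makes the false-positive guard easy to exercise in a unit test.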

Files changed (4)
  1. .env.example +5 -0
  2. README.md +23 -1
  3. health-server.js +9 -3
  4. hermes-sync.py +60 -1
.env.example CHANGED
@@ -20,6 +20,11 @@ LLM_MODEL=openrouter/anthropic/claude-sonnet-4
 # SYNC_POLL_INTERVAL=2
 # SYNC_DEBOUNCE_SECONDS=3
 # SYNC_INTERVAL=60
+# Back up $HERMES_HOME/.env (provider keys typed via the dashboard's Env tab)
+# into the private dataset. OFF by default — see README "Configure LLM
+# Provider via Config Editor" for the security tradeoff. Prefer adding keys
+# as HF Space Secrets, which never touch disk.
+# SYNC_INCLUDE_ENV=1
 
 # ── Optional: Cloudflare proxy + keep-alive ───────────────────────────
 # CLOUDFLARE_WORKERS_TOKEN=cf_xxx
README.md CHANGED
@@ -82,7 +82,7 @@ Open Hermes Dashboard from here (`https://f4b404-hermes.hf.space/hm/app`)
 ## Your Data Is Safe
 
 When `HF_TOKEN` is set:
-- All your chats, files, settings, and agent memory are backed up to a **private** Hugging Face Dataset every 10 minutes
+- All your chats, files, settings, and agent memory are backed up to a **private** Hugging Face Dataset within seconds of each change (change-driven, capped at 60 s)
 - If the Space restarts, everything comes back exactly as you left it
 
 ---
@@ -122,6 +122,28 @@ Use the same (`https://your-name.hf.space`) url in android and then you can inst
 
 ## Configure LLM Provider via Config Editor
 
+> ### ⚠️ Provider keys go in HF Space Secrets, not the dashboard's Env tab
+>
+> The Hermes dashboard exposes an "Env" editor that writes to `/opt/data/.env`
+> inside the container. **That file is *not* backed up to your HF Dataset.**
+> On every Space sleep / rebuild the container's filesystem is wiped, the
+> `.env` is gone, and your `OLLAMA_API_KEY` / `OPENROUTER_API_KEY` /
+> `ANTHROPIC_API_KEY` / etc. disappear with it. The Space then 500s on the
+> first chat with `Provider 'X' is set in config.yaml but no API key was
+> found`.
+>
+> **Always add provider keys as HF Space Secrets** (Settings → Variables and
+> secrets → New secret). HF injects them as env vars at boot, never writes
+> them to disk on the Space, and they survive every restart.
+>
+> Use the dashboard's Env tab only for non-secret tweaks. The status page's
+> Backup tile will show a yellow warning whenever it detects keys sitting in
+> the ephemeral `.env` so you don't have to remember this on your own.
+>
+> If you accept the security tradeoff and want `.env` backed up anyway, set
+> `SYNC_INCLUDE_ENV=1` as a Space Variable. The dataset is private, but a
+> leak of that dataset URL is then a leak of every key in `.env`.
+
 If you prefer not to add API keys as HF Secrets, you can configure providers directly in Hermes after the Space starts:
 
 1. Open `/hm/app/config` in your Space
health-server.js CHANGED
@@ -554,6 +554,11 @@ function renderStatusPage(data) {
   const backupDetail = data.backup?.message
     ? escapeHtml(data.backup.message)
     : "No status yet";
+  // Extra one-line warning row for known-loud failure modes (currently:
+  // ephemeral .env on a Space). hermes-sync.py emits this via warning.message.
+  const backupWarning = data.backup?.warning?.message
+    ? `<div class="tile-warning">${escapeHtml(data.backup.warning.message)}</div>`
+    : "";
   const keepAliveDetail = keepaliveConfigured
     ? `Pinging <code>${escapeHtml(data.keepalive.targetUrl || "/health")}</code>`
     : keepaliveStatus === "error" && data.keepalive?.message
@@ -596,9 +601,9 @@ function renderStatusPage(data) {
     }),
     renderTile({
       title: "Backup",
-      value: toneBadge(syncStatus.toUpperCase(), syncTone),
-      detail: backupDetail,
-      tone: syncTone,
+      value: toneBadge(syncStatus.toUpperCase(), data.backup?.warning ? "warn" : syncTone),
+      detail: backupDetail + backupWarning,
+      tone: data.backup?.warning ? "warn" : syncTone,
       meta: data.backup?.timestamp
         ? `<span class="local-time" data-iso="${data.backup.timestamp}"></span>`
         : "",
@@ -646,6 +651,7 @@ function renderStatusPage(data) {
   .tile-value { font-size:1.12rem; font-weight:850; overflow-wrap:anywhere; }
   .tile-detail { color:var(--soft); line-height:1.45; font-size:.83rem; }
   .tile-meta { color:var(--muted); line-height:1.4; font-size:.75rem; margin-top:auto; overflow-wrap:anywhere; }
+  .tile-warning { color:#fde68a; background:rgba(245,158,11,.08); border:1px solid rgba(245,158,11,.32); border-radius:6px; padding:6px 8px; margin-top:6px; font-size:.78rem; line-height:1.4; }
   code { background:#232234; border:1px solid #34324c; border-radius:6px; padding:2px 6px; color:var(--text); font-size:.9em; }
   .badge { display:inline-flex; align-items:center; border:1px solid var(--line); border-radius:999px; padding:5px 10px; font-size:.72rem; font-weight:850; line-height:1; text-transform:uppercase; }
   .badge.ok { color:var(--good); border-color:rgba(34,197,94,.34); background:rgba(34,197,94,.11); }
hermes-sync.py CHANGED
@@ -71,10 +71,64 @@ HF_API = HfApi(token=HF_TOKEN) if HF_TOKEN else None
 STOP_EVENT = threading.Event()
 _REPO_ID_CACHE: str | None = None
 
+# `.env` warning: on HF Spaces, the dashboard's "Env" tab writes to
+# $HERMES_HOME/.env which is *not* backed up by default (see EXCLUDED_TOP_LEVEL
+# above). That means provider keys typed into the dashboard silently disappear
+# on every restart. We can't safely fix that by default — uploading plaintext
+# secrets to a dataset is the wrong tradeoff — but we can make the failure
+# loud. The status surface on the HuggingMes status page reads the JSON below,
+# so an `env_warning` field renders as a banner without any extra plumbing.
+ENV_FILE = HERMES_HOME / ".env"
+ON_HF_SPACE = bool(os.environ.get("SPACE_ID") or os.environ.get("SPACE_HOST"))
+
+
+def env_warning_payload() -> dict | None:
+    """Detect plaintext-secret-loss risk and return a warning blob, or None.
+
+    Fires when:
+    * we're on an HF Space (ephemeral filesystem), AND
+    * `.env` exists with non-trivial content, AND
+    * SYNC_INCLUDE_ENV is off (so .env is NOT being backed up).
+
+    The warning is informational. We never refuse to start sync, and we never
+    auto-flip SYNC_INCLUDE_ENV — the user must opt in to backing up plaintext.
+    """
+    if not ON_HF_SPACE or INCLUDE_ENV:
+        return None
+    try:
+        if not ENV_FILE.is_file():
+            return None
+        # Count non-empty, non-comment lines as a proxy for "user-set keys".
+        keys = 0
+        for raw in ENV_FILE.read_text(encoding="utf-8", errors="replace").splitlines():
+            line = raw.strip()
+            if not line or line.startswith("#"):
+                continue
+            if "=" in line:
+                keys += 1
+        if keys <= 0:
+            return None
+        return {
+            "kind": "ephemeral_env",
+            "keys": keys,
+            "message": (
+                f"{keys} entr{'y' if keys == 1 else 'ies'} in $HERMES_HOME/.env "
+                "will be wiped on the next Space restart. Move secrets to "
+                "Space Secrets (Settings -> Variables and secrets), or set "
+                "SYNC_INCLUDE_ENV=1 to back up .env to the private dataset "
+                "(plaintext; weaker security)."
+            ),
+        }
+    except OSError:
+        return None
+
 
 def write_status(status: str, message: str, fingerprint: str | None = None, marker: tuple[int, int, int] | None = None) -> None:
     timestamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
-    payload = {"status": status, "message": message, "timestamp": timestamp}
+    payload: dict = {"status": status, "message": message, "timestamp": timestamp}
+    warning = env_warning_payload()
+    if warning is not None:
+        payload["warning"] = warning
 
     tmp_path = STATUS_FILE.with_suffix(".tmp")
     try:
@@ -311,6 +365,11 @@ def loop() -> int:
         print(f"Hermes sync error: {exc}")
         return 1
 
+    warning = env_warning_payload()
+    if warning is not None:
+        # Loud, single-line, easy to grep in HF Space logs.
+        print(f"Hermes sync WARNING: {warning['message']}")
+
     # Seed from any prior run so we don't re-upload an identical tree.
     last_fingerprint: str | None = None
     last_marker: tuple[int, int, int] | None = None
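
To make the contract between the two files concrete, here is a minimal sketch of the status snapshot shape and the tile-tone rule, assuming the payload fields shown in the diff above. `build_status` and `tile_tone` are hypothetical helpers written for illustration, and the "synced" status and "ok" tone values are placeholders rather than values confirmed by the commit.

```python
import json
import time
from typing import Optional


def build_status(status: str, message: str, warning: Optional[dict] = None) -> dict:
    """Assemble a status snapshot; the 'warning' key is only present when set."""
    timestamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
    payload = {"status": status, "message": message, "timestamp": timestamp}
    if warning is not None:
        payload["warning"] = warning
    return payload


def tile_tone(backup: dict, sync_tone: str) -> str:
    """Mirror of the tile rule: any warning forces the 'warn' tone."""
    return "warn" if backup.get("warning") is not None else sync_tone


# Placeholder warning blob using the same keys as env_warning_payload().
warning = {
    "kind": "ephemeral_env",
    "keys": 2,
    "message": "2 entries in $HERMES_HOME/.env will be wiped on the next Space restart.",
}
snapshot = build_status("synced", "Backup complete", warning)
print(json.dumps(snapshot, indent=2))
print(tile_tone(snapshot, "ok"))  # warn
```

Because the warning travels inside the existing status JSON, the status page needs no separate endpoint; a consumer only has to check for the optional `warning` key.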