start.sh: LOW_MEM=1 short-circuits to status server only (kill all bg daemons)

The earlier patch (a5c37dd) gated only the 5 boot-time harvest launchers
behind LOW_MEM. Watchdog history showed Spaces stayed green for ~25 min
after that fix (#12-#16 all 6/6) but went HTTP-hung again at sweep #17.
Investigation found 15+ MORE nohup'd daemons below the gated section that
collectively still walk the Space into the 16 GB CPU-Basic cap within an
hour even with the harvest launchers off:
scrape-daemon, agentic-crawler, github-agentic-crawler, self-heal-watchdog,
gh-actions-ticker, llm-burst-generator, bulk-ingest-parallel,
parquet-direct-ingest, skill-synthesis-daemon, bulk-mirror-worker (×N),
streaming-mirror-worker (×N), continuous-discoverer, plus the hermes-cron.sh
while-loop itself, which spawns regression-test, abstract-cot-compressor, etc.
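
One way to confirm which daemons are eating the memory budget is to sum resident memory per command. The sketch below is not part of this commit; `rss_by_command` is a hypothetical helper name, and it assumes a `ps` that accepts `-eo rss=,comm=` (procps on Linux does).

```shell
#!/usr/bin/env bash
# Sum resident memory (RSS, reported by ps in KiB) per command name and
# print the totals in MiB, largest first.
# Reads "rss_kib command" lines on stdin so it can also be fed canned data.
rss_by_command() {
  awk '{ rss[$2] += $1 }
       END { for (c in rss) printf "%d %s\n", rss[c] / 1024, c }' \
    | sort -rn
}

# Live usage: the 15 heaviest command names on this machine.
ps -eo rss=,comm= | rss_by_command | head -n 15
```

Grouping by `comm` is coarse: every Python daemon shows up as `python3`. Swapping `comm=` for `args=` would expose script names, at the cost of needing a different awk key.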
This patch adds an early-return right after the .env write: when LOW_MEM=1
(default on CPU-Basic), exec the status server immediately and skip every
background process below. The Space's only responsibility on free tier is
to serve /cursor/* advance to harvest workers; everything that USED to be
launched here is now scheduled on GCP via hermes-jobs.json (171 jobs as of
this commit).
Re-enable in-Space mode by setting LOW_MEM=0 once the Space is on a paid
tier (cpu-upgrade ≥ 32 GB) or migrated to a larger anchor.
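
The gate in the diff compares `$LOW_MEM` to `"1"` directly, so the "default on CPU-Basic" behaviour presumably comes from an assignment elsewhere in start.sh. A minimal sketch of a gate that carries its own default follows; `low_mem_enabled` is a hypothetical helper, and treating an unset variable as `1` is an assumption drawn from the message above, not from the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the LOW_MEM fast-path should be taken.
# Assumption: an unset or empty LOW_MEM counts as "1" (the free-tier default);
# any other value, including an explicit LOW_MEM=0, disables the fast-path.
low_mem_enabled() {
  [ "${LOW_MEM:-1}" = "1" ]
}

if low_mem_enabled; then
  echo "LOW_MEM fast-path: status server only"
else
  echo "LOW_MEM disabled: launching in-Space daemons"
fi
```

Because `${LOW_MEM:-1}` never expands an unset variable bare, this form also stays safe if the script ever runs under `set -u`.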

@@ -151,6 +151,34 @@ chmod 600 ~/.hermes/.env
 echo "[$(date +%H:%M:%S)] .env written ($(wc -l < ~/.hermes/.env) keys, perms 600)"
 # Trace OFF for the rest of boot - we already have line numbers above and won't need them post-secrets.
 
+# ── LOW_MEM short-circuit - skip ALL background daemons, exec status server ──
+# CPU-Basic Space cap is 16 GB. Even after gating the 5 boot-time harvest
+# launchers, the Space kept hitting 16 GB cap and going hung at HTTP layer
+# every ~30-40 min. Investigation found 15+ MORE nohup'd background daemons
+# below this point (scrape, agentic-crawler, github-crawler, self-heal, cron
+# loop, bulk-mirror workers, streaming-mirror workers, parquet-ingest, etc.)
+# that collectively grow into the cap within an hour.
+#
+# In LOW_MEM=1 mode the Space's only job is the FastAPI status server on
+# :7860 that serves harvest cursor advance to remote workers. Everything
+# else (harvest, mirroring, agent pipeline, training pushes, dataset enrich)
+# now runs on the GCP daemon fleet - see hermes-jobs.json (171 jobs scheduled
+# via hermes-scheduler-daemon as of 2026-05-02).
+#
+# Set LOW_MEM=0 to re-enable in-Space launchers when on a paid tier (≥32GB).
+if [[ "$LOW_MEM" == "1" ]]; then
+  echo "[$(date +%H:%M:%S)] LOW_MEM=1 - skipping all bg daemons + cron, going straight to :7860 status server" | tee -a "$LOG_DIR/boot.log"
+  set +x  # silence trace
+  # Verify deps before exec - print what's missing rather than silent crash
+  if python3 -c "import fastapi, uvicorn" 2>/dev/null; then
+    echo "[$(date +%H:%M:%S)] starting uvicorn :7860 (LOW_MEM fast-path)" | tee -a "$LOG_DIR/boot.log"
+    exec python3 ~/.surrogate/bin/hermes-status-server.py
+  else
+    echo "fastapi/uvicorn not importable - falling back to plain http.server"
+    exec python3 -m http.server 7860 --bind 0.0.0.0
+  fi
+fi
+
 # ── 3. Git config + clone axentx repos for auto-orchestrate auto-commit ────
 # Disable interactive prompts globally so failed-auth git ops fail fast.
 export GIT_TERMINAL_PROMPT=0
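
A note on the fallback branch: `python3 -m http.server` serves the current working directory unless told otherwise, so whatever directory start.sh happens to be in at that point becomes browsable on :7860. A hedged sketch of a narrower fallback is below. `FALLBACK_ROOT` and `serve_fallback` are hypothetical names, not part of this commit, and the sketch relies on the `--directory` flag that `http.server` has accepted since Python 3.7.

```shell
#!/usr/bin/env bash
# Prepare a directory that contains nothing but a placeholder page.
# FALLBACK_ROOT is a hypothetical variable; start.sh does not define it.
FALLBACK_ROOT="${FALLBACK_ROOT:-$(mktemp -d)}"
mkdir -p "$FALLBACK_ROOT"
printf 'status server unavailable: fastapi/uvicorn not importable\n' \
  > "$FALLBACK_ROOT/index.html"

# Replace the shell with a plain HTTP server rooted at the given directory.
# Defined here but not called, so this file can be sourced without blocking.
serve_fallback() {
  exec python3 -m http.server 7860 --bind 0.0.0.0 --directory "$1"
}

echo "fallback root prepared at $FALLBACK_ROOT"
```

In start.sh the `else` branch would then call `serve_fallback "$FALLBACK_ROOT"` in place of the bare `exec`. The port still answers health checks, but only the placeholder page is reachable.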