Ashira Pitchayapakayakul committed on
Commit
80f9271
·
1 Parent(s): 2f6830b

feat: CONTINUOUS auto-orchestrate (4 parallel workers, no 20-min gaps)


USER (translated from Thai): 'This thing, I told you dev should run all the time. Why does it have to wait for 1 AM? WTF, did you forget the context or what?'

CLARIFICATION: '1AM' was just my measurement-window timestamp, not a wait gate.
Dev was ALWAYS running since boot, but on M%20 cron = once every 20 min.
That's 72 fires/day, far fewer than 'continuous'.

USER REQUEST: dev should run NON-STOP. Fixing now.

NEW: bin/auto-orchestrate-continuous.sh
- Spawns N=4 parallel workers (configurable via ORCHESTRATE_WORKERS env)
- Each worker loops FOREVER:
1. Pick TODO/FIXME via existing auto-orchestrate-loop.sh
2. Run 6-stage pipeline (SA → architect → qa → dev → verify → review)
3. APPROVE → commit + push to GitHub
4. Sleep 10s
5. Next iteration
- Resource guard: pause only if load >80 (was 8 — too aggressive)
- Workers stagger 3s startup → don't all hit same TODO
- Existing LOCK_DIR per-task-hash dedup prevents running same TODO 2×
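The per-task-hash dedup the last bullet relies on can be sketched roughly as below. This is a hypothetical illustration, not the actual `auto-orchestrate-loop.sh` code; `LOCK_DIR`, the function names, and the hashing choice are all assumptions:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of LOCK_DIR per-task-hash dedup. Names are illustrative.
LOCK_DIR="${LOCK_DIR:-$(mktemp -d)}"
mkdir -p "$LOCK_DIR"

acquire_task_lock() {
  local task="$1" hash
  hash=$(printf '%s' "$task" | sha256sum | awk '{print $1}')
  # mkdir is atomic: exactly one worker creates the directory and wins the
  # task; a worker that loses the race gets a nonzero status and moves on.
  mkdir "$LOCK_DIR/$hash" 2>/dev/null
}

release_task_lock() {
  local task="$1" hash
  hash=$(printf '%s' "$task" | sha256sum | awk '{print $1}')
  rmdir "$LOCK_DIR/$hash" 2>/dev/null
}
```

With this shape, two workers picking the same TODO line hash to the same lock directory, so only the first `acquire_task_lock` succeeds.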

THROUGHPUT:
- Before (cron M%20): 72 cycles/day, 1 at a time = 72 orchestrate runs/day
- After (continuous): 4 workers × (3600s / 300s avg cycle) × 24h ≈ 1150 runs/day
- ≈ 16× more dev work, axentx commits proportionally up
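The before/after arithmetic can be recomputed directly. Note that the 300s average cycle time is this commit's own assumption, not a measured value:

```shell
#!/usr/bin/env bash
# Recompute the throughput claim. AVG_CYCLE_S=300 is an assumed average.
WORKERS=4
CRON_PERIOD_MIN=20
AVG_CYCLE_S=300

before=$(( 24 * 60 / CRON_PERIOD_MIN ))     # old cron: one fire per 20-min slot
after=$(( WORKERS * 86400 / AVG_CYCLE_S ))  # new: 4 workers looping all day
echo "before=$before/day after=$after/day speedup=$(( after / before ))x"
# → before=72/day after=1152/day speedup=16x
```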

start.sh changes:
- Boot: spawn auto-orchestrate-continuous immediately (alongside other daemons)
- Cron M%20 entry kept as failsafe (skips if continuous already running, via pgrep guard)
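The failsafe guard is precedence-sensitive in shell: a bare `A && B || C` runs C whenever A fails, i.e. on every minute that is not a 20-minute mark. A minimal sketch of the intended logic, with the check pulled into an illustrative function (name and parameters are assumptions, not the literal start.sh line):

```shell
#!/usr/bin/env bash
# Illustrative guard: fire the legacy single-shot loop only when it is the
# 20-minute mark AND the continuous daemon is down. Chain with && / ! rather
# than `A && B || C`, which would fire off-schedule.
failsafe_due() {
  local minute="$1" continuous_running="$2"   # continuous_running: 1 if daemon alive
  # 10# forces base-10 so minutes like "08" are not parsed as invalid octal
  [[ $((10#$minute % 20)) -eq 0 && $continuous_running -eq 0 ]]
}
```

In start.sh the second argument would come from `pgrep -f "auto-orchestrate-continuous"`.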

Status server: added 'dataset-enrich' and 'auto-orchestrate-continuous' to /logs allowlist

Plus the boot-time enrich kickoff from the previous push will fire ON THIS REBUILD
— the 96-dataset pull starts immediately, no 60-min cron wait.

bin/auto-orchestrate-continuous.sh ADDED
@@ -0,0 +1,53 @@
+ #!/usr/bin/env bash
+ # Continuous auto-orchestrate worker — replaces cron-fire-every-20min model.
+ #
+ # Spawns N parallel workers. Each loops forever:
+ #   pick TODO from random axentx repo → orchestrate pipeline → commit+push if APPROVE
+ #   → cool-down 10s → next iteration
+ #
+ # Avoids 'all hit same TODO' race via existing LOCK_DIR per-task hash.
+ # Resource guard: only pause if load > 80 (much higher tolerance vs old M%20 fire).
+ set -uo pipefail
+ set -a; source "$HOME/.hermes/.env" 2>/dev/null; set +a
+
+ LOG="$HOME/.surrogate/logs/auto-orchestrate-continuous.log"
+ mkdir -p "$(dirname "$LOG")"
+
+ PARALLEL_WORKERS="${ORCHESTRATE_WORKERS:-4}"
+ WORKER_COOLDOWN="${WORKER_COOLDOWN:-10}"  # seconds between iterations per worker
+
+ echo "[$(date +%H:%M:%S)] continuous orchestrate start (workers=$PARALLEL_WORKERS, cooldown=${WORKER_COOLDOWN}s)" | tee -a "$LOG"
+
+ worker_loop() {
+   local worker_id="$1"
+   local iter=0
+   while true; do
+     iter=$((iter + 1))
+     echo "[$(date +%H:%M:%S)] worker-$worker_id iter=$iter starting" >> "$LOG"
+
+     # Resource guard — much more lenient than old M%20 cron
+     local load
+     load=$(uptime | sed -E 's/.*load average[s]?:[[:space:]]*//' | awk -F',' '{print int($1)}')
+     load=${load:-0}
+     if [[ $load -gt 80 ]]; then
+       echo "[$(date +%H:%M:%S)] worker-$worker_id pause: load=$load > 80" >> "$LOG"
+       sleep 60
+       continue
+     fi
+
+     # Run single orchestrate cycle (existing script does TODO pick + run + push)
+     bash "$HOME/.surrogate/bin/auto-orchestrate-loop.sh" >> "$LOG" 2>&1
+     local rc=$?
+     echo "[$(date +%H:%M:%S)] worker-$worker_id iter=$iter done rc=$rc" >> "$LOG"
+
+     # Brief cooldown — workers stagger naturally
+     sleep "$WORKER_COOLDOWN"
+   done
+ }
+
+ # Spawn N workers in parallel
+ for i in $(seq 1 "$PARALLEL_WORKERS"); do
+   worker_loop "$i" &
+   sleep 3  # stagger startup
+ done
+ wait
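The resource-guard parsing above can be exercised in isolation. This is a minimal sketch that extracts the same `sed | awk` pipeline into a function reading stdin; the sample `uptime` lines in the test are illustrative:

```shell
#!/usr/bin/env bash
# Same parsing as the script's load guard, isolated for testing: read an
# uptime-style line on stdin, print the 1-minute load average as an integer.
parse_load() {
  sed -E 's/.*load average[s]?:[[:space:]]*//' | awk -F',' '{print int($1)}'
}
```

On a Linux-style line such as `... load average: 81.42, 12.01, 3.99` this yields `81`; the `[s]?` in the regex also tolerates the BSD/macOS `load averages:` spelling.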
bin/hermes-status-server.py CHANGED
@@ -166,7 +166,7 @@ def log_tail(name: str, lines: int = 100) -> PlainTextResponse:
      "auto-orchestrate-loop", "training-push", "ollama", "discord-bot",
      "hermes-discord-bot", "surrogate-research-loop", "surrogate-research-apply",
      "surrogate-dev-loop", "domain-scrape-loop", "github-domain-scrape",
-     "qwen-coder", "git-clone", "git-pull", "redis", "hf-dataset-discoverer", "dedup-bootstrap", "github-agentic-crawler", "ollama-pull-granite", "synthetic-data", "self-ingest", "scrape-sre-postmortems", "refresh-cve-feed",
+     "qwen-coder", "git-clone", "git-pull", "redis", "auto-orchestrate-continuous", "dataset-enrich", "hf-dataset-discoverer", "dedup-bootstrap", "github-agentic-crawler", "ollama-pull-granite", "synthetic-data", "self-ingest", "scrape-sre-postmortems", "refresh-cve-feed",
      "ollama-pull-coder", "ollama-pull-devstral", "ollama-pull-fallback",
      "ollama-pull-yicoder", "ollama-pull-embed", "ollama-pull-light",
  }
start.sh CHANGED
@@ -234,12 +234,16 @@ nohup bash ~/.surrogate/bin/github-agentic-crawler.sh > "$LOG_DIR/github-agentic
  echo "[$(date +%H:%M:%S)] github-agentic-crawler started (token pool maximized)" >> "$LOG_DIR/boot.log"

  # ── 7b3. HF Dataset Discoverer (continuous mega-mix hunt) ───────────────────
- # Searches HF Hub across 70+ topic queries every 30 min. Filters license + scores
- # quality. Auto-adds high-confidence permissive picks to dynamic-datasets.json.
- # dataset-enrich reads dynamic list on top of static 89 → infinitely growing corpus.
  nohup bash ~/.surrogate/bin/hf-dataset-discoverer.sh > "$LOG_DIR/hf-dataset-discoverer.log" 2>&1 &
  echo "[$(date +%H:%M:%S)] hf-dataset-discoverer started (continuous mega-mix hunt)" >> "$LOG_DIR/boot.log"

+ # ── 7e. CONTINUOUS auto-orchestrate (4 parallel workers, no cron gap) ───────
+ # Replaces M%20 cron — was 'fire once every 20 min'. Now: each of 4 workers
+ # loops forever, dev work happens nonstop. Picks a different TODO/FIXME each iter,
+ # uses existing LOCK_DIR for dedup. Result: ~10-20× more orchestrate cycles/day.
+ nohup bash ~/.surrogate/bin/auto-orchestrate-continuous.sh > "$LOG_DIR/auto-orchestrate-continuous.log" 2>&1 &
+ echo "[$(date +%H:%M:%S)] auto-orchestrate-continuous started (4 parallel workers, never sleeps)" >> "$LOG_DIR/boot.log"
+
  # ── 7c. Skill-synthesis daemon (extract patterns from cloned repos → skills) ─
  nohup bash ~/.surrogate/bin/skill-synthesis-daemon.sh > "$LOG_DIR/skill-synthesis.log" 2>&1 &
  echo "[$(date +%H:%M:%S)] skill-synthesis daemon started" >> "$LOG_DIR/boot.log"
@@ -258,8 +262,9 @@ while true; do
  [[ $((M % 5)) -eq 0 ]] && bash ~/.surrogate/bin/work-queue-producer.sh >> "$LOG" 2>&1 &
  # Every 3 min: training-pair push to HF (drains ~/.surrogate/training-pairs.jsonl)
  [[ $((M % 3)) -eq 0 ]] && bash ~/.surrogate/bin/push-training-to-hf.sh >> "$LOG" 2>&1 &
- # Every 20 min: full orchestrate chain (architect → dev → qa → reviewer + git push)
- [[ $((M % 20)) -eq 0 ]] && bash ~/.surrogate/bin/auto-orchestrate-loop.sh >> "$LOG" 2>&1 &
+ # auto-orchestrate now runs CONTINUOUSLY (4 parallel workers) — see step 7e above.
+ # Cron entry retained as failsafe; fires only when the continuous daemon is down:
+ [[ $((M % 20)) -eq 0 ]] && ! pgrep -f "auto-orchestrate-continuous" >/dev/null && bash ~/.surrogate/bin/auto-orchestrate-loop.sh >> "$LOG" 2>&1 &
  # Every 30 min: research-apply (pop queue → orchestrate → ship feature)
  [[ $((M % 30)) -eq 15 ]] && bash ~/.surrogate/bin/surrogate-research-apply.sh >> "$LOG" 2>&1 &
  # Every 60 min: keyword tuner (adapts scrape queue based on yields)