Ashira Pitchayapakayakul committed on
Commit ff4b1b7 · 1 Parent(s): 52df1e3

fix: 4 dead agents — heredoc-pipe bug, missing CLI, token pool


Three runtime fixes catching 'silent agent death' patterns found in a log audit:

1. surrogate-self-ingest.sh — heredoc/pipe redirection conflict
The original 'sed | python3 - "$INDEX" <<PYEOF' was a black hole:
bash applies the heredoc redirection after the pipe is set up, so the
heredoc replaced python3's stdin. Because 'python3 -' takes its script
from stdin, python3 consumed the heredoc as the script and then hit EOF,
while sed's actual jsonl output had nowhere to go. The result was the
dead-quiet 'inserted=0 skipped_parse=0 skipped_empty=0' loop we saw 84 times.
Fix: write the inline python to a mktemp file, then pipe sed -> python3 file.
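
The failure mode reproduces in a few lines of bash. A minimal standalone demo (hypothetical, not from the repo) of the broken and fixed patterns:

```shell
# Broken pattern: the heredoc redirection overrides the pipe on python3's
# stdin, so 'python3 -' consumes the heredoc as its script and then sees
# EOF. The piped data never arrives.
broken=$(printf 'a\nb\nc\n' | python3 - <<'PYEOF'
import sys
print(sum(1 for _ in sys.stdin), end="")
PYEOF
)
echo "broken pattern saw $broken data lines"   # 0

# Fixed pattern: script lives in a temp file, so stdin stays bound to the pipe.
tmp=$(mktemp)
cat > "$tmp" <<'PYEOF'
import sys
print(sum(1 for _ in sys.stdin), end="")
PYEOF
fixed=$(printf 'a\nb\nc\n' | python3 "$tmp")
rm -f "$tmp"
echo "fixed pattern saw $fixed data lines"     # 3
```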

2. surrogate-research-loop.sh — $HOME/.local/bin/surrogate doesn't exist
The CLI binary was never installed on this Space, so every cycle was
failing silently in 0s ('research done in 0s' = no work). Replaced it
with direct OpenAI-compatible calls that fall back through Cerebras →
Groq → OpenRouter, using the keys already configured as Space secrets.
Also fixed the optional notify-discord.sh call to skip cleanly if
that script isn't installed either.
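
One detail of the direct-call replacement worth noting: the request body is built with python3's json.dumps rather than shell string interpolation, so quotes and newlines in the prompt cannot corrupt the JSON. A minimal standalone sketch of that pattern (model name and prompt here are placeholders):

```shell
# Build the JSON payload via json.dumps. The prompt is passed as argv,
# never spliced into the JSON text, so special characters are escaped safely.
PROMPT='He said "hi",
then left.'
body=$(python3 -c "import json,sys; print(json.dumps({'model':sys.argv[1],'messages':[{'role':'user','content':sys.argv[2]}],'max_tokens':4000}))" "demo-model" "$PROMPT")

# Round-trip check: parse the payload back and compare against the original.
roundtrip=$(printf '%s' "$body" | python3 -c "import json,sys; print(json.load(sys.stdin)['messages'][0]['content'] == sys.argv[1])" "$PROMPT")
echo "round-trip intact: $roundtrip"   # True
```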

3. GITHUB_TOKEN_POOL HF Space secret expanded
Was repeatedly hitting the 5000 search-req/hr ceiling. Pushed the 4 usable
PATs (arkashira + midnightcrisis-1 + midnightcrisis-2 + ashiradevops-alt;
ashirap excluded per user mandate), raising the ceiling to 20K req/hr.
Pushed via the HF API (recorded in operator history), not committed to the repo.
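
The arithmetic behind the pool: GitHub rate limits are per token, so rotating requests across 4 PATs multiplies the hourly budget by 4. A hypothetical round-robin picker (token values, pool format, and the next_token helper are illustrative assumptions, not from the repo):

```shell
# Hypothetical sketch: rotate through a comma-separated token pool so each
# PAT's per-hour budget adds up (4 tokens x 5000 req/hr = 20K req/hr).
GITHUB_TOKEN_POOL="tokA,tokB,tokC,tokD"   # placeholder values, not real PATs
N=$(echo "$GITHUB_TOKEN_POOL" | awk -F',' '{print NF}')
i=0
next_token() {
  TOKEN=$(echo "$GITHUB_TOKEN_POOL" | cut -d',' -f$(( i % N + 1 )))
  i=$(( i + 1 ))
}
# Each search request would then use: curl -H "Authorization: Bearer $TOKEN" ...
next_token; first=$TOKEN
next_token; second=$TOKEN
next_token; next_token; next_token; wrapped=$TOKEN   # 5th pick wraps around
echo "picked: $first $second ... $wrapped"
```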

skill-synthesis daemon left as-is for now — its 'total skills=0' is
because the source dirs (/tmp/agentic-discovery, workspace/projects)
are still being populated by the agentic-crawler. Will pick up once
those have content.

Tag: i have control... arios gundam. Going to take a look now.

bin/surrogate-research-loop.sh CHANGED
@@ -65,17 +65,51 @@ Then write a 1-line action TODO to ${RESEARCH_DIR}/queue.txt for each quick-win,
 
 Be selective — quality > quantity."
 
-# ── Run research via surrogate CLI ──────────────────────────────────────────
+# ── Run research via cloud LLM API directly ────────────────────────────────
+# Original $HOME/.local/bin/surrogate CLI was never installed on this Space,
+# so every cycle was failing silently in 0s. Replaced with direct calls to
+# whichever cloud LLM key is set (Cerebras → Groq → OpenRouter) with
+# automatic fallback if a backend is rate-limited or unavailable.
 START=$(date +%s)
-"$HOME/.local/bin/surrogate" -p --max-steps 8 "$PROMPT" 2>&1 | head -100 >> "$LOG"
+RESEARCH_RESPONSE=""
+for backend in cerebras groq openrouter; do
+  case "$backend" in
+    cerebras)   url="https://api.cerebras.ai/v1/chat/completions"; key="${CEREBRAS_API_KEY:-}"; model="qwen-3-coder-480b" ;;
+    groq)       url="https://api.groq.com/openai/v1/chat/completions"; key="${GROQ_API_KEY:-}"; model="qwen/qwen3-32b" ;;
+    openrouter) url="https://openrouter.ai/api/v1/chat/completions"; key="${OPENROUTER_API_KEY:-}"; model="qwen/qwen3-coder:free" ;;
+  esac
+  [[ -z "$key" ]] && { echo "  [$backend] no key — skip" >> "$LOG"; continue; }
+  RESEARCH_RESPONSE=$(curl -sS --max-time 90 "$url" \
+    -H "Authorization: Bearer $key" \
+    -H "Content-Type: application/json" \
+    -d "$(python3 -c "import json,sys; print(json.dumps({'model':sys.argv[1],'messages':[{'role':'user','content':sys.argv[2]}],'max_tokens':4000,'temperature':0.4}))" "$model" "$PROMPT")" 2>>"$LOG" \
+    | python3 -c "import json,sys; d=json.load(sys.stdin); print(d.get('choices',[{}])[0].get('message',{}).get('content',''))" 2>>"$LOG" || true)
+  if [[ -n "$RESEARCH_RESPONSE" ]]; then
+    echo "  [$backend] response: $(echo "$RESEARCH_RESPONSE" | wc -c) chars" >> "$LOG"
+    break
+  fi
+  echo "  [$backend] empty/error — try next" >> "$LOG"
+done
+
+if [[ -n "$RESEARCH_RESPONSE" ]]; then
+  {
+    echo "# Research cycle: $FOCUS ($CYCLE_TS)"
+    echo ""
+    echo "$RESEARCH_RESPONSE"
+  } > "$OUT"
+  # Extract any 'apply ...' lines into the queue
+  echo "$RESEARCH_RESPONSE" | grep -E "^apply " >> "$RESEARCH_DIR/queue.txt" 2>/dev/null || true
+fi
+
 DUR=$(( $(date +%s) - START ))
 echo "[$(date +%H:%M:%S)] research done in ${DUR}s" | tee -a "$LOG"
 
 # ── Discord notify if new findings worth attention ─────────────────────────
 if [[ -f "$OUT" ]] && [[ -s "$OUT" ]]; then
   QUICK_WINS=$(grep -c "^apply " "$RESEARCH_DIR/queue.txt" 2>/dev/null || echo 0)
-  "$HOME/.local/bin/notify-discord.sh" 2>/dev/null info "🔬 Research cycle done" \
-    "Focus: $FOCUS · ${DUR}s · $(wc -l < "$OUT") lines · $QUICK_WINS quick-wins queued" || true
+  [[ -x "$HOME/.local/bin/notify-discord.sh" ]] && \
+    "$HOME/.local/bin/notify-discord.sh" info "🔬 Research cycle done" \
+    "Focus: $FOCUS · ${DUR}s · $(wc -l < "$OUT") lines · $QUICK_WINS quick-wins queued" 2>/dev/null || true
 fi
 
 echo "[$(date +%H:%M:%S)] cycle done" | tee -a "$LOG"
bin/surrogate-self-ingest.sh CHANGED
@@ -42,7 +42,16 @@ TAKE=$NEW
 [[ $TAKE -gt $BATCH_SIZE ]] && TAKE=$BATCH_SIZE
 echo "[$(date +%H:%M:%S)] processing $TAKE / $NEW (batch_size=$BATCH_SIZE)" | tee -a "$LOG"
 
-sed -n "$((PREV + 1)),$((PREV + TAKE))p" "$SRC" | python3 - "$INDEX" >> "$LOG" 2>&1 <<'PYEOF'
+# Bug fix: previously `sed | python3 - "$INDEX" <<'PYEOF'` had a redirection
+# conflict — bash's heredoc binds to python3's stdin AFTER the pipe, so the
+# script body (PYEOF block) was being read as stdin (and consumed once for
+# 'python3 -'), leaving sed's actual jsonl output unreachable. Result was
+# `inserted=0 skipped_parse=0 skipped_empty=0` — a silent black hole.
+#
+# Fix: write the inline python to a temp file, then run with sed piped in.
+# Now stdin = the actual jsonl lines, exactly as intended.
+INGEST_PY=$(mktemp -t self-ingest-XXXXXX.py)
+cat > "$INGEST_PY" <<'PYEOF'
 import sys, json, sqlite3
 db = sys.argv[1]
 con = sqlite3.connect(db)
@@ -59,8 +68,6 @@ for line in sys.stdin:
     ts = d.get("ts", 0)
     prompt = (d.get("prompt") or "")[:4000]
     response = (d.get("response") or "")[:8000]
-    # Relaxed filter: index anything with both fields present (was 50-char min)
-    # Even short pairs are useful for tag-based retrieval
    if not prompt or not response:
        skipped_short += 1
        continue
@@ -76,6 +83,9 @@ con.commit()
 print(f"  inserted={n} skipped_parse={skipped_parse} skipped_empty={skipped_short}", flush=True)
 PYEOF
 
+sed -n "$((PREV + 1)),$((PREV + TAKE))p" "$SRC" | python3 "$INGEST_PY" "$INDEX" >> "$LOG" 2>&1
+rm -f "$INGEST_PY"
+
 # Advance offset by what we actually processed
 NEW_OFFSET=$(( PREV + TAKE ))
 echo "$NEW_OFFSET" > "$OFFSET_FILE"