Ashira Pitchayapakayakul committed on
Commit e36381e · 1 Parent(s): eaaf1cf

feat: migrate $HOME/.claude/* to $HOME/.surrogate/* (clean separation from Claude Code)


Surrogate now has its own home — ~/.surrogate/ — fully separate from ~/.claude/.

- Dockerfile: COPY bin/ → ~/.surrogate/bin/, PATH includes ~/.surrogate/bin first
- Dockerfile: backward-compat symlinks (~/.claude/bin → ~/.surrogate/bin) for legacy refs
- start.sh: persist subdirs (state, logs, memory, skills, sessions) + workspace + ollama + training-pairs.jsonl into /data
- LOG_DIR: ~/.surrogate/logs (was ~/.claude/logs)
- 40+ refs migrated across 25+ scripts via sed (see the sketch below)
- 13 missing scripts (bridges, ask-sqlite, scrape-ledger-init, surrogate-bridge, surrogate-consolidate, lib/) copied from Mac
- All scripts: bash -n syntax OK
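
The two mechanical pieces above (the sed migration and the /data persistence handled by start.sh) can be sketched as follows. This is illustrative only: the exact flags, the /data/surrogate layout, and the directory list are assumptions, not the committed commands.

# Hypothetical sketch of the path migration (BSD sed shown; use plain `sed -i` on GNU sed)
grep -rl '/\.claude/' bin/ | while read -r f; do
  sed -i '' 's|/\.claude/|/.surrogate/|g' "$f"
done
for f in bin/*.sh; do bash -n "$f"; done   # syntax check, as noted above

# Hypothetical start.sh-style persistence: keep mutable state on the /data volume
for d in state logs memory skills sessions workspace; do
  mkdir -p "/data/surrogate/$d"
  ln -sfn "/data/surrogate/$d" "$HOME/.surrogate/$d"
done
touch /data/surrogate/training-pairs.jsonl
ln -sf /data/surrogate/training-pairs.jsonl "$HOME/.surrogate/training-pairs.jsonl"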

This view is limited to 50 files because it contains too many changes. See raw diff
Files changed (50)
  1. Dockerfile +22 -13
  2. bin/agentic-crawler.sh +2 -2
  3. bin/ai-fallback.sh +422 -0
  4. bin/ask-sqlite.py +175 -0
  5. bin/auto-orchestrate-loop.sh +3 -3
  6. bin/cerebras-bridge.sh +58 -0
  7. bin/chutes-bridge.sh +59 -0
  8. bin/cloudflare-bridge.sh +76 -0
  9. bin/crawl-rss.py +2 -2
  10. bin/daily-crawl.sh +5 -5
  11. bin/dataset-enrich.sh +2 -2
  12. bin/dev-cloud-daemon.sh +4 -4
  13. bin/dev-cloud-worker.sh +13 -13
  14. bin/domain-scrape-loop.sh +5 -5
  15. bin/github-bridge.sh +94 -0
  16. bin/github-domain-scrape.sh +3 -3
  17. bin/graph-sync.sh +107 -0
  18. bin/groq-bridge.sh +59 -0
  19. bin/hermes-daily-summary.sh +9 -9
  20. bin/hermes-discord-bot.py +5 -5
  21. bin/hermes-status-server.py +4 -4
  22. bin/lib/__init__.py +0 -0
  23. bin/lib/bridge_retry.py +142 -0
  24. bin/lib/checkpoint.py +146 -0
  25. bin/lib/codebase_scanner.py +225 -0
  26. bin/lib/context_builder.sh +272 -0
  27. bin/lib/dns_fallback.py +27 -0
  28. bin/lib/ground_truth.py +280 -0
  29. bin/lib/max_client.py +365 -0
  30. bin/lib/openrouter_client.py +195 -0
  31. bin/lib/prompt_cache.py +17 -0
  32. bin/lib/review_agent.py +328 -0
  33. bin/lib/smart_dispatcher.py +420 -0
  34. bin/lib/tier_rank.py +192 -0
  35. bin/notify-discord.sh +1 -1
  36. bin/nvidia-bridge.sh +59 -0
  37. bin/perf-watchdog.sh +3 -3
  38. bin/push-training-to-hf.sh +1 -1
  39. bin/qwen-coder-daemon.sh +2 -2
  40. bin/qwen-coder-worker.sh +3 -3
  41. bin/sambanova-bridge.sh +75 -0
  42. bin/scrape-keyword-tuner.sh +2 -2
  43. bin/scrape-ledger-init.sh +123 -0
  44. bin/skill-synthesis-daemon.sh +1 -1
  45. bin/surrogate +19 -19
  46. bin/surrogate-agent.sh +11 -11
  47. bin/surrogate-bridge.sh +52 -0
  48. bin/surrogate-consolidate.sh +163 -0
  49. bin/surrogate-daemon.sh +7 -7
  50. bin/surrogate-dev-loop.sh +3 -3
Dockerfile CHANGED
@@ -1,5 +1,5 @@
- # Hermes on Hugging Face Spaces (CPU 16 GB)
- # Single-container that runs Ollama + Redis + all Hermes daemons.
+ # Surrogate-1 on Hugging Face Spaces (CPU 16 GB)
+ # Single-container that runs Ollama + Redis + all Surrogate daemons.
  FROM python:3.12-slim

  # ── System deps ──────────────────────────────────────────────────────────────
@@ -14,32 +14,41 @@ RUN curl -fsSL https://ollama.com/install.sh | sh
  # ── App user (HF Spaces requires uid 1000) ──────────────────────────────────
  RUN useradd -m -u 1000 hermes
  ENV HOME=/home/hermes \
- PATH=/home/hermes/.local/bin:/usr/local/bin:/usr/bin:/bin \
+ PATH=/home/hermes/.surrogate/bin:/home/hermes/.local/bin:/usr/local/bin:/usr/bin:/bin \
+ SURROGATE_HOME=/home/hermes/.surrogate \
  HERMES_HOME=/home/hermes/.hermes \
  PYTHONUNBUFFERED=1

  WORKDIR /home/hermes

- # ── Python deps for Hermes Discord bot + scrape + RAG ───────────────────────
+ # ── Python deps for Discord bot + scrape + RAG ──────────────────────────────
  COPY --chown=hermes:hermes requirements.txt /tmp/requirements.txt
  RUN pip install --no-cache-dir -r /tmp/requirements.txt

- # ── Copy Hermes scripts + config skeleton ───────────────────────────────────
- COPY --chown=hermes:hermes bin/ /home/hermes/.claude/bin/
+ # ── Copy Surrogate scripts + config skeleton ────────────────────────────────
+ # Surrogate's home: ~/.surrogate/bin/ (separate from Claude Code's ~/.claude/)
+ COPY --chown=hermes:hermes bin/ /home/hermes/.surrogate/bin/
  COPY --chown=hermes:hermes config/ /home/hermes/.hermes/config/
  COPY --chown=hermes:hermes start.sh /home/hermes/start.sh
- # start.sh orchestrates everything (Redis + Ollama + daemons + status server) — no supervisord needed
- RUN chmod +x /home/hermes/.claude/bin/*.sh /home/hermes/start.sh
+ RUN chmod +x /home/hermes/.surrogate/bin/*.sh /home/hermes/start.sh

  USER hermes

- # ── Persistent dirs (HF mounts /data) ────────────────────────────────────────
- RUN mkdir -p /home/hermes/.claude/state /home/hermes/.claude/logs \
- /home/hermes/.surrogate /home/hermes/.hermes/workspace \
- /home/hermes/.ollama
+ # ── Persistent dirs (HF mounts /data into ~/.surrogate symlink) ─────────────
+ RUN mkdir -p /home/hermes/.surrogate/state /home/hermes/.surrogate/logs \
+ /home/hermes/.surrogate/workspace /home/hermes/.surrogate/memory \
+ /home/hermes/.surrogate/skills /home/hermes/.surrogate/sessions \
+ /home/hermes/.hermes/workspace /home/hermes/.ollama
+
+ # ── Backward-compat: legacy refs to ~/.claude/bin/ + ~/.claude/logs/ ────────
+ # Some scripts still reference old paths; symlink prevents breakage during
+ # progressive migration. Eventually all callers should use ~/.surrogate/.
+ RUN mkdir -p /home/hermes/.claude && \
+ ln -sfn /home/hermes/.surrogate/bin /home/hermes/.claude/bin && \
+ ln -sfn /home/hermes/.surrogate/logs /home/hermes/.claude/logs && \
+ ln -sfn /home/hermes/.surrogate/state /home/hermes/.claude/state

  # ── Expose port 7860 (HF default) ────────────────────────────────────────────
  EXPOSE 7860

- # Run supervisord — manages ollama + redis + all hermes daemons
  CMD ["/home/hermes/start.sh"]
bin/agentic-crawler.sh CHANGED
@@ -9,8 +9,8 @@
  set -uo pipefail
  set -a; source "$HOME/.hermes/.env" 2>/dev/null; set +a

- DB="$HOME/.claude/state/agentic-frontier.db"
- LOG="$HOME/.claude/logs/agentic-crawler.log"
+ DB="$HOME/.surrogate/state/agentic-frontier.db"
+ LOG="$HOME/.surrogate/logs/agentic-crawler.log"
  PAIRS="$HOME/.surrogate/training-pairs.jsonl"
  mkdir -p "$(dirname "$DB")" "$(dirname "$LOG")" "$(dirname "$PAIRS")"
bin/ai-fallback.sh ADDED
@@ -0,0 +1,422 @@
1
+ #!/usr/bin/env bash
2
+ # AI Fallback Chain (cost-optimized, cloud-only, no local LLM)
3
+ #
4
+ # Priority chain:
5
+ # 1. Claude Opus 4.7 via Max subscription (primary, flat $100/mo)
6
+ # 2. Claude Sonnet 4.6 via Max subscription (separate quota pool!)
7
+ # 3. OpenRouter pay-per-use (cheap+capable non-Sonnet picks)
8
+ # 4. Gemini 2.5 FL FREE 1000/day
9
+ # 5. Groq Llama-3.3 FREE 1000/day
10
+ #
11
+ # Usage:
12
+ # ai-fallback.sh "your question"
13
+ # ai-fallback.sh --force gpt5 "your question"
14
+ # ai-fallback.sh --tier cheap "your question" # OpenRouter uses DeepSeek
15
+ # ai-fallback.sh --skip claude-opus "your question"
16
+ set -e
17
+
18
+ # Source API keys FIRST — load BOTH env files (hermes + claude).
19
+ # Order matters: claude.env first, hermes.env wins on conflict
20
+ # (hermes has newer keys like GITHUB_MODELS_TOKEN, SAMBANOVA_API_KEY, CLOUDFLARE_*)
21
+ # shellcheck disable=SC1090
22
+ set -a
23
+ [ -f "$HOME/.surrogate/.env" ] && . "$HOME/.surrogate/.env"
24
+ [ -f "$HOME/.hermes/.env" ] && . "$HOME/.hermes/.env"
25
+ set +a
26
+
27
+ QUERY=""
28
+ FORCE=""
29
+ SKIP=""
30
+ VERBOSE=0
31
+ TASK=""
32
+ export OR_TIER=""
33
+
34
+ while [ $# -gt 0 ]; do
35
+ case "$1" in
36
+ --force) FORCE="$2"; shift 2 ;;
37
+ --skip) SKIP="$2"; shift 2 ;;
38
+ --tier) export OR_TIER="$2"; shift 2 ;;
39
+ --task) TASK="$2"; shift 2 ;;
40
+ --cheap) export OR_TIER="cheap"; shift ;;
41
+ --fast) export OR_TIER="fast"; shift ;;
42
+ --balanced) export OR_TIER="balanced"; shift ;;
43
+ --premium) export OR_TIER="premium"; shift ;;
44
+ -v|--verbose) VERBOSE=1; shift ;;
45
+ *) QUERY="$QUERY $1"; shift ;;
46
+ esac
47
+ done
48
+ QUERY=$(echo "$QUERY" | /usr/bin/sed 's/^ *//')
49
+ [ -z "$QUERY" ] && { /usr/bin/head -15 "$0"; exit 1; }
50
+
51
+ # --task <type> — pick the strongest free model per provider for the task.
52
+ # Sets per-provider env vars that try_* functions read (bridge --model alias).
53
+ # Auto-detect if not provided: code keywords → coding, reasoning keywords → reasoning.
54
+ if [ -z "$TASK" ]; then
55
+ q_lower=$(echo "$QUERY" | /usr/bin/tr '[:upper:]' '[:lower:]')
56
+ if echo "$q_lower" | /usr/bin/grep -qE "code|function|implement|refactor|bug|class|method|api|sql|terraform|cloudformation|dockerfile|kubernetes|yaml|typescript|javascript|python|rust|golang"; then
57
+ TASK="coding"
58
+ elif echo "$q_lower" | /usr/bin/grep -qE "analyze|reason|explain why|compare|evaluate|architect|design|trade-?off|deep|think step|proof|calculate|complex"; then
59
+ TASK="reasoning"
60
+ fi
61
+ fi
62
+
63
+ case "$TASK" in
64
+ coding)
65
+ # Code = Codestral (GitHub, Mistral) / DeepSeek-V3.1 (SambaNova) / Qwen Coder (local)
66
+ export GITHUB_MODEL="codestral" ; export SAMBANOVA_MODEL="deepseek"
67
+ export CLOUDFLARE_MODEL="deepseek" ; export GROQ_MODEL="qwen"
68
+ export LOCAL_MODEL="qwen-coder"
69
+ ;;
70
+ reasoning)
71
+ # Reasoning = DeepSeek R1 (GitHub, <think> CoT) / Grok 3 / DeepSeek R1 distill (CF)
72
+ export GITHUB_MODEL="reasoning" ; export SAMBANOVA_MODEL="deepseek-latest"
73
+ export CLOUDFLARE_MODEL="reasoning" ; export GROQ_MODEL="qwen"
74
+ export LOCAL_MODEL="granite"
75
+ ;;
76
+ fast)
77
+ # Fast = smallest/quickest tier per provider
78
+ export GITHUB_MODEL="mini" ; export SAMBANOVA_MODEL="fast"
79
+ export CLOUDFLARE_MODEL="fast" ; export GROQ_MODEL="fast"
80
+ export LOCAL_MODEL="tiny"
81
+ ;;
82
+ long-context|long|kimi)
83
+ # 200k+ context — Kimi on CF, gpt-oss-120b elsewhere
84
+ export GITHUB_MODEL="llama405" ; export SAMBANOVA_MODEL="gpt-oss"
85
+ export CLOUDFLARE_MODEL="kimi" ; export GROQ_MODEL="gpt-oss"
86
+ export LOCAL_MODEL="granite"
87
+ ;;
88
+ creative|chat|*)
89
+ # Default — smartest general-purpose free model per provider
90
+ export GITHUB_MODEL="gpt-4o" ; export SAMBANOVA_MODEL="llama70"
91
+ export CLOUDFLARE_MODEL="gpt-oss" ; export GROQ_MODEL="llama70"
92
+ export LOCAL_MODEL="granite"
93
+ ;;
94
+ esac
95
+
96
+ # --- Semantic RAG context injection (embedding-powered) ---
97
+ # For coding/reasoning/creative tasks, fetch top-3 semantically similar docs
98
+ # from embeddings.db and prepend to QUERY. ~50ms overhead, improves grounding.
99
+ if [[ "$TASK" == "coding" || "$TASK" == "reasoning" || "$TASK" == "creative" ]]; then
100
+ if [[ -f "$HOME/.surrogate/embeddings.db" ]]; then
101
+ EMB_COUNT=$(/usr/bin/sqlite3 "$HOME/.surrogate/embeddings.db" 'SELECT COUNT(*) FROM embeddings' 2>/dev/null || echo 0)
102
+ if [[ "$EMB_COUNT" -ge 100 ]]; then
103
+ SEM_CONTEXT=$(/usr/bin/python3 "$HOME/.surrogate/bin/embed-doc.py" --query "$QUERY" 2>/dev/null | /usr/bin/head -15)
104
+ if [[ -n "$SEM_CONTEXT" ]]; then
105
+ QUERY="=== RAG CONTEXT (top-5 semantic matches from knowledge base) ===
106
+ $SEM_CONTEXT
107
+
108
+ === TASK ===
109
+ $QUERY"
110
+ fi
111
+ fi
112
+ fi
113
+ fi
114
+
115
+ log() { [ $VERBOSE -eq 1 ] && echo "[$(date +%H:%M:%S)] $*" >&2; }
116
+
117
+ # Capture successful response → log to knowledge base (non-blocking)
118
+ save_response() {
119
+ local provider="$1" model="$2" response="$3"
120
+ [ -z "$response" ] && return
121
+ ( "$HOME/.surrogate/bin/log-interaction.sh" "$QUERY" "$response" "$provider" "$model" > /dev/null 2>&1 & ) || true
122
+ }
123
+
124
+ # --- System prompt from knowledge base + auto code-search if code query ---
125
+ build_system_prompt() {
126
+ local kb="" profile="" code_ctx="" q_lower
127
+ [ -f "$HOME/.surrogate/memory/knowledge_index.md" ] && kb="$(/usr/bin/head -50 $HOME/.surrogate/memory/knowledge_index.md)"
128
+ [ -f "$HOME/.surrogate/memory/user_profile.md" ] && profile="$(cat $HOME/.surrogate/memory/user_profile.md)"
129
+
130
+ q_lower=$(echo "$QUERY" | /usr/bin/tr '[:upper:]' '[:lower:]')
131
+ local is_generate=0 is_code=0
132
+ echo "$q_lower" | /usr/bin/grep -qE "code|function|implement|refactor|bug|error|class|method|api|endpoint|schema|model|service|controller|middleware|auth|database|query|sql|deploy|pipeline|terraform|cloudformation|dockerfile|kubernetes|helm|yaml" && is_code=1
133
+ echo "$q_lower" | /usr/bin/grep -qE "create|generate|write|build|new|template|scaffold|design" && is_generate=1
134
+
135
+ if [ "$is_code" = "1" ] && [ -d "$HOME/.surrogate/code-vector-db" ]; then
136
+ if [ "$is_generate" = "1" ] && [ -x "$HOME/.surrogate/bin/find-gold-examples.sh" ]; then
137
+ # Generation task → inject FULL reference files (better style match)
138
+ code_ctx=$("$HOME/.surrogate/bin/find-gold-examples.sh" --top 2 --max-bytes 5000 "$QUERY" 2>/dev/null)
139
+ elif [ -x "$HOME/.surrogate/bin/code-search.sh" ]; then
140
+ # Query task → snippets only (faster)
141
+ code_ctx=$("$HOME/.surrogate/bin/code-search.sh" --top 3 "$QUERY" 2>/dev/null | /usr/bin/head -60)
142
+ fi
143
+ fi
144
+
145
+ local prompt="You are Ashira's AI assistant. Context: $profile
146
+
147
+ Pattern index: $kb"
148
+ if [ -n "$code_ctx" ]; then
149
+ prompt="$prompt
150
+
151
+ === ASHIRA'S EXISTING CODE (match this style EXACTLY) ===
152
+ $code_ctx
153
+ === END EXAMPLES ===
154
+
155
+ Style rules enforced:
156
+ - Follow naming/indent/comment style from examples above
157
+ - Use exact same Parameter/Resource names when applicable
158
+ - Preserve existing conventions (tags, naming, Description format)"
159
+ fi
160
+ prompt="$prompt
161
+
162
+ Be concise. Cite file paths when referencing existing code."
163
+ echo "$prompt"
164
+ }
165
+ SYSTEM=$(build_system_prompt)
166
+
167
+ # --- Anthropic via Max plan (routes through claude-bridge.sh CLI) ---
168
+ # Direct HTTPS to api.anthropic.com with OAuth token returns 401 — OAuth flow
169
+ # is managed by `claude` CLI (keychain/config). Use the bridge instead.
170
+ try_anthropic() {
171
+ local model="$1" extra="$2"
172
+ log "→ Claude Max: $model"
173
+ local out
174
+ out=$(echo "$QUERY" | "$HOME/.surrogate/bin/claude-bridge.sh" --model "$model" $extra 2>>/tmp/ai-fallback.err) || return 1
175
+ [ -z "$out" ] && return 1
176
+ echo "$out"
177
+ save_response "anthropic" "$model" "$out"
178
+ return 0
179
+ }
180
+
181
+ # Opus needs --force outside 01:00-06:00 window; sonnet is always available
182
+ try_claude_opus() { try_anthropic "opus" "--force"; }
183
+ try_claude_sonnet() { try_anthropic "sonnet" ""; }
184
+
185
+ # OpenRouter FREE — tries multiple free models (each has strict rate limit)
186
+ # Order: coder-first → general-powerhouse → smaller fallbacks
187
+ try_openrouter_free() {
188
+ [ -z "${OPENROUTER_API_KEY:-}" ] && return 2
189
+ local free_models=(
190
+ "qwen/qwen3-coder:free"
191
+ "qwen/qwen3-next-80b-a3b-instruct:free"
192
+ "openai/gpt-oss-120b:free"
193
+ "nvidia/nemotron-3-super-120b-a12b:free"
194
+ "meta-llama/llama-3.3-70b-instruct:free"
195
+ "z-ai/glm-4.5-air:free"
196
+ "google/gemma-4-31b-it:free"
197
+ "openai/gpt-oss-20b:free"
198
+ )
199
+ for m in "${free_models[@]}"; do
200
+ OPENROUTER_MODEL="$m" try_openrouter && return 0
201
+ log " ↳ free '$m' unavailable, trying next free..."
202
+ done
203
+ return 1
204
+ }
205
+
206
+ # --- OpenRouter (cheap+capable non-Sonnet picks) ---
207
+ try_openrouter() {
208
+ [ -z "${OPENROUTER_API_KEY:-}" ] && return 2
209
+ # Default: GPT-5.4 (beats Claude Opus 4.6 per benchmarks, -50% cost vs Opus 4.7)
210
+ local model="${OPENROUTER_MODEL:-openai/gpt-5.4}"
211
+ case "${OR_TIER:-}" in
212
+ # PAID tiers
213
+ cheap) model="deepseek/deepseek-v3.2" ;; # $0.26/$0.42 — cheapest capable
214
+ fast) model="x-ai/grok-4.1-fast" ;; # $0.20/$0.50 — ultra cheap, 2M ctx
215
+ balanced) model="openai/gpt-5.4" ;; # $2.50/$15 — DEFAULT, beats Opus 4.6
216
+ premium) model="anthropic/claude-opus-4.7" ;; # $5/$25 — if really need Opus
217
+ grok) model="x-ai/grok-4.20" ;; # $2/$6 — 2M ctx, cool
218
+ gemini) model="google/gemini-3.1-pro-preview" ;;# $2/$12
219
+ # FREE tiers (29 models available)
220
+ free|free-coder) model="qwen/qwen3-coder:free" ;; # coding, 262k ctx
221
+ free-large) model="qwen/qwen3-next-80b-a3b-instruct:free" ;; # 80B MoE
222
+ free-nvidia) model="nvidia/nemotron-3-super-120b-a12b:free" ;; # 120B
223
+ free-gptoss) model="openai/gpt-oss-120b:free" ;; # OpenAI open-sourced
224
+ free-llama) model="meta-llama/llama-3.3-70b-instruct:free" ;;
225
+ free-kimi) model="moonshotai/kimi-k2.5" ;; # Kimi 256k ctx
226
+ free-glm) model="z-ai/glm-4.5-air:free" ;;
227
+ free-gemma) model="google/gemma-4-31b-it:free" ;; # Google Gemma 4
228
+ esac
229
+ log "→ OpenRouter: $model"
230
+ local body
231
+ # Use env vars — avoids quote-escape hell with multiline system prompt.
232
+ # max_tokens=4000 (GPT-5.4 requires >= 16; stay well above)
233
+ body=$(ORM="$model" SYS="$SYSTEM" Q="$QUERY" "$HOME/.surrogate/venv/bin/python" -c "
234
+ import json, os
235
+ m = {'model':os.environ['ORM'],'max_tokens':4000,
236
+ 'messages':[{'role':'system','content':os.environ['SYS']},
237
+ {'role':'user','content':os.environ['Q']}]}
238
+ print(json.dumps(m))
239
+ " 2>&1) || { log " body-build failed: $body"; return 1; }
240
+ local resp code body_resp
241
+ resp=$(/usr/bin/curl -sS -w "\n%{http_code}" \
242
+ --max-time 90 \
243
+ -X POST "https://openrouter.ai/api/v1/chat/completions" \
244
+ -H "Authorization: Bearer $OPENROUTER_API_KEY" \
245
+ -H "HTTP-Referer: https://ashira.local" \
246
+ -H "X-Title: ai-fallback" \
247
+ -H "content-type: application/json" \
248
+ -d "$body" 2>&1)
249
+ code=$(echo "$resp" | /usr/bin/tail -1)
250
+ body_resp=$(echo "$resp" | /usr/bin/sed '$d')
251
+ if [ "$code" != "200" ]; then
252
+ # Log real error reason for debug
253
+ local errmsg
254
+ errmsg=$(echo "$body_resp" | "$HOME/.surrogate/venv/bin/python" -c "
255
+ import sys, json
256
+ try: d=json.load(sys.stdin); print(d.get('error',{}).get('message','unknown')[:120])
257
+ except: print('parse-fail')
258
+ " 2>/dev/null || echo "unknown")
259
+ log " [$code] $errmsg — falling through"
260
+ return 1
261
+ fi
262
+ local out
263
+ out=$(echo "$body_resp" | "$HOME/.surrogate/venv/bin/python" -c "
264
+ import sys, json
265
+ d = json.load(sys.stdin)
266
+ print(d['choices'][0]['message']['content'])
267
+ ") || return 1
268
+ echo "$out"
269
+ save_response "openrouter" "$model" "$out"
270
+ return 0
271
+ }
272
+
273
+ # --- Gemini (free) ---
274
+ try_gemini() {
275
+ [ -z "${GEMINI_API_KEY:-}" ] && return 2
276
+ local model="${GEMINI_MODEL:-gemini-2.5-flash}"
277
+ log "→ Gemini: $model (free)"
278
+ local body
279
+ body=$("$HOME/.surrogate/venv/bin/python" -c "
280
+ import json
281
+ m = {'systemInstruction':{'parts':[{'text':'''$SYSTEM'''}]},
282
+ 'contents':[{'role':'user','parts':[{'text':'''$QUERY'''}]}],
283
+ 'generationConfig':{'maxOutputTokens':4000}}
284
+ print(json.dumps(m))
285
+ " 2>/dev/null)
286
+ local resp code body_resp
287
+ resp=$(/usr/bin/curl -sS -w "\n%{http_code}" \
288
+ -X POST "https://generativelanguage.googleapis.com/v1beta/models/$model:generateContent?key=$GEMINI_API_KEY" \
289
+ -H "content-type: application/json" -d "$body" 2>&1)
290
+ code=$(echo "$resp" | /usr/bin/tail -1)
291
+ body_resp=$(echo "$resp" | /usr/bin/sed '$d')
292
+ [ "$code" != "200" ] && { log " [$code] falling through"; return 1; }
293
+ local out
294
+ out=$(echo "$body_resp" | "$HOME/.surrogate/venv/bin/python" -c "
295
+ import sys, json
296
+ d = json.load(sys.stdin)
297
+ print(d['candidates'][0]['content']['parts'][0]['text'])
298
+ ") || return 1
299
+ echo "$out"
300
+ save_response "gemini" "$model" "$out"
301
+ return 0
302
+ }
303
+
304
+ # --- Groq (free, ultra-fast) ---
305
+ try_groq() {
306
+ [ -z "${GROQ_API_KEY:-}" ] && return 2
307
+ local model="${GROQ_MODEL:-llama70}"
308
+ log "→ Groq: $model (free)"
309
+ # Route through groq-bridge for consistent alias handling (llama70, fast, qwen, gpt-oss...)
310
+ local out
311
+ out=$(echo "$QUERY" | "$HOME/.surrogate/bin/groq-bridge.sh" --model "$model" 2>>/tmp/ai-fallback.err) || return 1
312
+ [ -z "$out" ] && return 1
313
+ echo "$out"
314
+ save_response "groq" "$model" "$out"
315
+ return 0
316
+ }
317
+
318
+ # --- GitHub Models (free via PAT, OpenAI-compat, GPT-4o-mini/Llama 3.3/Mistral/DeepSeek) ---
319
+ try_github() {
320
+ [ -z "${GITHUB_MODELS_TOKEN:-}${GITHUB_TOKEN:-}" ] && return 2
321
+ local model="${GITHUB_MODEL:-gpt-4o}"
322
+ log "→ GitHub Models: $model (free)"
323
+ local out
324
+ out=$(echo "$QUERY" | "$HOME/.surrogate/bin/github-bridge.sh" --model "$model" 2>>/tmp/ai-fallback.err) || return 1
325
+ [ -z "$out" ] && return 1
326
+ echo "$out"
327
+ save_response "github" "$model" "$out"
328
+ return 0
329
+ }
330
+
331
+ # --- SambaNova Cloud (free, ~500 tok/s Llama 3.3 70B / DeepSeek V3.2 / Llama 4) ---
332
+ try_sambanova() {
333
+ [ -z "${SAMBANOVA_API_KEY:-}" ] && return 2
334
+ local model="${SAMBANOVA_MODEL:-llama70}"
335
+ log "→ SambaNova: $model (free)"
336
+ local out
337
+ out=$(echo "$QUERY" | "$HOME/.surrogate/bin/sambanova-bridge.sh" --model "$model" 2>>/tmp/ai-fallback.err) || return 1
338
+ [ -z "$out" ] && return 1
339
+ echo "$out"
340
+ save_response "sambanova" "$model" "$out"
341
+ return 0
342
+ }
343
+
344
+ # --- Cloudflare Workers AI (free 10k neurons/day, Llama 3.3 / Gemma-3 / Qwen Coder) ---
345
+ try_cloudflare() {
346
+ [ -z "${CLOUDFLARE_API_TOKEN:-}${CF_API_TOKEN:-}" ] && return 2
347
+ [ -z "${CLOUDFLARE_ACCOUNT_ID:-}${CF_ACCOUNT_ID:-}" ] && return 2
348
+ local model="${CLOUDFLARE_MODEL:-gpt-oss}"
349
+ log "→ Cloudflare WAI: $model (free)"
350
+ local out
351
+ out=$(echo "$QUERY" | "$HOME/.surrogate/bin/cloudflare-bridge.sh" --model "$model" 2>>/tmp/ai-fallback.err) || return 1
352
+ [ -z "$out" ] && return 1
353
+ echo "$out"
354
+ save_response "cloudflare" "$model" "$out"
355
+ return 0
356
+ }
357
+
358
+ # --- Local Ollama — always-on, always-free ultimate fallback ---
359
+ # Bench (M3 24GB): granite4:7b-a1b-h (4.2GB, ~7s/fib+memo — fast & correct).
360
+ # Task-aware: code → qwen-coder:7b, chat → granite, tiny → qwen:3b.
361
+ # gemma4:26b BLOCKED — user directive (too slow for this hw).
362
+ try_granite() {
363
+ # Check ollama running
364
+ /usr/bin/curl -sS --max-time 3 http://localhost:11434/api/tags > /dev/null 2>&1 || return 2
365
+ local alias="${LOCAL_MODEL:-granite}"
366
+ log "→ Local Ollama: $alias (free, always-on)"
367
+ local out
368
+ out=$(echo "$QUERY" | "$HOME/.surrogate/bin/granite-bridge.sh" --model "$alias" 2>>/tmp/ai-fallback.err) || return 1
369
+ [ -z "$out" ] && return 1
370
+ echo "$out"
371
+ save_response "ollama-local" "$alias" "$out"
372
+ return 0
373
+ }
374
+
375
+ # --- Execute chain (FREE-FIRST for routine/bulk tasks) ---
376
+ # Order: free APIs → claude-sonnet (Max plan safety net) → local Ollama (ultimate backstop)
377
+ # IMPORTANT-tasks (retro/sprint/skill-sanitize/agent-critic/security-audit/mythos-audit)
378
+ # → call claude-bridge.sh --model opus --force DIRECTLY, bypass this chain
379
+ # REVIEWER/hallucination-check → call claude-bridge.sh --model sonnet DIRECTLY
380
+ # Paid OpenRouter removed per user direction (use Max plan instead of pay-per-use)
381
+ PROVIDERS="github sambanova cloudflare groq openrouter-free gemini claude-sonnet granite"
382
+
383
+ # Explicit --force
384
+ if [ -n "$FORCE" ]; then
385
+ case "$FORCE" in
386
+ claude-opus|opus) try_claude_opus && exit 0 ;;
387
+ claude-sonnet|sonnet) try_claude_sonnet && exit 0 ;;
388
+ openrouter|or) try_openrouter && exit 0 ;;
389
+ openrouter-free|free) try_openrouter_free && exit 0 ;;
390
+ gpt5|gpt) OPENROUTER_MODEL="openai/gpt-5.4" try_openrouter && exit 0 ;;
391
+ grok) OPENROUTER_MODEL="x-ai/grok-4.20" try_openrouter && exit 0 ;;
392
+ deepseek) OPENROUTER_MODEL="deepseek/deepseek-v3.2" try_openrouter && exit 0 ;;
393
+ gemini) try_gemini && exit 0 ;;
394
+ groq) try_groq && exit 0 ;;
395
+ github|gh) try_github && exit 0 ;;
396
+ sambanova|samba) try_sambanova && exit 0 ;;
397
+ cloudflare|cf) try_cloudflare && exit 0 ;;
398
+ granite|local|ollama) try_granite && exit 0 ;;
399
+ *) echo "[error] unknown --force '$FORCE'" >&2; exit 1 ;;
400
+ esac
401
+ echo "[error] forced provider failed" >&2; exit 1
402
+ fi
403
+
404
+ # Auto chain with skip support
405
+ for p in $PROVIDERS; do
406
+ if [ -n "$SKIP" ] && [ "$p" = "$SKIP" ]; then continue; fi
407
+ case "$p" in
408
+ github) try_github && exit 0 ;;
409
+ sambanova) try_sambanova && exit 0 ;;
410
+ cloudflare) try_cloudflare && exit 0 ;;
411
+ claude-opus) try_claude_opus && exit 0 ;;
412
+ claude-sonnet) try_claude_sonnet && exit 0 ;;
413
+ openrouter) try_openrouter && exit 0 ;;
414
+ openrouter-free) try_openrouter_free && exit 0 ;;
415
+ gemini) try_gemini && exit 0 ;;
416
+ groq) try_groq && exit 0 ;;
417
+ granite) try_granite && exit 0 ;;
418
+ esac
419
+ done
420
+
421
+ echo "[error] all providers exhausted" >&2
422
+ exit 1
bin/ask-sqlite.py ADDED
@@ -0,0 +1,175 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Local RAG assistant — SQLite FTS5 (replaces Chroma) + local LLM.
4
+ Stable, no Rust crashes, fast.
5
+
6
+ Usage:
7
+ ask-sqlite.py "คำถาม" # single shot
8
+ ask-sqlite.py -i # interactive
9
+ ask-sqlite.py --source code "คำถาม" # filter by source
10
+ ask-sqlite.py --project Vanguard "คำถาม"
11
+ """
12
+ import sys, json, sqlite3, argparse, subprocess, urllib.request, re
13
+ from pathlib import Path
14
+
15
+ DB = str(Path.home() / ".surrogate/index.db")
16
+ OLLAMA = "http://localhost:11434/api/chat"
17
+ DEFAULT_MODEL = "granite4:7b-a1b-h"
18
+
19
+ AXENTX = Path("/Users/Ashira/axentx")
20
+ PROJECTS = ["Costinel", "Vanguard", "arkship", "surrogate", "workio"]
21
+
22
+
23
+ def fts_escape(query: str) -> str:
24
+ """Turn a natural query into FTS5 MATCH syntax — use each non-trivial word."""
25
+ words = re.findall(r"\w{3,}", query) # keep alnum words ≥3 chars
26
+ if not words: return '"placeholder"'
27
+ # OR query for flexibility
28
+ return " OR ".join(f'"{w}"' for w in words[:10])
29
+
30
+
31
+ def search(query: str, n: int = 10, source: str = None, project: str = None):
32
+ conn = sqlite3.connect(DB)
33
+ conn.row_factory = sqlite3.Row
34
+ fts_q = fts_escape(query)
35
+ sql = """
36
+ SELECT d.source, d.project, d.path, d.topic, d.instruction, d.response,
37
+ rank
38
+ FROM docs_fts f JOIN docs d ON f.rowid = d.id
39
+ WHERE docs_fts MATCH ?
40
+ """
41
+ params = [fts_q]
42
+ if source:
43
+ sql += " AND d.source LIKE ?"
44
+ params.append(f"%{source}%")
45
+ if project:
46
+ sql += " AND d.project LIKE ?"
47
+ params.append(f"%{project}%")
48
+ sql += " ORDER BY rank LIMIT ?"
49
+ params.append(n)
50
+
51
+ try:
52
+ rows = conn.execute(sql, params).fetchall()
53
+ except sqlite3.OperationalError as e:
54
+ # FTS syntax error — fallback to LIKE
55
+ conn = sqlite3.connect(DB)
56
+ conn.row_factory = sqlite3.Row
57
+ rows = conn.execute(
58
+ "SELECT source, project, path, topic, instruction, response FROM docs "
59
+ "WHERE instruction LIKE ? OR response LIKE ? LIMIT ?",
60
+ (f"%{query[:80]}%", f"%{query[:80]}%", n)
61
+ ).fetchall()
62
+ return rows
63
+
64
+
65
+ def agents_md() -> str:
66
+ parts = []
67
+ for proj in PROJECTS:
68
+ md = AXENTX / proj / "AGENTS.md"
69
+ if md.exists():
70
+ parts.append(f"=== {proj}/AGENTS.md ===\n" + "\n".join(md.read_text().split("\n")[:15]))
71
+ return "\n\n".join(parts)
72
+
73
+
74
+ def git_recent() -> str:
75
+ out = []
76
+ for proj in PROJECTS:
77
+ p = AXENTX / proj
78
+ if not (p / ".git").exists(): continue
79
+ try:
80
+ r = subprocess.run(["git","-C",str(p),"log","--oneline","-5"],
81
+ capture_output=True, text=True, timeout=3)
82
+ if r.stdout.strip():
83
+ out.append(f"=== {proj} ===\n{r.stdout.strip()}")
84
+ except: pass
85
+ return "\n".join(out)
86
+
87
+
88
+ def build_context(question, source=None, project=None):
89
+ parts = ["## AGENTS.md\n" + agents_md()]
90
+ g = git_recent()
91
+ if g: parts.append("## Recent commits\n" + g)
92
+
93
+ rows = search(question, n=8, source=source, project=project)
94
+ if rows:
95
+ hits = []
96
+ for r in rows:
97
+ tag = r["source"] or "?"
98
+ path = r["path"] or ""
99
+ proj = r["project"] or ""
100
+ content = r["response"] or r["instruction"] or ""
101
+ hits.append(f"[{tag}:{proj}/{path[-60:]}]\n{content[:500]}")
102
+ parts.append(f"## Relevant docs (SQLite FTS, {len(rows)} matches)\n" + "\n\n".join(hits))
103
+ return "\n\n".join(parts)[:12000]
104
+
105
+
106
+ SYSTEM_PROMPT = (
107
+ "คุณคือ local assistant ตอบจาก Context เท่านั้น. ไม่รู้ก็บอก. "
108
+ "ภาษาไทย กระชับ. อ้าง path/source ที่เกี่ยวข้อง."
109
+ )
110
+
111
+
112
+ def ask_ollama(messages, model):
113
+ payload = {"model": model, "messages": messages, "stream": False}
114
+ req = urllib.request.Request(OLLAMA, data=json.dumps(payload).encode(),
115
+ headers={"Content-Type": "application/json"})
116
+ with urllib.request.urlopen(req, timeout=180) as r:
117
+ return json.loads(r.read()).get("message", {}).get("content", "(no response)")
118
+
119
+
120
+ def single(question, model, source, project):
121
+ print(f"🔍 SQLite FTS search...", file=sys.stderr)
122
+ ctx = build_context(question, source, project)
123
+ print(f" context: {len(ctx)} chars", file=sys.stderr)
124
+ print(f"🤖 {model}\n", file=sys.stderr)
125
+ msgs = [
126
+ {"role": "system", "content": SYSTEM_PROMPT},
127
+ {"role": "user", "content": f"### Context\n{ctx}\n\n### คำถาม\n{question}"},
128
+ ]
129
+ print(ask_ollama(msgs, model))
130
+
131
+
132
+ def interactive(model, source, project):
133
+ print(f"🤖 Interactive — {model}, source={source}, project={project}", file=sys.stderr)
134
+ print(f" type 'exit' to quit, ':s <src>' to set source filter", file=sys.stderr)
135
+ history = [{"role": "system", "content": SYSTEM_PROMPT}]
136
+ base_ctx = None
137
+ while True:
138
+ try: q = input("❯ ").strip()
139
+ except (EOFError, KeyboardInterrupt): break
140
+ if not q or q in ("exit","quit"): break
141
+ if q.startswith(":s "):
142
+ source = q[3:].strip() or None
143
+ print(f" source filter: {source}")
144
+ continue
145
+
146
+ ctx = build_context(q, source, project)
147
+ msgs = history + [{"role": "user", "content": f"### Context\n{ctx}\n\n### คำถาม\n{q}"}]
148
+ ans = ask_ollama(msgs, model)
149
+ history.append({"role": "user", "content": q})
150
+ history.append({"role": "assistant", "content": ans})
151
+ print(f"\n{ans}\n")
152
+ if len(history) > 11:
153
+ history = [history[0]] + history[-10:]
154
+
155
+
156
+ def main():
157
+ ap = argparse.ArgumentParser()
158
+ ap.add_argument("-i", "--interactive", action="store_true")
159
+ ap.add_argument("-m", "--model", default=DEFAULT_MODEL)
160
+ ap.add_argument("--source", help="filter by source (code, github-public, claude-conversation, ...)")
161
+ ap.add_argument("--project", help="filter by project")
162
+ ap.add_argument("question", nargs="*")
163
+ args = ap.parse_args()
164
+
165
+ if args.interactive:
166
+ interactive(args.model, args.source, args.project)
167
+ else:
168
+ if not args.question:
169
+ print("usage: ask 'คำถาม' OR ask -i OR ask --source code 'คำถาม'", file=sys.stderr)
170
+ sys.exit(1)
171
+ single(" ".join(args.question), args.model, args.source, args.project)
172
+
173
+
174
+ if __name__ == "__main__":
175
+ main()
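
ask-sqlite.py queries ~/.surrogate/index.db but never creates it. Below is a minimal sketch of the schema the SELECTs above appear to assume; the table and column names come from the script, while the FTS5 configuration (indexed columns, external-content wiring) is an assumption about the separate indexer, which is not part of this commit:

sqlite3 ~/.surrogate/index.db <<'SQL'
-- content table the script selects from (columns taken from search() above)
CREATE TABLE IF NOT EXISTS docs (
  id          INTEGER PRIMARY KEY,
  source      TEXT,   -- e.g. code, github-public, claude-conversation
  project     TEXT,
  path        TEXT,
  topic       TEXT,
  instruction TEXT,
  response    TEXT
);
-- external-content FTS5 index; its rowid tracks docs.id, so the f.rowid = d.id join works
CREATE VIRTUAL TABLE IF NOT EXISTS docs_fts USING fts5(
  topic, instruction, response,
  content='docs', content_rowid='id'
);
SQL

With an external-content FTS5 table, whatever populates docs also has to insert into docs_fts (or install the usual sync triggers); MATCH and rank then behave as search() expects, and the LIKE fallback covers malformed queries.
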
bin/auto-orchestrate-loop.sh CHANGED
@@ -9,7 +9,7 @@
  set -uo pipefail
  set -a; source "$HOME/.hermes/.env" 2>/dev/null; set +a

- LOG="$HOME/.claude/logs/auto-orchestrate-loop.log"
+ LOG="$HOME/.surrogate/logs/auto-orchestrate-loop.log"
  mkdir -p "$(dirname "$LOG")"

  # Resource guard: 20% headroom
@@ -107,14 +107,14 @@ TASK_DESC="Resolve this TODO/FIXME in $PROJ_NAME at $FILE:$LINE: \"$CONTENT\". I
  cd "$PROJECT" || { echo "[$(date +%H:%M:%S)] cd failed" >> "$LOG"; exit 1; }

  # Run the orchestrate pipeline (auto-commits on APPROVE)
- bash "$HOME/.claude/bin/surrogate-orchestrate.sh" "$TASK_DESC" >> "$LOG" 2>&1
+ bash "$HOME/.surrogate/bin/surrogate-orchestrate.sh" "$TASK_DESC" >> "$LOG" 2>&1
  RC=$?
  DUR=$(( $(date +%s) - START ))

  echo "[$(date +%H:%M:%S)] orchestrate done in ${DUR}s rc=$RC" >> "$LOG"

  # Discord notification
- NOTIFY="$HOME/.claude/bin/notify-discord.sh"
+ NOTIFY="$HOME/.surrogate/bin/notify-discord.sh"
  if [[ -x "$NOTIFY" ]]; then
  if [[ $RC -eq 0 ]]; then
  "$NOTIFY" task "Auto-orchestrate: $PROJ_NAME" "$FILE:$LINE — \`$(echo "$CONTENT" | head -c 80)\` · ${DUR}s" 2>/dev/null &
bin/cerebras-bridge.sh ADDED
@@ -0,0 +1,58 @@
1
+ #!/usr/bin/env bash
2
+ # Cerebras bridge — fastest inference (wafer-scale), llama/qwen/gpt-oss available
3
+ set -u
4
+ MODEL="llama3.1-8b"
5
+ MAX_TOKENS=2000
6
+ TEMP=0.3
7
+ PROMPT=""
8
+
9
+ while [[ $# -gt 0 ]]; do
10
+ case "$1" in
11
+ --model)
12
+ case "$2" in
13
+ fast|small) MODEL="llama3.1-8b" ;;
14
+ big) MODEL="qwen-3-235b-a22b-instruct-2507" ;;
15
+ gpt-oss) MODEL="gpt-oss-120b" ;;
16
+ glm) MODEL="zai-glm-4.7" ;;
17
+ *) MODEL="$2" ;;
18
+ esac; shift 2 ;;
19
+ --max-tokens) MAX_TOKENS="$2"; shift 2 ;;
20
+ *) PROMPT="$*"; break ;;
21
+ esac
22
+ done
23
+ [[ -z "$PROMPT" ]] && [[ ! -t 0 ]] && PROMPT=$(cat)
24
+ [[ -z "$PROMPT" ]] && { echo "cerebras-bridge: no prompt" >&2; exit 2; }
25
+
26
+ LOG="$HOME/.surrogate/logs/cerebras-bridge.log"
27
+ mkdir -p "$(dirname "$LOG")"
28
+ set -a; source "$HOME/.hermes/.env"; set +a
29
+ echo "[$(date '+%H:%M:%S')] model=$MODEL len=${#PROMPT}" >> "$LOG"
30
+
31
+ RESPONSE=$(python3 -c "
32
+ import os
33
+ exec(open(os.path.expanduser('~/.surrogate/bin/lib/dns_fallback.py')).read())
34
+ import json, sys, os, urllib.request, urllib.error
35
+ body = {
36
+ 'model': '$MODEL',
37
+ 'messages': [{'role':'user','content': sys.stdin.read()}],
38
+ 'max_tokens': $MAX_TOKENS, 'temperature': $TEMP,
39
+ }
40
+ req = urllib.request.Request(
41
+ 'https://api.cerebras.ai/v1/chat/completions',
42
+ data=json.dumps(body).encode(),
43
+ headers={'Content-Type':'application/json', 'User-Agent':'hermes-agent/1.0', 'Authorization':'Bearer '+os.environ.get('CEREBRAS_API_KEY','')}
44
+ )
45
+ try:
46
+ with urllib.request.urlopen(req, timeout=120) as r:
47
+ d = json.load(r)
48
+ print(d.get('choices',[{}])[0].get('message',{}).get('content',''))
49
+ except urllib.error.HTTPError as e:
50
+ print(f'cerebras-bridge HTTP {e.code}: {e.read()[:200]}', file=sys.stderr)
51
+ sys.exit(e.code // 100)
52
+ except Exception as e:
53
+ print(f'cerebras-bridge error: {e}', file=sys.stderr); sys.exit(1)
54
+ " <<< "$PROMPT")
55
+ RC=$?
56
+ echo "[$(date '+%H:%M:%S')] rc=$RC bytes=${#RESPONSE}" >> "$LOG"
57
+ [[ $RC -ne 0 ]] && exit $RC
58
+ echo "$RESPONSE"
bin/chutes-bridge.sh ADDED
@@ -0,0 +1,59 @@
1
+ #!/usr/bin/env bash
2
+ # Chutes.ai bridge — OpenAI-compat; free-tier, multi-model aggregator.
3
+ # Endpoint: https://llm.chutes.ai/v1/chat/completions
4
+ # Free tier: ~500 req/day, no CC, solid for Qwen/DeepSeek/Llama models.
5
+ set -u
6
+ MODEL="deepseek-ai/DeepSeek-V3.1"
7
+ MAX_TOKENS=2000
8
+ TEMP=0.3
9
+ PROMPT=""
10
+
11
+ while [[ $# -gt 0 ]]; do
12
+ case "$1" in
13
+ --model)
14
+ case "$2" in
15
+ deepseek|v3) MODEL="deepseek-ai/DeepSeek-V3.1" ;;
16
+ qwen|coder) MODEL="Qwen/Qwen3-Coder-480B-A35B-Instruct" ;;
17
+ llama|l70) MODEL="meta-llama/Llama-3.3-70B-Instruct" ;;
18
+ r1) MODEL="deepseek-ai/DeepSeek-R1" ;;
19
+ glm) MODEL="zai-org/GLM-4.6" ;;
20
+ *) MODEL="$2" ;;
21
+ esac; shift 2 ;;
22
+ --max-tokens) MAX_TOKENS="$2"; shift 2 ;;
23
+ *) PROMPT="$*"; break ;;
24
+ esac
25
+ done
26
+ [[ -z "$PROMPT" ]] && [[ ! -t 0 ]] && PROMPT=$(cat)
27
+ [[ -z "$PROMPT" ]] && { echo "chutes-bridge: no prompt" >&2; exit 2; }
28
+
29
+ LOG="$HOME/.surrogate/logs/chutes-bridge.log"
30
+ mkdir -p "$(dirname "$LOG")"
31
+ set -a; source "$HOME/.hermes/.env"; set +a
32
+ echo "[$(date '+%H:%M:%S')] model=$MODEL len=${#PROMPT}" >> "$LOG"
33
+
34
+ RESPONSE=$(python3 -c "
35
+ import os
36
+ exec(open(os.path.expanduser('~/.surrogate/bin/lib/dns_fallback.py')).read())
37
+ exec(open(os.path.expanduser('~/.surrogate/bin/lib/bridge_retry.py')).read())
38
+ import json, sys
39
+ body = {
40
+ 'model': '$MODEL',
41
+ 'messages': [{'role':'user','content': sys.stdin.read()}],
42
+ 'max_tokens': $MAX_TOKENS, 'temperature': $TEMP,
43
+ 'stream': False,
44
+ }
45
+ try:
46
+ d = request_with_retry(
47
+ 'https://llm.chutes.ai/v1/chat/completions',
48
+ data=json.dumps(body).encode(),
49
+ headers={'Content-Type':'application/json', 'User-Agent':'hermes-agent/1.0', 'Authorization':'Bearer '+os.environ.get('CHUTES_API_KEY','')},
50
+ timeout=120, max_retries=4, base_delay=3.0, open_seconds=120,
51
+ )
52
+ print(d.get('choices',[{}])[0].get('message',{}).get('content',''))
53
+ except Exception as e:
54
+ print(f'chutes-bridge error: {e}', file=sys.stderr); sys.exit(1)
55
+ " <<< "$PROMPT")
56
+ RC=$?
57
+ echo "[$(date '+%H:%M:%S')] rc=$RC bytes=${#RESPONSE}" >> "$LOG"
58
+ [[ $RC -ne 0 ]] && exit $RC
59
+ echo "$RESPONSE"
bin/cloudflare-bridge.sh ADDED
@@ -0,0 +1,76 @@
1
+ #!/usr/bin/env bash
2
+ # Cloudflare Workers AI bridge — 10k neurons/day free tier
3
+ # Endpoint: https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/ai/v1 (OpenAI-compat)
4
+ # Key env: CLOUDFLARE_API_TOKEN + CLOUDFLARE_ACCOUNT_ID
5
+ # Usage: cloudflare-bridge.sh [--model MODEL] "<prompt>"
6
+ set -u
7
+ # Default: gpt-oss-120b — 120B params, highest capability on CF Workers AI free tier.
8
+ # Catalog verified 2026-04 — aliases point to models that ACTUALLY respond.
9
+ MODEL="@cf/openai/gpt-oss-120b"
10
+ MAX_TOKENS=2000
11
+ TEMP=0.3
12
+ PROMPT=""
13
+
14
+ while [[ $# -gt 0 ]]; do
15
+ case "$1" in
16
+ --model)
17
+ case "$2" in
18
+ fast|small|8b) MODEL="@cf/meta/llama-3.1-8b-instruct-fp8" ;;
19
+ llama|llama70|70b) MODEL="@cf/meta/llama-3.3-70b-instruct-fp8-fast" ;;
20
+ gpt-oss|oss|120b) MODEL="@cf/openai/gpt-oss-120b" ;;
21
+ deepseek|r1|reasoning) MODEL="@cf/deepseek-ai/deepseek-r1-distill-qwen-32b" ;;
22
+ kimi|long-ctx) MODEL="@cf/moonshotai/kimi-k2.6" ;;
23
+ glm|glm4) MODEL="@cf/zai-org/glm-4.7-flash" ;;
24
+ *) MODEL="$2" ;;
25
+ esac; shift 2 ;;
26
+ --max-tokens) MAX_TOKENS="$2"; shift 2 ;;
27
+ --temperature) TEMP="$2"; shift 2 ;;
28
+ *) PROMPT="$*"; break ;;
29
+ esac
30
+ done
31
+ [[ -z "$PROMPT" ]] && [[ ! -t 0 ]] && PROMPT=$(cat)
32
+ [[ -z "$PROMPT" ]] && { echo "cloudflare-bridge: no prompt" >&2; exit 2; }
33
+
34
+ LOG="$HOME/.surrogate/logs/cloudflare-bridge.log"
35
+ mkdir -p "$(dirname "$LOG")"
36
+ set -a; source "$HOME/.hermes/.env" 2>/dev/null || true; set +a
37
+
38
+ TOKEN="${CLOUDFLARE_API_TOKEN:-${CF_API_TOKEN:-}}"
39
+ ACCOUNT="${CLOUDFLARE_ACCOUNT_ID:-${CF_ACCOUNT_ID:-}}"
40
+ if [[ -z "$TOKEN" ]] || [[ -z "$ACCOUNT" ]]; then
41
+ echo "cloudflare-bridge: missing CLOUDFLARE_API_TOKEN or CLOUDFLARE_ACCOUNT_ID in ~/.hermes/.env" >&2
42
+ exit 3
43
+ fi
44
+
45
+ echo "[$(date '+%H:%M:%S')] model=$MODEL len=${#PROMPT}" >> "$LOG"
46
+
47
+ RESPONSE=$(CF_TOKEN="$TOKEN" CF_ACCOUNT="$ACCOUNT" python3 -c "
48
+ import os
49
+ exec(open(os.path.expanduser('~/.surrogate/bin/lib/dns_fallback.py')).read())
50
+ exec(open(os.path.expanduser('~/.surrogate/bin/lib/bridge_retry.py')).read())
51
+ import json, sys
52
+ body = {
53
+ 'model': '$MODEL',
54
+ 'messages': [{'role':'user','content': sys.stdin.read()}],
55
+ 'max_tokens': $MAX_TOKENS, 'temperature': $TEMP,
56
+ }
57
+ url = f\"https://api.cloudflare.com/client/v4/accounts/{os.environ['CF_ACCOUNT']}/ai/v1/chat/completions\"
58
+ try:
59
+ d = request_with_retry(
60
+ url,
61
+ data=json.dumps(body).encode(),
62
+ headers={
63
+ 'Content-Type':'application/json',
64
+ 'User-Agent':'hermes-agent/1.0',
65
+ 'Authorization':'Bearer '+os.environ['CF_TOKEN'],
66
+ },
67
+ timeout=120, max_retries=6, base_delay=5.0, open_seconds=180,
68
+ )
69
+ print(d.get('choices',[{}])[0].get('message',{}).get('content',''))
70
+ except Exception as e:
71
+ print(f'cloudflare-bridge error: {e}', file=sys.stderr); sys.exit(1)
72
+ " <<< "$PROMPT")
73
+ RC=$?
74
+ echo "[$(date '+%H:%M:%S')] rc=$RC bytes=${#RESPONSE}" >> "$LOG"
75
+ [[ $RC -ne 0 ]] && exit $RC
76
+ echo "$RESPONSE"
bin/crawl-rss.py CHANGED
@@ -5,7 +5,7 @@ Reads feed URLs from FEEDS env or default list, parses entries, writes JSONL
  to output file. Only writes entries not seen before (dedup by URL).

  Usage (from bash):
- OUT=/tmp/out.jsonl python3 ~/.claude/bin/crawl-rss.py
+ OUT=/tmp/out.jsonl python3 ~/.surrogate/bin/crawl-rss.py

  All feeds VERIFIED to return 200 as of 2026-04-19. Failures are logged,
  not fatal — one bad feed doesn't kill the rest.
@@ -86,7 +86,7 @@ FEEDS: list[tuple[str, str]] = [
  ]

  OUT_PATH = os.environ.get("OUT", "/tmp/rss-crawl.jsonl")
- SEEN_PATH = os.environ.get("SEEN", os.path.expanduser("~/.claude/.rss-seen.json"))
+ SEEN_PATH = os.environ.get("SEEN", os.path.expanduser("~/.surrogate/.rss-seen.json"))
  MAX_ENTRIES_PER_FEED = int(os.environ.get("MAX_PER_FEED", "10"))
  TIMEOUT = int(os.environ.get("TIMEOUT", "15"))

bin/daily-crawl.sh CHANGED
@@ -14,17 +14,17 @@ while [ $# -gt 0 ]; do
  done

  export PATH=/usr/bin:/bin:/usr/local/bin:/opt/homebrew/bin:$PATH
- source ~/.claude/.env 2>/dev/null || true
+ source ~/.hermes/.env 2>/dev/null || true
  # Also source ~/.hermes/.env (where Surrogate keeps the live tokens)
  set -a; source ~/.hermes/.env 2>/dev/null || true; set +a

  DATE=$(date +%Y-%m-%d)
  CRAWL_DIR="$HOME/Documents/Obsidian Vault/AI-Hub/crawls/$DATE"
- mkdir -p "$CRAWL_DIR/raw" "$HOME/.claude/logs"
- LOG="$HOME/.claude/logs/crawl-$DATE.log"
+ mkdir -p "$CRAWL_DIR/raw" "$HOME/.surrogate/logs"
+ LOG="$HOME/.surrogate/logs/crawl-$DATE.log"
  log() { echo "[$(date +%H:%M:%S)] $*" | tee -a "$LOG"; }

- PY=~/.claude/venv/bin/python
+ PY=~/.surrogate/venv/bin/python

  # ═══════════ SOURCES — use Python scripts with explicit env passing ═══════════

@@ -403,6 +403,6 @@ for d in dirs[:60]:
  PY

  # Graph sync (async)
- [ -x "$HOME/.claude/bin/graph-sync.sh" ] && ("$HOME/.claude/bin/graph-sync.sh" > /dev/null 2>&1 &) || true
+ [ -x "$HOME/.surrogate/bin/graph-sync.sh" ] && ("$HOME/.surrogate/bin/graph-sync.sh" > /dev/null 2>&1 &) || true

  log "=== Done: $CRAWL_DIR/digest.md ==="
bin/dataset-enrich.sh CHANGED
@@ -17,13 +17,13 @@
  set -uo pipefail
  set -a; source "$HOME/.hermes/.env" 2>/dev/null; set +a

- LOG="$HOME/.claude/logs/dataset-enrich.log"
+ LOG="$HOME/.surrogate/logs/dataset-enrich.log"
  WORK="$HOME/.hermes/workspace/dataset-enrich"
  mkdir -p "$WORK" "$(dirname "$LOG")"

  echo "[$(date +%H:%M:%S)] dataset enrich start" | tee "$LOG"

- ~/.claude/venv/bin/python <<'PYEOF' 2>&1 | tee -a "$LOG"
+ ~/.surrogate/venv/bin/python <<'PYEOF' 2>&1 | tee -a "$LOG"
  from huggingface_hub import HfApi
  from pathlib import Path
  from datasets import load_dataset
bin/dev-cloud-daemon.sh CHANGED
@@ -8,7 +8,7 @@ set -u

  PROVIDER="${1:?usage: dev-cloud-daemon.sh <github|samba|cloudflare|groq|gemini>}"

- LOG="$HOME/.claude/logs/dev-cloud-daemon-${PROVIDER}.log"
+ LOG="$HOME/.surrogate/logs/dev-cloud-daemon-${PROVIDER}.log"
  mkdir -p "$(dirname "$LOG")"

  # Redis connection: prefer Unix socket, fall back to TCP 127.0.0.1:6379.
@@ -65,15 +65,15 @@ except: print('OK')" 2>/dev/null)
  # and works on exactly what the daemon locked (avoids "no free priority"
  # dead-ends when the file lock was touched earlier for this same PRIO_ID).
  HERMES_PRIO_ID="$PRIO_ID" \
- "$HOME/.claude/bin/dev-cloud-worker.sh" "$PROVIDER" 2>&1 | tail -3 >> "$LOG"
+ "$HOME/.surrogate/bin/dev-cloud-worker.sh" "$PROVIDER" 2>&1 | tail -3 >> "$LOG"
  RC=${PIPESTATUS[0]}
  DUR=$(( $(date +%s) - START ))
  echo "[$(date '+%H:%M:%S')] $PROVIDER $PRIO_ID done in ${DUR}s (rc=$RC)" >> "$LOG"

  # Discord: only notify failures + slow tasks (avoid spam on every success)
  if [[ $RC -ne 0 ]]; then
- "$HOME/.claude/bin/notify-discord.sh" error "Worker failed" "$PROVIDER · $PRIO_ID · ${DUR}s · rc=$RC" 2>/dev/null &
+ "$HOME/.surrogate/bin/notify-discord.sh" error "Worker failed" "$PROVIDER · $PRIO_ID · ${DUR}s · rc=$RC" 2>/dev/null &
  elif [[ $DUR -gt 240 ]]; then
- "$HOME/.claude/bin/notify-discord.sh" warn "Slow task" "$PROVIDER · $PRIO_ID · ${DUR}s" 2>/dev/null &
+ "$HOME/.surrogate/bin/notify-discord.sh" warn "Slow task" "$PROVIDER · $PRIO_ID · ${DUR}s" 2>/dev/null &
  fi
  done
bin/dev-cloud-worker.sh CHANGED
@@ -7,17 +7,17 @@
  # provider = github | samba | cloudflare | groq | gemini
  #
  # Rate-limit aware per provider (set by cron schedule, NOT inside script).
- # Cross-worker coordination: lockfile per (priority, provider) in ~/.claude/state/dev-locks/
+ # Cross-worker coordination: lockfile per (priority, provider) in ~/.surrogate/state/dev-locks/
  # Global priority lock: 30-min window, so same priority only gets fresh attempt per provider
  # every 30 min (prevents redundant work, allows tournament of implementations over time).
  set -u

  PROVIDER="${1:?usage: dev-cloud-worker.sh <github|samba|cloudflare|groq|gemini>}"

- LOG="$HOME/.claude/logs/dev-cloud-$PROVIDER.log"
+ LOG="$HOME/.surrogate/logs/dev-cloud-$PROVIDER.log"
  OUT_DIR="$HOME/.hermes/workspace/dev-cloud-$PROVIDER"
  SHARED="$HOME/.hermes/workspace/swarm-shared"
- LOCK_DIR="$HOME/.claude/state/dev-locks"
+ LOCK_DIR="$HOME/.surrogate/state/dev-locks"
  mkdir -p "$(dirname "$LOG")" "$OUT_DIR" "$LOCK_DIR"

  START=$(date +%s)
@@ -143,7 +143,7 @@ PRIO_PROJECT=$(echo "$PRIORITY" | python3 -c "import json,sys; print(json.loads(
  echo "[$(date '+%H:%M:%S')] $PROVIDER picked $PRIO_ID ($PRIO_PROJECT: ${PRIO_TITLE:0:60})" >> "$LOG"

  # -------- Rich context injection (B: enrich with repo + similar funcs + few-shot + deltas) --------
- source "$HOME/.claude/bin/lib/context_builder.sh"
+ source "$HOME/.surrogate/bin/lib/context_builder.sh"
  build_rich_context "$PRIO_PROJECT" "$PRIO_ID" "$PRIO_TITLE"
  # Sets: REPO_MAP, SIMILAR_FUNCS, RAG_EXAMPLES, SEMANTIC_RAG, FEWSHOT_ACCEPTED, ANTI_PATTERNS, PROMPT_DELTAS, PRIO_SPEC

@@ -285,37 +285,37 @@ case "$PROVIDER" in
  github)
  # Codestral-2501 is Mistral's dedicated code model — free via PAT, top-tier for code tasks.
  # Better than gpt-4o-mini for coding specifically. Budget-aware: falls through if HALT.
- RESULT=$(echo "$PROMPT" | "$HOME/.claude/bin/github-bridge.sh" --model codestral 2>>"$LOG")
+ RESULT=$(echo "$PROMPT" | "$HOME/.surrogate/bin/github-bridge.sh" --model codestral 2>>"$LOG")
  ;;
  samba|sambanova)
- RESULT=$(echo "$PROMPT" | "$HOME/.claude/bin/sambanova-bridge.sh" --model deepseek 2>>"$LOG")
+ RESULT=$(echo "$PROMPT" | "$HOME/.surrogate/bin/sambanova-bridge.sh" --model deepseek 2>>"$LOG")
  ;;
  cloudflare|cf)
- RESULT=$(echo "$PROMPT" | "$HOME/.claude/bin/cloudflare-bridge.sh" --model deepseek 2>>"$LOG")
+ RESULT=$(echo "$PROMPT" | "$HOME/.surrogate/bin/cloudflare-bridge.sh" --model deepseek 2>>"$LOG")
  ;;
  groq)
- RESULT=$(echo "$PROMPT" | "$HOME/.claude/bin/groq-bridge.sh" --model qwen 2>>"$LOG")
+ RESULT=$(echo "$PROMPT" | "$HOME/.surrogate/bin/groq-bridge.sh" --model qwen 2>>"$LOG")
  ;;
  gemini)
  # Use ai-fallback's gemini path
- RESULT=$(echo "$PROMPT" | "$HOME/.claude/bin/ai-fallback.sh" --force gemini 2>>"$LOG")
+ RESULT=$(echo "$PROMPT" | "$HOME/.surrogate/bin/ai-fallback.sh" --force gemini 2>>"$LOG")
  ;;
  cerebras)
  # Wafer-scale — fastest inference on planet (~2000 tok/s). Qwen3 235B excellent for code.
- RESULT=$(echo "$PROMPT" | "$HOME/.claude/bin/cerebras-bridge.sh" --model big 2>>"$LOG")
+ RESULT=$(echo "$PROMPT" | "$HOME/.surrogate/bin/cerebras-bridge.sh" --model big 2>>"$LOG")
  ;;
  nvidia|nim)
  # NVIDIA NIM — Llama 3.3 70B, diverse model pool (Nemotron, DeepSeek-R1, Qwen-Coder)
- RESULT=$(echo "$PROMPT" | "$HOME/.claude/bin/nvidia-bridge.sh" --model qwen 2>>"$LOG")
+ RESULT=$(echo "$PROMPT" | "$HOME/.surrogate/bin/nvidia-bridge.sh" --model qwen 2>>"$LOG")
  ;;
  chutes)
  # Chutes.ai aggregator — free tier needs activation; currently may 402
- RESULT=$(echo "$PROMPT" | "$HOME/.claude/bin/chutes-bridge.sh" --model deepseek 2>>"$LOG")
+ RESULT=$(echo "$PROMPT" | "$HOME/.surrogate/bin/chutes-bridge.sh" --model deepseek 2>>"$LOG")
  ;;
  surrogate|surrogate-1)
  # น้อง — local Ollama, Ashira-personalized (Qwen2.5-Coder-7B + Thai/DevSecOps prompt)
  # Will be upgraded with LoRA adapter after RunPod training.
- RESULT=$(echo "$PROMPT" | "$HOME/.claude/bin/surrogate-bridge.sh" 2>>"$LOG")
+ RESULT=$(echo "$PROMPT" | "$HOME/.surrogate/bin/surrogate-bridge.sh" 2>>"$LOG")
  ;;
  *)
  echo "[$(date '+%H:%M:%S')] unknown provider $PROVIDER" >> "$LOG"
bin/domain-scrape-loop.sh CHANGED
@@ -8,10 +8,10 @@
8
  set -u
9
  DUR="${1:-900}"
10
  PARALLEL="${2:-3}"
11
- LOG="$HOME/.claude/logs/domain-scrape-loop.log"
12
  START=$(date +%s)
13
  BEFORE_PAIRS=$(wc -l ~/axentx/surrogate/data/training-jsonl/*.jsonl 2>/dev/null | tail -1 | awk '{print $1}')
14
- BEFORE_LEDGER=$(sqlite3 ~/.claude/state/scrape-ledger.db "SELECT COUNT(*) FROM scraped" 2>/dev/null)
15
 
16
  echo "═══ LOOP START $(date +%H:%M:%S) duration=${DUR}s parallel=$PARALLEL" | tee -a "$LOG"
17
  echo " before: pairs=$BEFORE_PAIRS ledger_repos=$BEFORE_LEDGER" | tee -a "$LOG"
@@ -33,7 +33,7 @@ while true; do
33
  # Fire N parallel instances, each picks different domain via ledger
34
  for i in $(seq 1 $PARALLEL); do
35
  (
36
- ~/.claude/bin/github-domain-scrape.sh >> "$LOG" 2>&1
37
  ) &
38
  done
39
  wait # wait all parallel to finish (30-60s typical)
@@ -44,13 +44,13 @@ while true; do
44
  # Progress every 5 iters
45
  if (( ITER % 5 == 0 )); then
46
  PAIRS=$(wc -l ~/axentx/surrogate/data/training-jsonl/*.jsonl 2>/dev/null | tail -1 | awk '{print $1}')
47
- LEDGER=$(sqlite3 ~/.claude/state/scrape-ledger.db "SELECT COUNT(*) FROM scraped" 2>/dev/null)
48
  echo " [iter=$ITER $((NOW - START))s] pairs=$PAIRS (+$((PAIRS - BEFORE_PAIRS))) ledger=$LEDGER (+$((LEDGER - BEFORE_LEDGER)))" | tee -a "$LOG"
49
  fi
50
  done
51
 
52
  AFTER_PAIRS=$(wc -l ~/axentx/surrogate/data/training-jsonl/*.jsonl 2>/dev/null | tail -1 | awk '{print $1}')
53
- AFTER_LEDGER=$(sqlite3 ~/.claude/state/scrape-ledger.db "SELECT COUNT(*) FROM scraped" 2>/dev/null)
54
  echo "═══ LOOP DONE $(date +%H:%M:%S)" | tee -a "$LOG"
55
  echo " iters: $ITER" | tee -a "$LOG"
56
  echo " pairs added: $((AFTER_PAIRS - BEFORE_PAIRS))" | tee -a "$LOG"
 
8
  set -u
9
  DUR="${1:-900}"
10
  PARALLEL="${2:-3}"
11
+ LOG="$HOME/.surrogate/logs/domain-scrape-loop.log"
12
  START=$(date +%s)
13
  BEFORE_PAIRS=$(wc -l ~/axentx/surrogate/data/training-jsonl/*.jsonl 2>/dev/null | tail -1 | awk '{print $1}')
14
+ BEFORE_LEDGER=$(sqlite3 ~/.surrogate/state/scrape-ledger.db "SELECT COUNT(*) FROM scraped" 2>/dev/null)
15
 
16
  echo "═══ LOOP START $(date +%H:%M:%S) duration=${DUR}s parallel=$PARALLEL" | tee -a "$LOG"
17
  echo " before: pairs=$BEFORE_PAIRS ledger_repos=$BEFORE_LEDGER" | tee -a "$LOG"
 
33
  # Fire N parallel instances, each picks different domain via ledger
34
  for i in $(seq 1 $PARALLEL); do
35
  (
36
+ ~/.surrogate/bin/github-domain-scrape.sh >> "$LOG" 2>&1
37
  ) &
38
  done
39
  wait # wait all parallel to finish (30-60s typical)
 
44
  # Progress every 5 iters
45
  if (( ITER % 5 == 0 )); then
46
  PAIRS=$(wc -l ~/axentx/surrogate/data/training-jsonl/*.jsonl 2>/dev/null | tail -1 | awk '{print $1}')
47
+ LEDGER=$(sqlite3 ~/.surrogate/state/scrape-ledger.db "SELECT COUNT(*) FROM scraped" 2>/dev/null)
48
  echo " [iter=$ITER $((NOW - START))s] pairs=$PAIRS (+$((PAIRS - BEFORE_PAIRS))) ledger=$LEDGER (+$((LEDGER - BEFORE_LEDGER)))" | tee -a "$LOG"
49
  fi
50
  done
51
 
52
  AFTER_PAIRS=$(wc -l ~/axentx/surrogate/data/training-jsonl/*.jsonl 2>/dev/null | tail -1 | awk '{print $1}')
53
+ AFTER_LEDGER=$(sqlite3 ~/.surrogate/state/scrape-ledger.db "SELECT COUNT(*) FROM scraped" 2>/dev/null)
54
  echo "═══ LOOP DONE $(date +%H:%M:%S)" | tee -a "$LOG"
55
  echo " iters: $ITER" | tee -a "$LOG"
56
  echo " pairs added: $((AFTER_PAIRS - BEFORE_PAIRS))" | tee -a "$LOG"
bin/github-bridge.sh ADDED
@@ -0,0 +1,94 @@
1
+ #!/usr/bin/env bash
2
+ # GitHub Models bridge — free-tier GPT-4o / Llama 3.3 / Mistral via GitHub PAT
3
+ # Endpoint: https://models.github.ai/inference (OpenAI-compat)
4
+ # Key env: GITHUB_MODELS_TOKEN (preferred) or GITHUB_TOKEN
5
+ # Usage: github-bridge.sh [--model MODEL] "<prompt>" | echo "..." | github-bridge.sh
6
+ set -u
7
+ # Default: full GPT-4o (free via PAT, far smarter than mini, same daily quota)
8
+ MODEL="openai/gpt-4o"
9
+ MAX_TOKENS=2000
10
+ TEMP=0.3
11
+ PROMPT=""
12
+
13
+ # Aliases reflect ONLY models verified working with free PAT (2026-04).
14
+ # GPT-5/o3/o1-mini etc. appear in /catalog but API returns 403/unavailable — not usable.
15
+ while [[ $# -gt 0 ]]; do
16
+ case "$1" in
17
+ --model)
18
+ case "$2" in
19
+ # OpenAI
20
+ gpt4o|gpt-4o) MODEL="openai/gpt-4o" ;;
21
+ mini|gpt-4o-mini) MODEL="openai/gpt-4o-mini" ;;
22
+ gpt41|gpt-4.1) MODEL="openai/gpt-4.1" ;;
23
+ gpt41-mini|gpt-4.1-mini) MODEL="openai/gpt-4.1-mini" ;;
24
+ # Meta Llama
25
+ llama|llama70) MODEL="meta/Llama-3.3-70B-Instruct" ;;
26
+ llama4|maverick) MODEL="meta/llama-4-maverick-17b-128e-instruct-fp8" ;;
27
+ llama405) MODEL="meta/meta-llama-3.1-405b-instruct" ;;
28
+ # DeepSeek
29
+ deepseek|deepseek-v3) MODEL="deepseek/deepseek-v3-0324" ;;
30
+ deepseek-r1|r1|reasoning) MODEL="deepseek/DeepSeek-R1" ;;
31
+ deepseek-r1-latest) MODEL="deepseek/deepseek-r1-0528" ;;
32
+ # xAI
33
+ grok|grok3) MODEL="xai/grok-3" ;;
34
+ grok-mini) MODEL="xai/grok-3-mini" ;;
35
+ # Mistral
36
+ mistral|mistral-medium) MODEL="mistral-ai/mistral-medium-2505" ;;
37
+ codestral|code) MODEL="mistral-ai/codestral-2501" ;;
38
+ # Microsoft Phi
39
+ phi|phi4) MODEL="microsoft/phi-4" ;;
40
+ # Cohere
41
+ cohere|command-a) MODEL="cohere/cohere-command-a" ;;
42
+ command-r) MODEL="cohere/cohere-command-r-plus-08-2024" ;;
43
+ *) MODEL="$2" ;;
44
+ esac; shift 2 ;;
45
+ --max-tokens) MAX_TOKENS="$2"; shift 2 ;;
46
+ --temperature) TEMP="$2"; shift 2 ;;
47
+ *) PROMPT="$*"; break ;;
48
+ esac
49
+ done
50
+ [[ -z "$PROMPT" ]] && [[ ! -t 0 ]] && PROMPT=$(cat)
51
+ [[ -z "$PROMPT" ]] && { echo "github-bridge: no prompt" >&2; exit 2; }
52
+
53
+ LOG="$HOME/.surrogate/logs/github-bridge.log"
54
+ mkdir -p "$(dirname "$LOG")"
55
+ set -a; source "$HOME/.hermes/.env" 2>/dev/null || true; set +a
56
+
57
+ # Prefer dedicated models token, fall back to general PAT
58
+ TOKEN="${GITHUB_MODELS_TOKEN:-${GITHUB_TOKEN:-}}"
59
+ if [[ -z "$TOKEN" ]]; then
60
+ echo "github-bridge: missing GITHUB_MODELS_TOKEN or GITHUB_TOKEN in ~/.hermes/.env" >&2
61
+ exit 3
62
+ fi
63
+
64
+ echo "[$(date '+%H:%M:%S')] model=$MODEL len=${#PROMPT}" >> "$LOG"
65
+
66
+ RESPONSE=$(GH_TOKEN="$TOKEN" python3 -c "
67
+ import os
68
+ exec(open(os.path.expanduser('~/.surrogate/bin/lib/dns_fallback.py')).read())
69
+ exec(open(os.path.expanduser('~/.surrogate/bin/lib/bridge_retry.py')).read())
70
+ import json, sys
71
+ body = {
72
+ 'model': '$MODEL',
73
+ 'messages': [{'role':'user','content': sys.stdin.read()}],
74
+ 'max_tokens': $MAX_TOKENS, 'temperature': $TEMP,
75
+ }
76
+ try:
77
+ d = request_with_retry(
78
+ 'https://models.github.ai/inference/chat/completions',
79
+ data=json.dumps(body).encode(),
80
+ headers={
81
+ 'Content-Type':'application/json',
82
+ 'User-Agent':'hermes-agent/1.0',
83
+ 'Authorization':'Bearer '+os.environ['GH_TOKEN'],
84
+ },
85
+ timeout=120, max_retries=4, base_delay=2.0,
86
+ )
87
+ print(d.get('choices',[{}])[0].get('message',{}).get('content',''))
88
+ except Exception as e:
89
+ print(f'github-bridge error: {e}', file=sys.stderr); sys.exit(1)
90
+ " <<< "$PROMPT")
91
+ RC=$?
92
+ echo "[$(date '+%H:%M:%S')] rc=$RC bytes=${#RESPONSE}" >> "$LOG"
93
+ [[ $RC -ne 0 ]] && exit $RC
94
+ echo "$RESPONSE"
bin/github-domain-scrape.sh CHANGED
@@ -4,13 +4,13 @@
4
  set -u
5
  set -a; source "$HOME/.hermes/.env" 2>/dev/null; set +a
6
 
7
- LEDGER="$HOME/.claude/state/scrape-ledger.db"
8
- LOG="$HOME/.claude/logs/github-domain-scrape.log"
9
  DATE=$(date +%Y-%m-%d)
10
  OUT="$HOME/axentx/surrogate/data/training-jsonl/github-domain-${DATE}.jsonl"
11
  mkdir -p "$(dirname "$LOG")" "$(dirname "$OUT")"
12
 
13
- [[ ! -f "$LEDGER" ]] && bash "$HOME/.claude/bin/scrape-ledger-init.sh"
14
 
15
  TARGET="${1:-}"
16
  export LEDGER OUT GITHUB_TOKEN GITHUB_TOKEN_POOL TARGET
 
4
  set -u
5
  set -a; source "$HOME/.hermes/.env" 2>/dev/null; set +a
6
 
7
+ LEDGER="$HOME/.surrogate/state/scrape-ledger.db"
8
+ LOG="$HOME/.surrogate/logs/github-domain-scrape.log"
9
  DATE=$(date +%Y-%m-%d)
10
  OUT="$HOME/axentx/surrogate/data/training-jsonl/github-domain-${DATE}.jsonl"
11
  mkdir -p "$(dirname "$LOG")" "$(dirname "$OUT")"
12
 
13
+ [[ ! -f "$LEDGER" ]] && bash "$HOME/.surrogate/bin/scrape-ledger-init.sh"
14
 
15
  TARGET="${1:-}"
16
  export LEDGER OUT GITHUB_TOKEN GITHUB_TOKEN_POOL TARGET
bin/graph-sync.sh ADDED
@@ -0,0 +1,107 @@
1
+ #!/usr/bin/env bash
2
+ # ... (original content unchanged)
3
+ # Sync Obsidian markdown patterns/knowledge → FalkorDB Lite (graph DB)
4
+ # Complements rag-index.sh (vector DB) — same sources, 2 different indexes.
5
+ set -e
6
+ PYTHON="${HOME}/.surrogate/venv/bin/python"
7
+ [ -x "$PYTHON" ] || { echo "venv not found: $PYTHON"; exit 1; }
8
+
9
+ "$PYTHON" <<'PY'
10
+ import re, os
11
+ from pathlib import Path
12
+ from redislite.falkordb_client import FalkorDB
13
+ import yaml
14
+
15
+ HOME = Path.home()
16
+ SOURCES = [
17
+ HOME / "Documents/Obsidian Vault/AI-Hub/patterns",
18
+ HOME / "Documents/Obsidian Vault/AI-Hub/knowledge",
19
+ HOME / "Documents/Obsidian Vault/AI-Hub/inbox",
20
+ HOME / ".surrogate/memory",
21
+ ]
22
+ DB_FILE = str(HOME / ".surrogate/graph-db.rdb")
23
+
24
+ db = FalkorDB(dbfilename=DB_FILE)
25
+ g = db.select_graph("ashira")
26
+
27
+ try: g.query("MATCH (n) DETACH DELETE n")
28
+ except: pass
29
+
30
+ frontmatter_re = re.compile(r'^---\n(.*?)\n---', re.DOTALL)
31
+ wikilink_re = re.compile(r'\[\[([^\]|]+?)(?:\|[^\]]+)?\]\]')
32
+
33
+ def esc(s):
34
+ return str(s).replace("\\", "\\\\").replace("'", "\\'") if s else ""
35
+
36
+ nodes = {}
37
+ edges = []
38
+
39
+ for src in SOURCES:
40
+ if not src.exists(): continue
41
+ for md in src.rglob("*.md"):
42
+ stem = md.stem
43
+ text = md.read_text(errors="ignore")
44
+ fm_match = frontmatter_re.match(text)
45
+ fm = {}
46
+ if fm_match:
47
+ try: fm = yaml.safe_load(fm_match.group(1)) or {}
48
+ except: pass
49
+
50
+ tags = fm.get("tags", [])
51
+ if isinstance(tags, str): tags = [tags]
52
+
53
+ nodes[stem] = {
54
+ "path": str(md.relative_to(HOME)),
55
+ "tags": [str(t).replace("#","") for t in tags],
56
+ "category": md.parent.name,
57
+ "severity": str(fm.get("severity", "medium")),
58
+ }
59
+
60
+ for link in wikilink_re.findall(text):
61
+ target = link.split("/")[-1].split("|")[0].replace(".md", "").strip()
62
+ if target and target != stem:
63
+ edges.append((stem, target))
64
+
65
+ for name, info in nodes.items():
66
+ g.query(
67
+ f"MERGE (n:Doc {{name:'{esc(name)}'}}) "
68
+ f"SET n.path='{esc(info['path'])}', "
69
+ f"n.category='{esc(info['category'])}', "
70
+ f"n.severity='{esc(info['severity'])}', "
71
+ f"n.tags='{esc(','.join(info['tags']))}'"
72
+ )
73
+
74
+ edge_count = 0
75
+ for src_name, tgt_name in edges:
76
+ try:
77
+ g.query(
78
+ f"MATCH (a:Doc {{name:'{esc(src_name)}'}}), (b:Doc {{name:'{esc(tgt_name)}'}}) "
79
+ f"MERGE (a)-[:LINKS_TO]->(b)"
80
+ )
81
+ edge_count += 1
82
+ except: pass
83
+
84
+ all_tags = set()
85
+ for info in nodes.values():
86
+ for t in info["tags"]:
87
+ if t: all_tags.add(t)
88
+ for t in all_tags:
89
+ g.query(f"MERGE (:Tag {{name:'{esc(t)}'}})")
90
+ for name, info in nodes.items():
91
+ for t in info["tags"]:
92
+ if not t: continue
93
+ g.query(
94
+ f"MATCH (d:Doc {{name:'{esc(name)}'}}), (t:Tag {{name:'{esc(t)}'}}) "
95
+ f"MERGE (d)-[:TAGGED]->(t)"
96
+ )
97
+
98
+ print(f"Graph built: {len(nodes)} docs, {edge_count} links, {len(all_tags)} tags")
99
+
100
+ r = g.query("MATCH (d:Doc)-[:TAGGED]->(t:Tag) RETURN t.name, count(d) AS c ORDER BY c DESC LIMIT 10")
101
+ print("\nTop 10 tags:")
102
+ for row in r.result_set: print(f" #{row[0]}: {row[1]} docs")
103
+
104
+ r = g.query("MATCH (d:Doc)-[r:LINKS_TO]-() RETURN d.name, count(r) AS c ORDER BY c DESC LIMIT 10")
105
+ print("\nTop 10 hubs (most connected):")
106
+ for row in r.result_set: print(f" {row[0]}: {row[1]} links")
107
+ PY
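A follow-up query against the graph built above could look like the sketch below; it reuses the same venv, DB file, and client as graph-sync.sh, and the 'security' tag is only an example (any tag from the "Top 10 tags" listing works):

"$HOME/.surrogate/venv/bin/python" - <<'PY'
from pathlib import Path
from redislite.falkordb_client import FalkorDB

g = FalkorDB(dbfilename=str(Path.home() / ".surrogate/graph-db.rdb")).select_graph("ashira")
r = g.query("MATCH (d:Doc)-[:TAGGED]->(t:Tag {name:'security'}) RETURN d.name LIMIT 5")
for row in r.result_set:
    print(row[0])
PY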
bin/groq-bridge.sh ADDED
@@ -0,0 +1,59 @@
1
+ #!/usr/bin/env bash
2
+ # Groq bridge — fast Llama/Qwen inference via Groq API (OpenAI-compat)
3
+ # Usage: groq-bridge.sh [--model MODEL] "<prompt>" | echo "..." | groq-bridge.sh
4
+ set -u
5
+ # Default: Llama 3.3 70B — best quality on Groq free tier (still ultra-fast).
6
+ # 8B is available as --model fast when latency matters more than quality.
7
+ MODEL="llama-3.3-70b-versatile"
8
+ MAX_TOKENS=2000
9
+ TEMP=0.3
10
+ PROMPT=""
11
+
12
+ while [[ $# -gt 0 ]]; do
13
+ case "$1" in
14
+ --model)
15
+ case "$2" in
16
+ fast|small|8b) MODEL="llama-3.1-8b-instant" ;;
17
+ llama|llama70) MODEL="llama-3.3-70b-versatile" ;;
18
+ qwen) MODEL="qwen/qwen3-32b" ;;
19
+ llama4|scout) MODEL="meta-llama/llama-4-scout-17b-16e-instruct" ;;
20
+ gpt-oss|oss) MODEL="openai/gpt-oss-120b" ;;
21
+ *) MODEL="$2" ;;
22
+ esac; shift 2 ;;
23
+ --max-tokens) MAX_TOKENS="$2"; shift 2 ;;
24
+ *) PROMPT="$*"; break ;;
25
+ esac
26
+ done
27
+ [[ -z "$PROMPT" ]] && [[ ! -t 0 ]] && PROMPT=$(cat)
28
+ [[ -z "$PROMPT" ]] && { echo "groq-bridge: no prompt" >&2; exit 2; }
29
+
30
+ LOG="$HOME/.surrogate/logs/groq-bridge.log"
31
+ mkdir -p "$(dirname "$LOG")"
32
+ set -a; source "$HOME/.hermes/.env"; set +a
33
+ echo "[$(date '+%H:%M:%S')] model=$MODEL len=${#PROMPT}" >> "$LOG"
34
+
35
+ RESPONSE=$(python3 -c "
36
+ import os
37
+ exec(open(os.path.expanduser('~/.surrogate/bin/lib/dns_fallback.py')).read())
38
+ exec(open(os.path.expanduser('~/.surrogate/bin/lib/bridge_retry.py')).read())
39
+ import json, sys
40
+ body = {
41
+ 'model': '$MODEL',
42
+ 'messages': [{'role':'user','content': sys.stdin.read()}],
43
+ 'max_tokens': $MAX_TOKENS, 'temperature': $TEMP,
44
+ }
45
+ try:
46
+ d = request_with_retry(
47
+ 'https://api.groq.com/openai/v1/chat/completions',
48
+ data=json.dumps(body).encode(),
49
+ headers={'Content-Type':'application/json', 'User-Agent':'hermes-agent/1.0', 'Authorization':'Bearer '+os.environ.get('GROQ_API_KEY','')},
50
+ timeout=120, max_retries=4, base_delay=2.0,
51
+ )
52
+ print(d.get('choices',[{}])[0].get('message',{}).get('content',''))
53
+ except Exception as e:
54
+ print(f'groq-bridge error: {e}', file=sys.stderr); sys.exit(1)
55
+ " <<< "$PROMPT")
56
+ RC=$?
57
+ echo "[$(date '+%H:%M:%S')] rc=$RC bytes=${#RESPONSE}" >> "$LOG"
58
+ [[ $RC -ne 0 ]] && exit $RC
59
+ echo "$RESPONSE"
bin/hermes-daily-summary.sh CHANGED
@@ -4,7 +4,7 @@
4
  set -u
5
  set -a; source "$HOME/.hermes/.env" 2>/dev/null; set +a
6
 
7
- LOG="$HOME/.claude/logs/hermes-daily-summary.log"
8
  mkdir -p "$(dirname "$LOG")"
9
 
10
  # ── Collect metrics ──────────────────────────────────────────────────────────
@@ -12,23 +12,23 @@ TODAY=$(date +%Y-%m-%d)
12
  YESTERDAY=$(date -v-1d +%Y-%m-%d 2>/dev/null || date -d 'yesterday' +%Y-%m-%d)
13
 
14
  # 1. Tasks completed (24h)
15
- TASKS_DONE=$(grep -c "done in" ~/.claude/logs/hermes-dev-*-daemon.log 2>/dev/null | awk -F: '{s+=$2} END{print s+0}')
16
 
17
  # 2. Tasks failed (24h)
18
- TASKS_FAIL=$(grep -c "failed after" ~/.claude/logs/hermes-dev-*-daemon.log 2>/dev/null | awk -F: '{s+=$2} END{print s+0}')
19
 
20
  # 3. Scrape activity
21
- SCRAPE_TOTAL=$(sqlite3 ~/.claude/state/scrape-ledger.db "SELECT COUNT(*) FROM scraped" 2>/dev/null || echo "?")
22
- SCRAPE_24H=$(sqlite3 ~/.claude/state/scrape-ledger.db "SELECT COUNT(*) FROM scraped WHERE scraped_at > datetime('now','-24 hours')" 2>/dev/null || echo "?")
23
 
24
  # 4. Training pairs
25
  PAIRS=$(wc -l ~/axentx/surrogate/data/training-jsonl/*.jsonl 2>/dev/null | tail -1 | awk '{print $1}' || echo "?")
26
 
27
  # 5. Index docs
28
- DOCS=$(sqlite3 ~/.claude/index.db "SELECT COUNT(*) FROM docs" 2>/dev/null || echo "?")
29
 
30
  # 6. Episodes (surrogate memory)
31
- EPISODES=$(wc -l ~/.claude/state/surrogate-memory/episodes.jsonl 2>/dev/null | awk '{print $1}' || echo 0)
32
 
33
  # 7. Daemons running
34
  DAEMONS_UP=$(pgrep -f "dev-cloud-daemon\|qwen-coder-daemon\|priority-json-watcher\|hermes" 2>/dev/null | wc -l | tr -d ' ')
@@ -41,7 +41,7 @@ for q in cerebras groq github samba nvidia cloudflare qwen-local; do
41
  done
42
 
43
  # 9. Recent errors (last 200 lines of each log)
44
- ERR_COUNT=$(tail -200 ~/.claude/logs/*.log 2>/dev/null | grep -cE "ERROR|CRITICAL|Fatal|429|500" 2>/dev/null || echo 0)
45
 
46
  # ── Build digest body ────────────────────────────────────────────────────────
47
  BODY="$(cat <<EOF
@@ -62,4 +62,4 @@ LEVEL="info"
62
 
63
  echo "[$(date '+%H:%M:%S')] sending daily summary (${LEVEL}): done=$TASKS_DONE fail=$TASKS_FAIL scrape=$SCRAPE_24H" >> "$LOG"
64
 
65
- "$HOME/.claude/bin/notify-discord.sh" "$LEVEL" "Hermes daily summary · $TODAY" "$BODY"
 
4
  set -u
5
  set -a; source "$HOME/.hermes/.env" 2>/dev/null; set +a
6
 
7
+ LOG="$HOME/.surrogate/logs/hermes-daily-summary.log"
8
  mkdir -p "$(dirname "$LOG")"
9
 
10
  # ── Collect metrics ──────────────────────────────────────────────────────────
 
12
  YESTERDAY=$(date -v-1d +%Y-%m-%d 2>/dev/null || date -d 'yesterday' +%Y-%m-%d)
13
 
14
  # 1. Tasks completed (24h)
15
+ TASKS_DONE=$(grep -c "done in" ~/.surrogate/logs/hermes-dev-*-daemon.log 2>/dev/null | awk -F: '{s+=$2} END{print s+0}')
16
 
17
  # 2. Tasks failed (24h)
18
+ TASKS_FAIL=$(grep -c "failed after" ~/.surrogate/logs/hermes-dev-*-daemon.log 2>/dev/null | awk -F: '{s+=$2} END{print s+0}')
19
 
20
  # 3. Scrape activity
21
+ SCRAPE_TOTAL=$(sqlite3 ~/.surrogate/state/scrape-ledger.db "SELECT COUNT(*) FROM scraped" 2>/dev/null || echo "?")
22
+ SCRAPE_24H=$(sqlite3 ~/.surrogate/state/scrape-ledger.db "SELECT COUNT(*) FROM scraped WHERE scraped_at > datetime('now','-24 hours')" 2>/dev/null || echo "?")
23
 
24
  # 4. Training pairs
25
  PAIRS=$(wc -l ~/axentx/surrogate/data/training-jsonl/*.jsonl 2>/dev/null | tail -1 | awk '{print $1}' || echo "?")
26
 
27
  # 5. Index docs
28
+ DOCS=$(sqlite3 ~/.surrogate/index.db "SELECT COUNT(*) FROM docs" 2>/dev/null || echo "?")
29
 
30
  # 6. Episodes (surrogate memory)
31
+ EPISODES=$(wc -l ~/.surrogate/state/surrogate-memory/episodes.jsonl 2>/dev/null | awk '{print $1}' || echo 0)
32
 
33
  # 7. Daemons running
34
  DAEMONS_UP=$(pgrep -f "dev-cloud-daemon\|qwen-coder-daemon\|priority-json-watcher\|hermes" 2>/dev/null | wc -l | tr -d ' ')
 
41
  done
42
 
43
  # 9. Recent errors (last 200 lines of each log)
44
+ ERR_COUNT=$(tail -200 ~/.surrogate/logs/*.log 2>/dev/null | grep -cE "ERROR|CRITICAL|Fatal|429|500" 2>/dev/null || echo 0)
45
 
46
  # ── Build digest body ────────────────────────────────────────────────────────
47
  BODY="$(cat <<EOF
 
62
 
63
  echo "[$(date '+%H:%M:%S')] sending daily summary (${LEVEL}): done=$TASKS_DONE fail=$TASKS_FAIL scrape=$SCRAPE_24H" >> "$LOG"
64
 
65
+ "$HOME/.surrogate/bin/notify-discord.sh" "$LEVEL" "Hermes daily summary · $TODAY" "$BODY"
bin/hermes-discord-bot.py CHANGED
@@ -10,7 +10,7 @@ Triggers (responds when):
10
  Pipes user message → surrogate -p "..." → replies with output.
11
 
12
  Token comes from $DISCORD_BOT_TOKEN (read from ~/.hermes/.env).
13
- Logs to ~/.claude/logs/hermes-discord-bot.log.
14
  """
15
  from __future__ import annotations
16
 
@@ -26,12 +26,12 @@ import discord
26
 
27
  # ── Config ───────────────────────────────────────────────────────────────────
28
  HOME = Path.home()
29
- LOG_PATH = HOME / ".claude/logs/hermes-discord-bot.log"
30
  LOG_PATH.parent.mkdir(parents=True, exist_ok=True)
31
 
32
- # surrogate CLI path: prefer ~/.local/bin (installed), fallback ~/.claude/bin
33
  SURROGATE_BIN = next(
34
- p for p in [HOME / ".local/bin/surrogate", HOME / ".claude/bin/surrogate"] if p.exists()
35
  )
36
 
37
  PREFIX_RE = re.compile(r"^[!/]sg\b\s*", re.IGNORECASE)
@@ -169,7 +169,7 @@ async def on_ready() -> None:
169
  log.info("connected as %s (id=%s)", client.user, client.user.id if client.user else "?")
170
  print(f"✅ logged in as {client.user}")
171
  # Notify Discord channel via webhook that bot came online
172
- notify = HOME / ".claude/bin/notify-discord.sh"
173
  if notify.exists():
174
  subprocess.Popen(
175
  [str(notify), "success", "Discord bot online", f"Connected as {client.user}. DM or @mention to chat."],
 
10
  Pipes user message → surrogate -p "..." → replies with output.
11
 
12
  Token comes from $DISCORD_BOT_TOKEN (read from ~/.hermes/.env).
13
+ Logs to ~/.surrogate/logs/hermes-discord-bot.log.
14
  """
15
  from __future__ import annotations
16
 
 
26
 
27
  # ── Config ───────────────────────────────────────────────────────────────────
28
  HOME = Path.home()
29
+ LOG_PATH = HOME / ".surrogate/logs/hermes-discord-bot.log"
30
  LOG_PATH.parent.mkdir(parents=True, exist_ok=True)
31
 
32
+ # surrogate CLI path: prefer ~/.local/bin (installed), fallback ~/.surrogate/bin
33
  SURROGATE_BIN = next(
34
+ p for p in [HOME / ".local/bin/surrogate", HOME / ".surrogate/bin/surrogate"] if p.exists()
35
  )
36
 
37
  PREFIX_RE = re.compile(r"^[!/]sg\b\s*", re.IGNORECASE)
 
169
  log.info("connected as %s (id=%s)", client.user, client.user.id if client.user else "?")
170
  print(f"✅ logged in as {client.user}")
171
  # Notify Discord channel via webhook that bot came online
172
+ notify = HOME / ".surrogate/bin/notify-discord.sh"
173
  if notify.exists():
174
  subprocess.Popen(
175
  [str(notify), "success", "Discord bot online", f"Connected as {client.user}. DM or @mention to chat."],
bin/hermes-status-server.py CHANGED
@@ -26,9 +26,9 @@ from pydantic import BaseModel
26
  app = FastAPI(title="hermes", docs_url=None, redoc_url=None)
27
 
28
  HOME = Path(os.environ.get("HOME", "/home/hermes"))
29
- LEDGER = HOME / ".claude/state/scrape-ledger.db"
30
- EPISODES = HOME / ".claude/state/surrogate-memory/episodes.jsonl"
31
- LOG_DIR = HOME / ".claude/logs"
32
 
33
 
34
  def _ledger_count() -> int:
@@ -92,7 +92,7 @@ async def chat(req: ChatRequest) -> JSONResponse:
92
  if not req.prompt.strip():
93
  raise HTTPException(status_code=400, detail="prompt is empty")
94
 
95
- surrogate_bin = HOME / ".claude/bin/surrogate"
96
  if not surrogate_bin.exists():
97
  raise HTTPException(status_code=503, detail="surrogate CLI not installed in container")
98
 
 
26
  app = FastAPI(title="hermes", docs_url=None, redoc_url=None)
27
 
28
  HOME = Path(os.environ.get("HOME", "/home/hermes"))
29
+ LEDGER = HOME / ".surrogate/state/scrape-ledger.db"
30
+ EPISODES = HOME / ".surrogate/state/surrogate-memory/episodes.jsonl"
31
+ LOG_DIR = HOME / ".surrogate/logs"
32
 
33
 
34
  def _ledger_count() -> int:
 
92
  if not req.prompt.strip():
93
  raise HTTPException(status_code=400, detail="prompt is empty")
94
 
95
+ surrogate_bin = HOME / ".surrogate/bin/surrogate"
96
  if not surrogate_bin.exists():
97
  raise HTTPException(status_code=503, detail="surrogate CLI not installed in container")
98
 
bin/lib/__init__.py ADDED
File without changes
bin/lib/bridge_retry.py ADDED
1
+ """Shared HTTP retry library for all cloud bridges.
2
+ Handles: exponential backoff + jitter + Retry-After + circuit breaker.
3
+ Import at top of any bridge: exec(open(...).read())
4
+
5
+ Exports: request_with_retry(url, data, headers, max_retries=4, base_delay=2.0)
6
+ """
7
+ import json as _json
8
+ import os as _os
9
+ import random as _random
10
+ import time as _time
11
+ import urllib.request as _urlreq
12
+ import urllib.error as _urlerr
13
+
14
+ # Circuit breaker state — persisted in /tmp so all bridge invocations share
15
+ _CB_DIR = "/tmp/bridge-circuits"
16
+ _os.makedirs(_CB_DIR, exist_ok=True)
17
+
18
+
19
+ def _cb_state_path(host):
20
+ return f"{_CB_DIR}/{host.replace('/', '_')}.json"
21
+
22
+
23
+ def _circuit_open(host):
24
+ p = _cb_state_path(host)
25
+ try:
26
+ with open(p) as f:
27
+ s = _json.load(f)
28
+ # Circuit closed after timeout
29
+ if _time.time() > s.get("open_until", 0):
30
+ return False, 0
31
+ return True, int(s["open_until"] - _time.time())
32
+ except Exception:
33
+ return False, 0
34
+
35
+
36
+ def _record_failure(host, open_seconds=60):
37
+ """Called on 429 or 5xx — track consecutive failures."""
38
+ p = _cb_state_path(host)
39
+ try:
40
+ with open(p) as f:
41
+ s = _json.load(f)
42
+ except Exception:
43
+ s = {"consec_fails": 0, "open_until": 0}
44
+ s["consec_fails"] = s.get("consec_fails", 0) + 1
45
+ # Open circuit after 3 consecutive failures
46
+ if s["consec_fails"] >= 3:
47
+ s["open_until"] = _time.time() + open_seconds
48
+ with open(p, "w") as f:
49
+ _json.dump(s, f)
50
+
51
+
52
+ def _record_success(host):
53
+ """Called on 2xx — reset failure counter."""
54
+ p = _cb_state_path(host)
55
+ try:
56
+ with open(p, "w") as f:
57
+ _json.dump({"consec_fails": 0, "open_until": 0}, f)
58
+ except Exception:
59
+ pass
60
+
61
+
62
+ def _parse_retry_after(headers, default_delay):
63
+ """Honor Retry-After header (seconds) or x-ratelimit-reset-after."""
64
+ for h in ("Retry-After", "retry-after", "x-ratelimit-reset-after", "x-ratelimit-reset"):
65
+ val = headers.get(h)
66
+ if val:
67
+ try:
68
+ n = int(val)
69
+ # x-ratelimit-reset may be absolute epoch — convert to delta
70
+ if n > 10_000_000_000: # way in future = epoch ms
71
+ n = n // 1000 - int(_time.time())
72
+ elif n > 1_000_000_000: # epoch seconds
73
+ n = n - int(_time.time())
74
+ return max(1, min(n, 300)) # clamp 1..300s
75
+ except (ValueError, TypeError):
76
+ pass
77
+ return default_delay
78
+
79
+
80
+ def request_with_retry(url, data, headers, timeout=120, max_retries=4, base_delay=2.0, open_seconds=60):
81
+ """Make HTTP request with exp-backoff retry + circuit breaker.
82
+
83
+ Args:
84
+ open_seconds: how long to open circuit after 3 consecutive failures.
85
+ Default 60s. Callers with strict per-minute rate limits (Cloudflare,
86
+ SambaNova) should use 120-180s so we don't hammer during cooldown.
87
+
88
+ Returns: parsed JSON response.
89
+ Raises: Exception if circuit open or max retries exhausted.
90
+ """
91
+ from urllib.parse import urlparse
92
+
93
+ host = urlparse(url).netloc
94
+
95
+ # Circuit breaker check
96
+ is_open, remaining = _circuit_open(host)
97
+ if is_open:
98
+ raise Exception(f"circuit-open for {host} ({remaining}s remaining)")
99
+
100
+ last_err = None
101
+ for attempt in range(max_retries):
102
+ try:
103
+ req = _urlreq.Request(url, data=data, headers=headers)
104
+ with _urlreq.urlopen(req, timeout=timeout) as r:
105
+ result = _json.load(r)
106
+ _record_success(host)
107
+ return result
108
+ except _urlerr.HTTPError as e:
109
+ last_err = e
110
+ if e.code == 429:
111
+ # Rate-limited — honor Retry-After
112
+ base = base_delay * (2 ** attempt)
113
+ delay = _parse_retry_after(e.headers, base)
114
+ delay *= (1 + _random.uniform(-0.2, 0.2)) # jitter ±20%
115
+ if attempt < max_retries - 1:
116
+ _time.sleep(min(delay, 60))
117
+ continue
118
+ _record_failure(host, open_seconds=open_seconds)
119
+ raise Exception(f"HTTP 429 after {max_retries} retries (last Retry-After: {delay:.0f}s)")
120
+ elif 500 <= e.code < 600:
121
+ # Server error — exp backoff with jitter
122
+ delay = base_delay * (2 ** attempt) * (1 + _random.uniform(-0.2, 0.2))
123
+ if attempt < max_retries - 1:
124
+ _time.sleep(min(delay, 30))
125
+ continue
126
+ _record_failure(host, open_seconds=open_seconds)
127
+ raise Exception(f"HTTP {e.code} after {max_retries} retries")
128
+ else:
129
+ # 4xx other than 429 — not retryable (client error)
130
+ _record_failure(host, open_seconds=open_seconds)
131
+ raise
132
+ except (_urlerr.URLError, _os.error) as e:
133
+ last_err = e
134
+ # Network error — retry with backoff
135
+ delay = base_delay * (2 ** attempt) * (1 + _random.uniform(-0.2, 0.2))
136
+ if attempt < max_retries - 1:
137
+ _time.sleep(min(delay, 30))
138
+ continue
139
+ _record_failure(host, open_seconds=open_seconds)
140
+ raise
141
+
142
+ raise Exception(f"max retries ({max_retries}) exhausted: {last_err}")
bin/lib/checkpoint.py ADDED
@@ -0,0 +1,146 @@
1
+ """Checkpoint store — JSONL event log per task, append-only.
2
+
3
+ Purpose:
4
+ - Crash-safe: every event appended immediately (no buffering)
5
+ - Resume-aware: load full event trail to reconstruct task state
6
+ - Distill-friendly: each file = complete conversation trace a future model can learn from
7
+
8
+ Event types:
9
+ task_start, codebase_review, provider_selected, stream_chunk, model_switch,
10
+ result_draft, review_requested, review_verdict, revision_requested, task_done,
11
+ task_failed, provider_probe
12
+
13
+ File layout:
14
+ ~/.surrogate/yolo/checkpoints/<task-id>.jsonl — live tasks
15
+ ~/.surrogate/yolo/checkpoints_done/<task-id>.jsonl — completed (archive)
16
+ """
17
+
18
+ from __future__ import annotations
19
+
20
+ import datetime as dt
21
+ import json
22
+ from dataclasses import dataclass
23
+ from pathlib import Path
24
+ from typing import Any, Iterator
25
+
26
+ CHECKPOINT_DIR = Path.home() / ".surrogate" / "yolo" / "checkpoints"
27
+ CHECKPOINT_DONE = Path.home() / ".surrogate" / "yolo" / "checkpoints_done"
28
+
29
+
30
+ def _now() -> str:
31
+ return dt.datetime.now(dt.timezone.utc).isoformat()
32
+
33
+
34
+ @dataclass
35
+ class Checkpoint:
36
+ task_id: str
37
+ path: Path
38
+
39
+ @classmethod
40
+ def open(cls, task_id: str) -> "Checkpoint":
41
+ CHECKPOINT_DIR.mkdir(parents=True, exist_ok=True)
42
+ return cls(task_id=task_id, path=CHECKPOINT_DIR / f"{task_id}.jsonl")
43
+
44
+ def append(self, event_type: str, **fields: Any) -> None:
45
+ """Atomically append event. Fields serialize via JSON."""
46
+ rec = {"t": _now(), "event": event_type, **fields}
47
+ with open(self.path, "a") as f:
48
+ f.write(json.dumps(rec, ensure_ascii=False, default=str) + "\n")
49
+
50
+ def events(self) -> list[dict]:
51
+ if not self.path.exists():
52
+ return []
53
+ out = []
54
+ with open(self.path) as f:
55
+ for line in f:
56
+ line = line.strip()
57
+ if not line:
58
+ continue
59
+ try:
60
+ out.append(json.loads(line))
61
+ except json.JSONDecodeError:
62
+ continue
63
+ return out
64
+
65
+ def last_event(self, event_type: str = "") -> dict | None:
66
+ for e in reversed(self.events()):
67
+ if not event_type or e.get("event") == event_type:
68
+ return e
69
+ return None
70
+
71
+ def resume_state(self) -> dict:
72
+ """Reconstruct what we know from the event trail.
73
+
74
+ Returns:
75
+ {
76
+ "started": bool,
77
+ "completed": bool,
78
+ "failed": bool,
79
+ "current_model": str | None,
80
+ "draft_text": str (partial output so far),
81
+ "attempts": int,
82
+ "last_event": dict | None,
83
+ "artifacts_reviewed": list[str],
84
+ "review_iterations": int,
85
+ }
86
+ """
87
+ ev = self.events()
88
+ state = {
89
+ "started": False,
90
+ "completed": False,
91
+ "failed": False,
92
+ "current_model": None,
93
+ "draft_text": "",
94
+ "attempts": 0,
95
+ "last_event": ev[-1] if ev else None,
96
+ "artifacts_reviewed": [],
97
+ "review_iterations": 0,
98
+ }
99
+ for e in ev:
100
+ etype = e.get("event")
101
+ if etype == "task_start":
102
+ state["started"] = True
103
+ elif etype == "provider_selected":
104
+ state["current_model"] = e.get("model")
105
+ state["attempts"] += 1
106
+ elif etype == "model_switch":
107
+ state["current_model"] = e.get("to")
108
+ elif etype == "codebase_review":
109
+ state["artifacts_reviewed"] = e.get("artifacts", [])
110
+ elif etype == "result_draft":
111
+ state["draft_text"] = e.get("text", state["draft_text"])
112
+ elif etype == "review_verdict":
113
+ state["review_iterations"] += 1
114
+ elif etype == "task_done":
115
+ state["completed"] = True
116
+ elif etype == "task_failed":
117
+ state["failed"] = True
118
+ return state
119
+
120
+ def archive(self) -> None:
121
+ """Move to checkpoints_done/ after task complete."""
122
+ CHECKPOINT_DONE.mkdir(parents=True, exist_ok=True)
123
+ dest = CHECKPOINT_DONE / self.path.name
124
+ if self.path.exists():
125
+ self.path.rename(dest)
126
+ self.path = dest
127
+
128
+
129
+ def list_active() -> list[str]:
130
+ if not CHECKPOINT_DIR.exists():
131
+ return []
132
+ return [p.stem for p in CHECKPOINT_DIR.glob("*.jsonl")]
133
+
134
+
135
+ if __name__ == "__main__":
136
+ import sys
137
+ if len(sys.argv) < 2:
138
+ print("usage: checkpoint.py <task-id> [replay]")
139
+ sys.exit(1)
140
+ cp = Checkpoint.open(sys.argv[1])
141
+ if len(sys.argv) > 2 and sys.argv[2] == "replay":
142
+ for e in cp.events():
143
+ print(json.dumps(e, ensure_ascii=False))
144
+ else:
145
+ state = cp.resume_state()
146
+ print(json.dumps(state, indent=2, ensure_ascii=False, default=str))
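Typical producer/consumer flow — the task id is hypothetical; lib.checkpoint is importable because bin/lib/ ships an __init__.py:

python3 - <<'PY'
import os, sys
sys.path.insert(0, os.path.expanduser('~/.surrogate/bin'))
from lib.checkpoint import Checkpoint

cp = Checkpoint.open("demo-task-001")              # hypothetical task id
cp.append("task_start", task="demo")
cp.append("provider_selected", model="qwen")
cp.append("result_draft", text="partial output")
print(cp.resume_state()["current_model"])          # -> qwen
PY
python3 ~/.surrogate/bin/lib/checkpoint.py demo-task-001 replay   # dump the raw event trail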
bin/lib/codebase_scanner.py ADDED
@@ -0,0 +1,225 @@
1
+ """Codebase scanner — full review before each task iteration.
2
+
3
+ Purpose (per Ashira): full scan first, then grep for context that the previous iteration
4
+ left behind. The review agent relies on this to know what was done vs what remains.
5
+
6
+ 3-pass strategy:
7
+ Pass 1: List recently-modified files across watched roots (last 7 days)
8
+ Pass 2: Semantic search via ChromaDB (if index exists) using task keywords
9
+ Pass 3: Git status + diff for any repos found (to detect uncommitted work)
10
+
11
+ Input: task description (string)
12
+ Output: structured summary dict the dispatcher can feed to models as context
13
+ """
14
+
15
+ from __future__ import annotations
16
+
17
+ import datetime as dt
18
+ import json
19
+ import os
20
+ import re
21
+ import subprocess
22
+ from pathlib import Path
23
+
24
+ HOME = Path.home()
25
+ WATCHED_ROOTS = [
26
+ HOME / "develope",
27
+ HOME / "axentx",
28
+ HOME / ".surrogate" / "bin",
29
+ ]
30
+ RECENT_DAYS = 7
31
+ MAX_FILE_SIZE = 100_000 # skip large binaries
32
+ MAX_FILES_PASS1 = 50
33
+ MAX_CHUNKS_PASS2 = 10
34
+ CHROMA_DB = HOME / ".surrogate" / "code-vector-db"
35
+
36
+
37
+ def _keywords(task: str) -> list[str]:
38
+ tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", task.lower())
39
+ stop = {"a", "an", "the", "is", "are", "was", "were", "be", "to", "and",
40
+ "or", "but", "if", "then", "else", "for", "with", "of", "in", "on",
41
+ "at", "this", "that", "from", "by", "as", "i", "you", "it", "we",
42
+ "they", "write", "create", "make", "build", "add", "update", "task"}
43
+ return [t for t in tokens if len(t) >= 3 and t not in stop][:10]
44
+
45
+
46
+ def _recent_files(keywords: list[str], roots: list[Path]) -> list[dict]:
47
+ """Find recently modified source files matching keywords."""
48
+ cutoff = dt.datetime.now() - dt.timedelta(days=RECENT_DAYS)
49
+ out = []
50
+ for root in roots:
51
+ if not root.exists():
52
+ continue
53
+ for dirpath, dirnames, filenames in os.walk(root):
54
+ # skip hidden, node_modules, .git, venv
55
+ dirnames[:] = [d for d in dirnames if not d.startswith(".")
56
+ and d not in {"node_modules", "vendor", "venv", ".venv",
57
+ "__pycache__", "dist", "build", "target"}]
58
+ for f in filenames:
59
+ p = Path(dirpath) / f
60
+ try:
61
+ st = p.stat()
62
+ except OSError:
63
+ continue
64
+ if st.st_size > MAX_FILE_SIZE:
65
+ continue
66
+ mtime = dt.datetime.fromtimestamp(st.st_mtime)
67
+ if mtime < cutoff:
68
+ continue
69
+ # score by keyword hits in name/path
70
+ path_lower = str(p).lower()
71
+ score = sum(1 for kw in keywords if kw in path_lower)
72
+ # light content match (first 4KB only for perf)
73
+ try:
74
+ with open(p, "r", errors="replace") as fh:
75
+ head = fh.read(4096).lower()
76
+ score += sum(1 for kw in keywords if kw in head) * 2
77
+ except OSError:
78
+ continue
79
+ if score > 0:
80
+ out.append({
81
+ "path": str(p),
82
+ "mtime": mtime.isoformat(),
83
+ "score": score,
84
+ "size": st.st_size,
85
+ })
86
+ out.sort(key=lambda x: -x["score"])
87
+ return out[:MAX_FILES_PASS1]
88
+
89
+
90
+ def _chromadb_search(keywords: list[str], task: str) -> list[dict]:
91
+ """Query ChromaDB semantic index (if available)."""
92
+ if not CHROMA_DB.exists():
93
+ return []
94
+ try:
95
+ # Use existing helper if present
96
+ helper = HOME / ".surrogate" / "bin" / "code-search.sh"
97
+ if helper.exists():
98
+ proc = subprocess.run(
99
+ [str(helper), "--top", str(MAX_CHUNKS_PASS2), task],
100
+ capture_output=True, text=True, timeout=30,
101
+ )
102
+ if proc.returncode == 0 and proc.stdout:
103
+ out = []
104
+ for line in proc.stdout.splitlines()[:MAX_CHUNKS_PASS2]:
105
+ m = re.match(r"(\S+):(\d+)\s+(.*)", line)
106
+ if m:
107
+ out.append({
108
+ "path": m.group(1),
109
+ "line": int(m.group(2)),
110
+ "preview": m.group(3)[:200],
111
+ })
112
+ return out
113
+ except (subprocess.TimeoutExpired, OSError):
114
+ pass
115
+ return []
116
+
117
+
118
+ def _git_uncommitted(roots: list[Path]) -> list[dict]:
119
+ """Detect repos with uncommitted work (partial iterations)."""
120
+ out = []
121
+ # Find up to 3 levels of git repos
122
+ for root in roots:
123
+ if not root.exists():
124
+ continue
125
+ for depth_glob in ["*/.git", "*/*/.git", "*/*/*/.git"]:
126
+ for git_dir in root.glob(depth_glob):
127
+ repo = git_dir.parent
128
+ try:
129
+ status = subprocess.run(
130
+ ["git", "-C", str(repo), "status", "--short"],
131
+ capture_output=True, text=True, timeout=5,
132
+ )
133
+ if status.returncode == 0 and status.stdout.strip():
134
+ out.append({
135
+ "repo": str(repo),
136
+ "changes": status.stdout.strip().splitlines()[:20],
137
+ })
138
+ except (subprocess.TimeoutExpired, OSError):
139
+ continue
140
+ return out
141
+
142
+
143
+ def scan(task: str, task_artifacts: list[str] | None = None) -> dict:
144
+ """Full codebase review → structured context dict.
145
+
146
+ Args:
147
+ task: natural-language task description
148
+ task_artifacts: paths mentioned in task (will be loaded in full)
149
+
150
+ Returns:
151
+ {
152
+ "keywords": [...],
153
+ "recent_files": [{path, mtime, score, size}, ...],
154
+ "semantic_hits": [{path, line, preview}, ...],
155
+ "uncommitted_repos": [{repo, changes: [...]}, ...],
156
+ "explicit_artifacts": {path: content, ...}, # loaded in full
157
+ }
158
+ """
159
+ keywords = _keywords(task)
160
+ report = {
161
+ "task_excerpt": task[:200],
162
+ "keywords": keywords,
163
+ "recent_files": _recent_files(keywords, WATCHED_ROOTS),
164
+ "semantic_hits": _chromadb_search(keywords, task),
165
+ "uncommitted_repos": _git_uncommitted(WATCHED_ROOTS),
166
+ "explicit_artifacts": {},
167
+ }
168
+ for a in task_artifacts or []:
169
+ p = Path(a)
170
+ if p.exists() and p.is_file() and p.stat().st_size < MAX_FILE_SIZE:
171
+ try:
172
+ report["explicit_artifacts"][str(p)] = p.read_text(errors="replace")[:10000]
173
+ except OSError:
174
+ pass
175
+ return report
176
+
177
+
178
+ def as_context_prompt(scan_result: dict, max_chars: int = 8000) -> str:
179
+ """Render scan as context for LLM system prompt."""
180
+ lines = [
181
+ "## Codebase context (auto-generated)",
182
+ f"Task keywords: {', '.join(scan_result['keywords'])}",
183
+ "",
184
+ ]
185
+ if scan_result["uncommitted_repos"]:
186
+ lines.append("### Uncommitted work (may indicate previous partial iteration):")
187
+ for r in scan_result["uncommitted_repos"][:5]:
188
+ lines.append(f" {r['repo']}")
189
+ for c in r["changes"][:8]:
190
+ lines.append(f" {c}")
191
+ lines.append("")
192
+
193
+ if scan_result["recent_files"]:
194
+ lines.append(f"### Recently modified relevant files ({len(scan_result['recent_files'])}):")
195
+ for f in scan_result["recent_files"][:15]:
196
+ lines.append(f" {f['path']} (score={f['score']}, mtime={f['mtime']})")
197
+ lines.append("")
198
+
199
+ if scan_result["semantic_hits"]:
200
+ lines.append("### Semantic search hits:")
201
+ for h in scan_result["semantic_hits"][:8]:
202
+ lines.append(f" {h['path']}:{h.get('line','?')} — {h['preview'][:120]}")
203
+ lines.append("")
204
+
205
+ if scan_result["explicit_artifacts"]:
206
+ lines.append("### Explicit task artifacts (FULL content):")
207
+ for path, content in scan_result["explicit_artifacts"].items():
208
+ lines.append(f"--- {path} ---")
209
+ lines.append(content[:3000])
210
+ lines.append("")
211
+
212
+ result = "\n".join(lines)
213
+ return result[:max_chars]
214
+
215
+
216
+ if __name__ == "__main__":
217
+ import sys
218
+ task = " ".join(sys.argv[1:]) or "refactor yolo daemon"
219
+ report = scan(task)
220
+ print(json.dumps(
221
+ {k: v if not isinstance(v, list) else v[:5] for k, v in report.items()},
222
+ indent=2, default=str, ensure_ascii=False
223
+ ))
224
+ print("\n=== AS CONTEXT PROMPT ===\n")
225
+ print(as_context_prompt(report, 3000))
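The __main__ block doubles as a CLI, so a quick scan can be run directly; the task string below is illustrative:

python3 ~/.surrogate/bin/lib/codebase_scanner.py "harden the groq bridge retry logic"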
bin/lib/context_builder.sh ADDED
@@ -0,0 +1,272 @@
1
+ #!/usr/bin/env bash
2
+ # Shared context builder — sourced by qwen-coder-worker + dev-cloud-worker.
3
+ # Produces rich context: repo-map + similar functions from project + past accepted examples.
4
+ # Call: build_rich_context <project> <priority_id> <priority_title>
5
+ # Sets env vars: REPO_MAP, SIMILAR_FUNCS, FEWSHOT_ACCEPTED, ANTI_PATTERNS
6
+ build_rich_context() {
7
+ local PRIO_PROJECT="$1"
8
+ local PRIO_ID="$2"
9
+ local PRIO_TITLE="$3"
10
+ local SHARED="$HOME/.hermes/workspace/swarm-shared"
11
+ local PROJECT_DIR="$HOME/axentx/$PRIO_PROJECT"
12
+
13
+ # 1. Full repo-map (up to 10KB — was 3KB).
14
+ # build-repo-map.sh writes to "<proj>_map.md"; some older paths used "<proj>.md".
15
+ # Try both so we don't silently lose the strongest grounding signal.
16
+ REPO_MAP=""
17
+ for candidate in "$SHARED/repo-maps/${PRIO_PROJECT}_map.md" "$SHARED/repo-maps/${PRIO_PROJECT}.md"; do
18
+ if [[ -f "$candidate" ]]; then
19
+ REPO_MAP=$(/usr/bin/head -c 10000 "$candidate")
20
+ break
21
+ fi
22
+ done
23
+
24
+ # 2. Similar function signatures from project (grep in real codebase)
25
+ SIMILAR_FUNCS=""
26
+ if [[ -d "$PROJECT_DIR" ]]; then
27
+ # Extract keywords from title for grep
28
+ local KW=$(echo "$PRIO_TITLE" | /usr/bin/tr '[:upper:]' '[:lower:]' | /usr/bin/tr -cs 'a-z0-9' ' ' | /usr/bin/tr ' ' '\n' | /usr/bin/awk 'length>4' | /usr/bin/head -3 | /usr/bin/tr '\n' '|' | /usr/bin/sed 's/|$//')
29
+ if [[ -n "$KW" ]]; then
30
+ SIMILAR_FUNCS=$(/usr/bin/find "$PROJECT_DIR" -type f \( -name '*.py' -o -name '*.ts' -o -name '*.tsx' -o -name '*.js' -o -name '*.go' \) ! -path '*/node_modules/*' ! -path '*/.hermes-*' 2>/dev/null | \
31
+ xargs /usr/bin/grep -lE "($KW)" 2>/dev/null | /usr/bin/head -3 | while read f; do
32
+ echo "=== ${f#$PROJECT_DIR/} ==="
33
+ /usr/bin/grep -A3 -E "^(def|function|export const|class|async def|interface)" "$f" 2>/dev/null | /usr/bin/head -30
34
+ done 2>/dev/null | /usr/bin/head -c 4000)
35
+ fi
36
+ fi
37
+
38
+ # 3. RAG: actual code patterns from project (SQLite FTS via ask-sqlite.py if exists)
39
+ RAG_EXAMPLES=""
40
+ if [[ -x "$HOME/.surrogate/bin/ask-sqlite.py" ]]; then
41
+ RAG_EXAMPLES=$(/usr/bin/python3 "$HOME/.surrogate/bin/ask-sqlite.py" \
42
+ "$PRIO_PROJECT $PRIO_TITLE" 2>/dev/null | /usr/bin/head -c 3000)
43
+ fi
44
+
45
+ # 4. Semantic RAG (from embeddings) — top-5 similar
46
+ SEMANTIC_RAG=""
47
+ if [[ -f "$HOME/.surrogate/embeddings.db" ]]; then
48
+ SEMANTIC_RAG=$(/usr/bin/python3 "$HOME/.surrogate/bin/embed-doc.py" --query "$PRIO_TITLE" 2>/dev/null | /usr/bin/head -c 2000)
49
+ fi
50
+
51
+ # 5. Past ACCEPTED examples (few-shot from quality≥7 history)
52
+ FEWSHOT_ACCEPTED=""
53
+ for review in $(/bin/ls -t "$HOME/.hermes/workspace/qwen-coder-reviews/"*.review.json 2>/dev/null | /usr/bin/head -30); do
54
+ if /usr/bin/grep -qE '"quality_score":\s*[789]|"quality_score":\s*10' "$review" 2>/dev/null; then
55
+ local OUT_FILE=$(basename "$review" .review.json)
56
+ # Search all worker output dirs
57
+ for WD in qwen-coder dev-cloud-samba dev-cloud-github dev-cloud-cloudflare dev-cloud-groq dev-cloud-synthesis; do
58
+ local OUT_PATH="$HOME/.hermes/workspace/$WD/${OUT_FILE}.md"
59
+ if [[ -f "$OUT_PATH" ]]; then
60
+ FEWSHOT_ACCEPTED=$(/usr/bin/head -c 2000 "$OUT_PATH")
61
+ break 2
62
+ fi
63
+ done
64
+ fi
65
+ done
66
+
67
+ # 6. Anti-patterns (last 5 rejection reasons across all workers)
68
+ ANTI_PATTERNS=""
69
+ for review in $(/bin/ls -t "$HOME/.hermes/workspace/qwen-coder-reviews/"*.review.json 2>/dev/null | /usr/bin/head -10); do
70
+ local bugs=$(/usr/bin/python3 -c "
71
+ import json, re, sys
72
+ try:
73
+ txt = open('$review').read()
74
+ m = re.search(r'\{.*\}', txt, re.DOTALL)
75
+ if not m: sys.exit()
76
+ d = json.loads(m.group(0))
77
+ if d.get('verdict') in ('reject','rework') and d.get('bugs'):
78
+ for b in d['bugs'][:2]:
79
+ print(f'- {b[:180]}')
80
+ except: pass
81
+ " 2>/dev/null)
82
+ [[ -n "$bugs" ]] && ANTI_PATTERNS="$ANTI_PATTERNS$bugs"$'\n'
83
+ done
84
+ ANTI_PATTERNS=$(echo "$ANTI_PATTERNS" | /usr/bin/head -10)
85
+
86
+ # 7. Active-learning prompt deltas — aggregate last 5 UNIQUE anti-patterns.
87
+ # Preference: same-project anti-patterns first, then generic.
88
+ # Dedup by first 80 chars of prompt_addition (similar bugs shouldn't bloat prompt).
89
+ PROMPT_DELTAS=""
90
+ if [[ -f "$HOME/.surrogate/memory/worker-prompt-deltas.jsonl" ]]; then
91
+ PROMPT_DELTAS=$(/usr/bin/python3 -c "
92
+ import json, sys
93
+ from pathlib import Path
94
+ try:
95
+ entries = []
96
+ for l in Path('$HOME/.surrogate/memory/worker-prompt-deltas.jsonl').read_text().splitlines():
97
+ if not l.strip(): continue
98
+ try: entries.append(json.loads(l))
99
+ except: pass
100
+ # Dedup by first 80 chars
101
+ seen = set()
102
+ picked = []
103
+ # Walk newest → oldest, cap 5 unique
104
+ for e in reversed(entries):
105
+ addn = (e.get('prompt_addition') or '').strip()
106
+ if not addn: continue
107
+ key = addn[:80]
108
+ if key in seen: continue
109
+ seen.add(key)
110
+ picked.append(addn)
111
+ if len(picked) >= 5: break
112
+ if picked:
113
+ out = ['ACTIVE-LEARNED RULES (avoid these past mistakes):']
114
+ for i, a in enumerate(picked, 1):
115
+ out.append(f'{i}. {a[:400]}')
116
+ print('\n'.join(out))
117
+ except Exception as e: pass
118
+ " 2>/dev/null)
119
+ fi
120
+
121
+ # 8. Priority full spec (if a detailed spec file exists)
122
+ # Spec is the single most important signal — cap high (6KB) so the full
123
+ # Context/Requirements/DO NOT sections fit. Other RAG signals are capped
124
+ # lower because they're supplementary; the spec is authoritative.
125
+ PRIO_SPEC=""
126
+ local SPEC_FILE="$HOME/.hermes/workspace/swarm-shared/specs/${PRIO_ID}.md"
127
+ [[ -f "$SPEC_FILE" ]] && PRIO_SPEC=$(/usr/bin/head -c 6000 "$SPEC_FILE")
128
+
129
+ # 9. Task-type authoritative sources — boost scraped knowledge based on title.
130
+ # Security task → CVE/MITRE/OWASP/Prowler. SRE → Google SRE/postmortems.
131
+ # Observability → OTel/Prometheus/Grafana/Honeycomb. etc.
132
+ # This is THE fix that makes all our scraping actually used by Hermes workers.
133
+ AUTHORITATIVE_CONTEXT=""
134
+ if [[ -f "$HOME/.surrogate/index.db" ]]; then
135
+ AUTHORITATIVE_CONTEXT=$(/usr/bin/python3 <<PYEOF
136
+ import sqlite3, re
137
+ title = """${PRIO_TITLE}""".lower()
138
+ project = """${PRIO_PROJECT}""".lower()
139
+ # Classify task → preferred source whitelist
140
+ routes = {
141
+ # Security tasks
142
+ ('security','cve','vuln','prowler','kyverno','opa','admission','ciem','sigma','mitre','attack','cosign','sbom','falco','threat','malware','exploit'): ['cisa-kev','mitre-attack','owasp-cheatsheet','domain:sec-cloudsec','domain:sec-appsec','domain:sec-devsecops','code-deep:sec-appsec','code-deep:sec-cloudsec'],
143
+ # SRE / incident / postmortem
144
+ ('sre','slo','sli','incident','postmortem','runbook','chaos','rca','dora','mttr','blameless','on-call','pager','outage'): ['google-sre','postmortems-index','firecrawl','eng-blog:charity-majors','eng-blog:high-scalability','mythos-ai-engineering','domain:ops-sre','code-deep:ops-sre'],
145
+ # Observability
146
+ ('observab','otel','telemetry','prometheus','grafana','loki','tempo','metric','trace','log','honeycomb','ebpf'): ['opentelemetry-spec','prometheus-docs','grafana-docs','firecrawl','domain:ops-observability'],
147
+ # Cloud / K8s / Terraform
148
+ ('kubernetes','k8s','helm','istio','terraform','aws','ecs','eks','lambda','cloudformation','cdk','gcp','azure','argocd','flux'): ['firecrawl','github-public','code-deep:ops-devops','domain:ops-devops','mythos-cloud','github-trending'],
149
+ # AI / multi-agent
150
+ ('agent','autogen','crewai','langgraph','orchestra','mcp','reflexion','dspy','rag','llm'): ['anthropic-cookbook','arxiv','mythos-ai-agent','mythos-ai-engineering','domain:ai-engineering','code-deep:ai-engineering','firecrawl','hf-papers'],
151
+ # FinOps
152
+ ('cost','finops','focus','rightsizing','kubecost','opencost','savings','budget','spend','waste'): ['firecrawl','rss','eng-blog:high-scalability','domain:ops-devops','arxiv'],
153
+ # Frontend / FE
154
+ ('frontend','react','nextjs','typescript','tsx','ui'): ['domain:dev-frontend','domain:design-ux','code-deep:dev-frontend','stackoverflow','github-trending'],
155
+ # Backend / API / DB
156
+ ('backend','api','fastapi','database','sql','postgres','asyncpg','sqlalchemy'): ['domain:dev-backend','domain:dev-fullstack','code-deep:dev-backend','github-public','stackoverflow','hf-papers'],
157
+ # Mobile
158
+ ('mobile','android','ios','flutter','reactnative','line','workio'): ['domain:dev-mobile','code-deep:dev-mobile','firecrawl','stackoverflow'],
159
+ }
160
+ # Project-specific boost
161
+ project_preferred = {
162
+ 'vanguard': ['cisa-kev','mitre-attack','owasp-cheatsheet','code-deep:sec-appsec'],
163
+ 'costinel': ['firecrawl','rss','arxiv','mythos-ai-engineering'],
164
+ 'arkship': ['google-sre','postmortems-index','anthropic-cookbook','opentelemetry-spec','firecrawl'],
165
+ 'surrogate':['arxiv','hf-papers','anthropic-cookbook','mythos-ai-agent'],
166
+ 'workio': ['firecrawl','stackoverflow','github-public'],
167
+ }
168
+
169
+ preferred_sources = set()
170
+ for keywords, srcs in routes.items():
171
+ if any(k in title for k in keywords):
172
+ preferred_sources.update(srcs)
173
+ for proj_key, srcs in project_preferred.items():
174
+ if proj_key in project:
175
+ preferred_sources.update(srcs)
176
+
177
+ if not preferred_sources:
178
+ print(''); exit()
179
+
180
+ # FTS query — prefer authoritative sources
181
+ conn = sqlite3.connect('$HOME/.surrogate/index.db')
182
+ conn.row_factory = sqlite3.Row
183
+ # Simple keyword from title
184
+ kw = ' '.join([w for w in re.sub(r'[^a-zA-Z0-9 ]', ' ', title).split() if len(w) > 3][:5])
185
+ if not kw: exit()
186
+
187
+ src_list = ','.join(f"'{s}'" for s in preferred_sources)
188
+ # Strategy: 3-tier fallback — preferred+match → any+match → preferred random
189
+ rows = []
190
+ try:
191
+ # Tier 1: preferred sources + FTS match on keywords
192
+ q = f"""SELECT d.source, d.instruction, substr(d.response, 1, 600) as body
193
+ FROM docs_fts f JOIN docs d ON d.id = f.rowid
194
+ WHERE f.docs_fts MATCH ? AND d.source IN ({src_list})
195
+ ORDER BY bm25(docs_fts) LIMIT 6"""
196
+ rows = conn.execute(q, (kw,)).fetchall()
197
+ except sqlite3.OperationalError: pass
198
+
199
+ if not rows:
200
+ # Tier 2: FTS match on ANY source — relax source filter
201
+ try:
202
+ q2 = """SELECT d.source, d.instruction, substr(d.response, 1, 600) as body
203
+ FROM docs_fts f JOIN docs d ON d.id = f.rowid
204
+ WHERE f.docs_fts MATCH ? ORDER BY bm25(docs_fts) LIMIT 6"""
205
+ rows = conn.execute(q2, (kw,)).fetchall()
206
+ except sqlite3.OperationalError: pass
207
+
208
+ if not rows:
209
+ # Tier 3: random sample from preferred sources (even if no keyword match)
210
+ rows = conn.execute(f"SELECT source, instruction, substr(response,1,600) as body FROM docs WHERE source IN ({src_list}) ORDER BY RANDOM() LIMIT 6").fetchall()
211
+
212
+ conn.close()
213
+
214
+ out = []
215
+ for r in rows:
216
+ out.append(f"[{r['source']}] {(r['instruction'] or '')[:120]}")
217
+ out.append((r['body'] or '')[:500])
218
+ out.append('')
219
+ print('\n'.join(out)[:3500])
220
+ PYEOF
221
+ )
222
+ fi
223
+
224
+ # 10. FalkorDB graph — related decisions + past priorities with similar theme
225
+ GRAPH_CONTEXT=""
226
+ local REDIS_SOCK=$(/usr/bin/find /var/folders /tmp -name 'redis.socket' -type s 2>/dev/null | /usr/bin/head -1)
227
+ if [[ -n "$REDIS_SOCK" ]]; then
228
+ # Get related priorities + learned rules
229
+ GRAPH_CONTEXT=$(/opt/homebrew/bin/redis-cli -s "$REDIS_SOCK" GRAPH.QUERY ashira "
230
+ MATCH (p:Priority {project: '$PRIO_PROJECT'})
231
+ OPTIONAL MATCH (p)-[:HAS_LEARNED_RULE]->(l:LearnedRule)
232
+ OPTIONAL MATCH (p)-[:COMMITTED_AS]->(c:Commit)
233
+ RETURN p.id, p.title, l.content, c.msg LIMIT 8
234
+ " 2>/dev/null | /usr/bin/tail -c 2500)
235
+ fi
236
+
237
+ # 11. Hermes trace recall — past similar tasks Hermes handled (from JSONL)
238
+ HERMES_RECALL=""
239
+ local TRACE_DIR="$HOME/axentx/surrogate/data/training-jsonl"
240
+ if [[ -d "$TRACE_DIR" ]]; then
241
+ HERMES_RECALL=$(/usr/bin/python3 <<PYEOF
242
+ import json, re, glob
243
+ title = """${PRIO_TITLE}""".lower()
244
+ words = [w for w in re.sub(r'[^a-zA-Z0-9 ]', ' ', title).split() if len(w) > 4][:4]
245
+ if not words: exit()
246
+
247
+ hits = []
248
+ # Walk recent hermes-trace-YYYY-MM-DD.jsonl files (last 7 days)
249
+ import os
250
+ files = sorted(glob.glob(os.path.expanduser('~/axentx/surrogate/data/training-jsonl/hermes-trace-*.jsonl')))[-7:]
251
+ for f in files:
252
+ try:
253
+ for line in open(f):
254
+ try: rec = json.loads(line)
255
+ except: continue
256
+ blob = (rec.get('instruction','') + ' ' + rec.get('output',''))[:2000].lower()
257
+ score = sum(1 for w in words if w in blob)
258
+ if score >= 2:
259
+ hits.append((score, rec))
260
+ except: pass
261
+
262
+ hits.sort(key=lambda x: -x[0])
263
+ for score, rec in hits[:3]:
264
+ print(f"HERMES PREVIOUSLY [{rec.get('category','?')}]: {rec.get('instruction','')[:120]}")
265
+ print(f"→ {rec.get('output','')[:400]}")
266
+ print()
267
+ PYEOF
268
+ )
269
+ fi
270
+ }
271
+
272
+ export -f build_rich_context
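Because the function is exported, a worker only needs to source this file once per shell; the project and priority values below are placeholders:

source "$HOME/.surrogate/bin/lib/context_builder.sh"
build_rich_context "vanguard" "PRIO-0001" "Add CVE feed ingestion"   # placeholder project/id/title
printf 'repo-map bytes: %s\nlearned rules:\n%s\n' "${#REPO_MAP}" "$PROMPT_DELTAS"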
bin/lib/dns_fallback.py ADDED
@@ -0,0 +1,27 @@
1
+ # DNS fallback helper — patches socket.getaddrinfo to resolve via dig against public
2
+ # resolvers (1.1.1.1 / 8.8.8.8 / 9.9.9.9) when the system resolver fails (e.g. ISP DNS filtering of AI endpoints).
3
+ # Import at top of any Python script: exec(open(...).read())
4
+ import socket as _sock
5
+ import subprocess as _sp
6
+
7
+ _orig_getaddrinfo = _sock.getaddrinfo
8
+
9
+ def _resilient_getaddrinfo(host, *args, **kwargs):
10
+ try:
11
+ return _orig_getaddrinfo(host, *args, **kwargs)
12
+ except _sock.gaierror:
13
+ # Fall back: resolve via public DNS (bypass ISP filtering)
14
+ for resolver in ("1.1.1.1", "8.8.8.8", "9.9.9.9"):
15
+ try:
16
+ out = _sp.check_output(
17
+ ["dig", "+short", "+time=3", "+tries=1", f"@{resolver}", host],
18
+ text=True, timeout=5, stderr=_sp.DEVNULL
19
+ ).strip().splitlines()
20
+ ip = next((ln for ln in out if ln and ln[0].isdigit()), None)
21
+ if ip:
22
+ return _orig_getaddrinfo(ip, *args, **kwargs)
23
+ except Exception:
24
+ continue
25
+ raise
26
+
27
+ _sock.getaddrinfo = _resilient_getaddrinfo
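Same exec() pattern as the bridges; once loaded, any hostname lookup transparently falls back to the public resolvers:

python3 - <<'PY'
import os, socket
exec(open(os.path.expanduser('~/.surrogate/bin/lib/dns_fallback.py')).read())
# resolves via the system resolver first, then dig @1.1.1.1 / 8.8.8.8 / 9.9.9.9 on failure
print(socket.getaddrinfo("models.github.ai", 443)[0][4][0])
PY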
bin/lib/ground_truth.py ADDED
@@ -0,0 +1,280 @@
1
+ """Ground-truth check — objective verification beyond reviewer opinion.
2
+
3
+ When a task produces code, run external validators:
4
+ - Python: ast.parse (syntax) + optional ruff / mypy / pytest
5
+ - TypeScript/JS: tsc / eslint (if available)
6
+ - Terraform: terraform validate + tfsec (if available)
7
+ - CloudFormation: cfn-lint (if available)
8
+ - Shell: bash -n (syntax) + shellcheck (if available)
9
+ - JSON/YAML: parse check
10
+
11
+ Reviewer opinion + ground-truth act as a double check: if the review says pass
12
+ but a compile/parse check fails, the verdict is overridden to fail.
13
+
14
+ Output: {"verdict": "pass|warn|fail", "checks": [...], "blocking_failure": bool}
15
+ """
16
+
17
+ from __future__ import annotations
18
+
19
+ import ast
20
+ import json
21
+ import re
22
+ import shutil
23
+ import subprocess
24
+ import tempfile
25
+ from pathlib import Path
26
+ from typing import Optional
27
+
28
+ CODE_BLOCK_RE = re.compile(r"```(\w+)?\n(.*?)```", re.DOTALL)
29
+
30
+
31
+ def extract_code_blocks(text: str) -> list[tuple[str, str]]:
32
+ """Return list of (language, content) pairs from markdown fenced blocks."""
33
+ blocks = []
34
+ for m in CODE_BLOCK_RE.finditer(text):
35
+ lang = (m.group(1) or "").lower().strip()
36
+ content = m.group(2).strip()
37
+ if content:
38
+ blocks.append((lang, content))
39
+ return blocks
40
+
41
+
42
+ def _have(cmd: str) -> bool:
43
+ return shutil.which(cmd) is not None
44
+
45
+
46
+ def _run(cmd: list[str], stdin: Optional[str] = None, timeout: int = 30) -> tuple[int, str]:
47
+ try:
48
+ r = subprocess.run(
49
+ cmd, input=stdin, capture_output=True, text=True, timeout=timeout
50
+ )
51
+ return r.returncode, (r.stdout + r.stderr)[:2000]
52
+ except subprocess.TimeoutExpired:
53
+ return -1, "timeout"
54
+ except OSError as e:
55
+ return -1, str(e)
56
+
57
+
58
+ # ----------------------------------------------------------------------
59
+ # Per-language checkers
60
+ # ----------------------------------------------------------------------
61
+ def check_python(code: str) -> list[dict]:
62
+ out = []
63
+ # 1. syntax
64
+ try:
65
+ ast.parse(code)
66
+ out.append({"tool": "python-syntax", "pass": True, "msg": "syntactically valid"})
67
+ except SyntaxError as e:
68
+ out.append({"tool": "python-syntax", "pass": False,
69
+ "msg": f"SyntaxError: {e}", "blocking": True})
70
+ return out # no point in running linters
71
+ # 2. ruff (if installed)
72
+ if _have("ruff"):
73
+ with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
74
+ f.write(code)
75
+ path = f.name
76
+ try:
77
+ rc, output = _run(["ruff", "check", "--select=E,F", "--output-format=concise", path])
78
+ passed = rc == 0
79
+ out.append({"tool": "ruff", "pass": passed,
80
+ "msg": output[:500] if output else "clean"})
81
+ finally:
82
+ Path(path).unlink(missing_ok=True)
83
+ # 3. mypy (if installed, non-blocking)
84
+ if _have("mypy"):
85
+ with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
86
+ f.write(code)
87
+ path = f.name
88
+ try:
89
+ rc, output = _run(["mypy", "--no-error-summary", "--ignore-missing-imports", path])
90
+ out.append({"tool": "mypy", "pass": rc == 0, "msg": output[:500]})
91
+ finally:
92
+ Path(path).unlink(missing_ok=True)
93
+ return out
94
+
95
+
96
+ def check_typescript(code: str) -> list[dict]:
97
+ out = []
98
+ if not _have("npx") and not _have("tsc"):
99
+ return [{"tool": "typescript", "pass": True, "msg": "tsc/npx not installed — skipped"}]
100
+ with tempfile.NamedTemporaryFile("w", suffix=".ts", delete=False) as f:
101
+ f.write(code)
102
+ path = f.name
103
+ try:
104
+ cmd = (["tsc", "--noEmit", "--allowJs", "--target", "ES2022",
105
+ "--moduleResolution", "node", path] if _have("tsc")
106
+ else ["npx", "-y", "--package=typescript", "--",
107
+ "tsc", "--noEmit", "--target", "ES2022", path])
108
+ rc, output = _run(cmd, timeout=60)
109
+ out.append({"tool": "tsc", "pass": rc == 0,
110
+ "msg": output[:600] if output else "clean",
111
+ "blocking": rc != 0})
112
+ finally:
113
+ Path(path).unlink(missing_ok=True)
114
+ return out
115
+
116
+
117
+ def check_shell(code: str) -> list[dict]:
118
+ out = []
119
+ # bash -n (syntax only — no execution). Use file path; stdin parser is lenient.
120
+ with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as f:
121
+ f.write(code)
122
+ path = f.name
123
+ try:
124
+ rc, output = _run(["bash", "-n", path])
125
+ finally:
126
+ Path(path).unlink(missing_ok=True)
127
+ out.append({"tool": "bash-syntax", "pass": rc == 0, "msg": output or "valid",
128
+ "blocking": rc != 0})
129
+ if _have("shellcheck"):
130
+ with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as f:
131
+ f.write(code)
132
+ path = f.name
133
+ try:
134
+ rc, output = _run(["shellcheck", "-f", "gcc", path])
135
+ # shellcheck returns nonzero for warnings — non-blocking
136
+ out.append({"tool": "shellcheck", "pass": rc == 0, "msg": output[:500]})
137
+ finally:
138
+ Path(path).unlink(missing_ok=True)
139
+ return out
140
+
141
+
142
+ def check_terraform(code: str) -> list[dict]:
143
+ out = []
144
+ if not _have("terraform"):
145
+ return [{"tool": "terraform", "pass": True, "msg": "terraform not installed — skipped"}]
146
+ with tempfile.TemporaryDirectory() as d:
147
+ Path(d, "main.tf").write_text(code)
148
+ rc, output = _run(["terraform", "-chdir=" + d, "init", "-backend=false", "-input=false"], timeout=60)
149
+ if rc != 0:
150
+ out.append({"tool": "terraform-init", "pass": False, "msg": output[:500],
151
+ "blocking": True})
152
+ return out
153
+ rc, output = _run(["terraform", "-chdir=" + d, "validate"])
154
+ out.append({"tool": "terraform-validate", "pass": rc == 0,
155
+ "msg": output[:500] if output else "clean",
156
+ "blocking": rc != 0})
157
+ if _have("tfsec"):
158
+ rc, output = _run(["tfsec", d, "--no-color"])
159
+ out.append({"tool": "tfsec", "pass": rc == 0, "msg": output[:500]})
160
+ return out
161
+
162
+
163
+ def check_cloudformation(code: str) -> list[dict]:
164
+ if not _have("cfn-lint"):
165
+ return [{"tool": "cfn-lint", "pass": True, "msg": "cfn-lint not installed — skipped"}]
166
+ with tempfile.NamedTemporaryFile("w", suffix=".yaml", delete=False) as f:
167
+ f.write(code)
168
+ path = f.name
169
+ try:
170
+ rc, output = _run(["cfn-lint", path])
171
+ return [{"tool": "cfn-lint", "pass": rc == 0, "msg": output[:500],
172
+ "blocking": rc != 0}]
173
+ finally:
174
+ Path(path).unlink(missing_ok=True)
175
+
176
+
177
+ def check_json(code: str) -> list[dict]:
178
+ try:
179
+ json.loads(code)
180
+ return [{"tool": "json-parse", "pass": True, "msg": "valid JSON"}]
181
+ except json.JSONDecodeError as e:
182
+ return [{"tool": "json-parse", "pass": False, "msg": str(e), "blocking": True}]
183
+
184
+
185
+ def check_yaml(code: str) -> list[dict]:
186
+ try:
187
+ import yaml # type: ignore
188
+ except ImportError:
189
+ return [{"tool": "yaml-parse", "pass": True, "msg": "pyyaml not installed — skipped"}]
190
+ try:
191
+ yaml.safe_load(code)
192
+ return [{"tool": "yaml-parse", "pass": True, "msg": "valid YAML"}]
193
+ except yaml.YAMLError as e:
194
+ return [{"tool": "yaml-parse", "pass": False, "msg": str(e)[:300], "blocking": True}]
195
+
196
+
197
+ LANG_CHECKERS = {
198
+ "python": check_python, "py": check_python,
199
+ "typescript": check_typescript, "ts": check_typescript,
200
+ "javascript": check_typescript, "js": check_typescript,
201
+ "bash": check_shell, "sh": check_shell, "shell": check_shell,
202
+ "terraform": check_terraform, "hcl": check_terraform, "tf": check_terraform,
203
+ "cloudformation": check_cloudformation, "yaml": check_yaml, "yml": check_yaml,
204
+ "json": check_json,
205
+ }
206
+
207
+
208
+ # ----------------------------------------------------------------------
209
+ # Orchestrator
210
+ # ----------------------------------------------------------------------
211
+ def check(work_product: str) -> dict:
212
+ """Extract code blocks + run checkers. Returns aggregate verdict.
213
+
214
+ Returns:
215
+ {
216
+ "has_code": bool,
217
+ "verdict": "pass" | "fail",
218
+ "blocking_failure": bool,
219
+ "checks": [{tool, pass, msg, blocking?}, ...],
220
+ "blocks_checked": int,
221
+ }
222
+ """
223
+ blocks = extract_code_blocks(work_product)
224
+ all_checks: list[dict] = []
225
+ has_code = False
226
+
227
+ for lang, content in blocks:
228
+ checker = LANG_CHECKERS.get(lang)
229
+ if not checker:
230
+ continue
231
+ has_code = True
232
+ results = checker(content)
233
+ for r in results:
234
+ r["language"] = lang
235
+ all_checks.extend(results)
236
+
237
+ blocking_failure = any(c.get("blocking") and not c.get("pass") for c in all_checks)
238
+ # Only blocking checks determine pass/fail. Non-blocking (warn) tools like
239
+ # mypy or shellcheck can fail without sinking the verdict.
240
+ blocking_passed = all(c.get("pass") for c in all_checks if c.get("blocking"))
241
+ any_blocking = any(c.get("blocking") for c in all_checks)
242
+
243
+ if not has_code:
244
+ return {
245
+ "has_code": False,
246
+ "verdict": "pass", # nothing to check → don't block review
247
+ "blocking_failure": False,
248
+ "checks": [],
249
+ "blocks_checked": 0,
250
+ }
251
+
252
+ if blocking_failure:
253
+ verdict = "fail"
254
+ elif not any_blocking:
255
+ # no blocking checks ran (e.g. tools missing) — warn
256
+ verdict = "warn"
257
+ else:
258
+ # all blocking checks passed — non-blocking may still complain, but ship it
259
+ any_non_blocking_failed = any(
260
+ not c.get("pass") and not c.get("blocking") for c in all_checks
261
+ )
262
+ verdict = "warn" if any_non_blocking_failed else "pass"
263
+
264
+ return {
265
+ "has_code": True,
266
+ "verdict": verdict,
267
+ "blocking_failure": blocking_failure,
268
+ "checks": all_checks,
269
+ "blocks_checked": len(blocks),
270
+ }
271
+
272
+
273
+ if __name__ == "__main__":
274
+ import sys
275
+ if len(sys.argv) > 1:
276
+ text = Path(sys.argv[1]).read_text()
277
+ else:
278
+ text = sys.stdin.read()
279
+ result = check(text)
280
+ print(json.dumps(result, indent=2))
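A short sketch of how a caller can consume check() from the file above; the answer string is only an illustration and bin/lib/ is assumed to be on sys.path:

import json
from ground_truth import check

# A model answer containing one fenced Python block.
answer = "Here is the helper:\n\n```python\ndef add(a, b):\n    return a + b\n```\n"

report = check(answer)
print(report["verdict"])            # "pass", "warn" or "fail"
print(report["blocking_failure"])   # True only for syntax/compile-level failures
print(json.dumps(report["checks"], indent=2))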
bin/lib/max_client.py ADDED
@@ -0,0 +1,365 @@
1
+ """Claude Max plan OAuth client.
2
+
3
+ Handles:
4
+ - Read OAuth token from macOS keychain (`Claude Code-credentials`)
5
+ - Auto-refresh before expiry (lazy, on API call)
6
+ - Call Anthropic `/v1/messages` with OAuth Bearer
7
+ - Parse `anthropic-ratelimit-*` headers → quota state
8
+ - Cache quota state (5-min TTL) to avoid probing too often
9
+
10
+ Quota model (verified 2026-04-19):
11
+ Max plan uses UNIFIED pool — Opus + Sonnet share quota.
12
+ Haiku has separate pool (confirmed via live probe).
13
+ 5-hour window + 7-day window, both monitored.
14
+
15
+ Headers (from live response):
16
+ anthropic-ratelimit-unified-5h-status: allowed|rate_limited
17
+ anthropic-ratelimit-unified-5h-reset: <unix-ts>
18
+ anthropic-ratelimit-unified-5h-utilization: 0.0-1.0
19
+ anthropic-ratelimit-unified-7d-status
20
+ anthropic-ratelimit-unified-7d-reset
21
+ anthropic-ratelimit-unified-7d-utilization
22
+ """
23
+
24
+ from __future__ import annotations
25
+
26
+ import json
27
+ import os
28
+ import subprocess
29
+ import time
30
+ import urllib.error
31
+ import urllib.request
32
+ from dataclasses import dataclass, field
33
+ from pathlib import Path
34
+ from typing import Any, Optional
35
+
36
+ KEYCHAIN_SERVICE = "Claude Code-credentials"
37
+ OAUTH_REFRESH_URL = "https://claude.ai/v1/oauth/token"
38
+ OAUTH_CLIENT_ID = "9d1c250a-e61b-44d9-88ed-5944d1962f5e"
39
+ ANTHROPIC_API = "https://api.anthropic.com/v1/messages"
40
+ ANTHROPIC_BETA = "oauth-2025-04-20"
41
+ ANTHROPIC_VERSION = "2023-06-01"
42
+
43
+ QUOTA_CACHE_PATH = Path.home() / ".surrogate" / "yolo" / "max-quota.json"
44
+ QUOTA_CACHE_TTL = 300 # 5 minutes
45
+
46
+ # --- Model IDs (from live probe 2026-04-19) ---
47
+ MODEL_OPUS = "claude-opus-4-20250514"
48
+ MODEL_SONNET = "claude-sonnet-4-20250514"
49
+ MODEL_HAIKU = "claude-haiku-4-5-20251001"
50
+
51
+
52
+ @dataclass
53
+ class QuotaState:
54
+ """Rate-limit state parsed from response headers."""
55
+ model: str
56
+ status: str = "unknown" # allowed | rate_limited | unknown
57
+ reset_at: int = 0 # unix timestamp when window resets
58
+ utilization_5h: float = 0.0
59
+ utilization_7d: float = 0.0
60
+ last_checked: float = 0.0 # unix seconds
61
+ last_error: str = ""
62
+
63
+ @property
64
+ def available(self) -> bool:
65
+ return self.status == "allowed"
66
+
67
+ @property
68
+ def seconds_until_reset(self) -> int:
69
+ return max(0, int(self.reset_at - time.time()))
70
+
71
+
72
+ @dataclass
73
+ class MaxResponse:
74
+ """Successful response from Max plan."""
75
+ content: str
76
+ model_requested: str
77
+ model_served: str
78
+ input_tokens: int
79
+ output_tokens: int
80
+ quota: QuotaState = field(default_factory=lambda: QuotaState(model=""))
81
+
82
+
83
+ class MaxUnavailable(Exception):
84
+ """Raised when Max plan cannot serve the request (429 or auth)."""
85
+ def __init__(self, model: str, reset_at: int = 0, msg: str = ""):
86
+ self.model = model
87
+ self.reset_at = reset_at
88
+ self.msg = msg
89
+ super().__init__(f"Max {model} unavailable: {msg} (reset in {max(0, reset_at - int(time.time()))}s)")
90
+
91
+
92
+ class MaxAuthError(Exception):
93
+ """Raised when OAuth token refresh fails permanently — needs relogin."""
94
+
95
+
96
+ # ----------------------------------------------------------------------
97
+ # Keychain I/O
98
+ # ----------------------------------------------------------------------
99
+ def read_token() -> dict:
100
+ """Read full credential blob from keychain."""
101
+ try:
102
+ raw = subprocess.check_output(
103
+ ["security", "find-generic-password", "-s", KEYCHAIN_SERVICE, "-w"],
104
+ stderr=subprocess.DEVNULL,
105
+ ).decode().strip()
106
+ return json.loads(raw)
107
+ except subprocess.CalledProcessError:
108
+ raise MaxAuthError(f"Keychain entry '{KEYCHAIN_SERVICE}' not found — run `claude` to login")
109
+ except json.JSONDecodeError as e:
110
+ raise MaxAuthError(f"Invalid JSON in keychain: {e}")
111
+
112
+
113
+ def write_token(cred: dict) -> None:
114
+ """Atomically replace keychain entry."""
115
+ body = json.dumps(cred)
116
+ subprocess.run(
117
+ ["security", "delete-generic-password", "-s", KEYCHAIN_SERVICE],
118
+ stderr=subprocess.DEVNULL,
119
+ )
120
+ subprocess.run(
121
+ ["security", "add-generic-password",
122
+ "-s", KEYCHAIN_SERVICE,
123
+ "-a", os.environ.get("USER", "Ashira"),
124
+ "-w", body,
125
+ "-U"],
126
+ check=True,
127
+ stderr=subprocess.DEVNULL,
128
+ )
129
+
130
+
131
+ # ----------------------------------------------------------------------
132
+ # OAuth refresh
133
+ # ----------------------------------------------------------------------
134
+ def refresh_if_needed(cred: dict, buffer_seconds: int = 120) -> dict:
135
+ """Refresh access token if expiring in <buffer_seconds. Writes back to keychain."""
136
+ oa = cred["claudeAiOauth"]
137
+ expires_at = oa["expiresAt"] / 1000
138
+ if time.time() + buffer_seconds < expires_at:
139
+ return cred # still fresh
140
+
141
+ # Refresh
142
+ req = urllib.request.Request(
143
+ OAUTH_REFRESH_URL,
144
+ data=json.dumps({
145
+ "grant_type": "refresh_token",
146
+ "refresh_token": oa["refreshToken"],
147
+ "client_id": OAUTH_CLIENT_ID,
148
+ }).encode(),
149
+ headers={"content-type": "application/json"},
150
+ method="POST",
151
+ )
152
+ try:
153
+ with urllib.request.urlopen(req, timeout=15) as r:
154
+ new = json.loads(r.read())
155
+ except urllib.error.HTTPError as e:
156
+ raise MaxAuthError(
157
+ f"OAuth refresh failed ({e.code}): {e.read().decode()[:200]}. "
158
+ "Run `claude` in a new terminal to re-login."
159
+ )
160
+
161
+ oa["accessToken"] = new["access_token"]
162
+ oa["refreshToken"] = new["refresh_token"]
163
+ oa["expiresAt"] = int((time.time() + new["expires_in"]) * 1000)
164
+ write_token(cred)
165
+ return cred
166
+
167
+
168
+ # ----------------------------------------------------------------------
169
+ # Quota cache
170
+ # ----------------------------------------------------------------------
171
+ def load_quota_cache() -> dict[str, QuotaState]:
172
+ """Load cached quota state (per model)."""
173
+ if not QUOTA_CACHE_PATH.exists():
174
+ return {}
175
+ try:
176
+ raw = json.loads(QUOTA_CACHE_PATH.read_text())
177
+ return {k: QuotaState(**v) for k, v in raw.items()}
178
+ except (json.JSONDecodeError, TypeError):
179
+ return {}
180
+
181
+
182
+ def save_quota_cache(cache: dict[str, QuotaState]) -> None:
183
+ QUOTA_CACHE_PATH.parent.mkdir(parents=True, exist_ok=True)
184
+ data = {k: v.__dict__ for k, v in cache.items()}
185
+ QUOTA_CACHE_PATH.write_text(json.dumps(data, indent=2))
186
+
187
+
188
+ def parse_quota_headers(model: str, headers: dict[str, str]) -> QuotaState:
189
+ """Parse anthropic-ratelimit-* headers into QuotaState."""
190
+ h = {k.lower(): v for k, v in headers.items()}
191
+
192
+ def fget(key: str, default: float = 0.0) -> float:
193
+ try:
194
+ return float(h.get(key, default))
195
+ except (ValueError, TypeError):
196
+ return default
197
+
198
+ def iget(key: str, default: int = 0) -> int:
199
+ try:
200
+ return int(float(h.get(key, default)))
201
+ except (ValueError, TypeError):
202
+ return default
203
+
204
+ status = h.get("anthropic-ratelimit-unified-5h-status", "unknown")
205
+ reset_5h = iget("anthropic-ratelimit-unified-5h-reset")
206
+ reset_7d = iget("anthropic-ratelimit-unified-7d-reset")
207
+
208
+ return QuotaState(
209
+ model=model,
210
+ status=status,
211
+ reset_at=max(reset_5h, reset_7d) if reset_5h and reset_7d else reset_5h or reset_7d,
212
+ utilization_5h=fget("anthropic-ratelimit-unified-5h-utilization"),
213
+ utilization_7d=fget("anthropic-ratelimit-unified-7d-utilization"),
214
+ last_checked=time.time(),
215
+ )
216
+
217
+
218
+ # ----------------------------------------------------------------------
219
+ # Call Anthropic via Max OAuth
220
+ # ----------------------------------------------------------------------
221
+ def call_max(
222
+ model: str,
223
+ messages: list[dict],
224
+ max_tokens: int = 4096,
225
+ system: Optional[str] = None,
226
+ timeout: int = 180,
227
+ ) -> MaxResponse:
228
+ """Make a Max-plan OAuth call. Raises MaxUnavailable on 429."""
229
+ cred = refresh_if_needed(read_token())
230
+ token = cred["claudeAiOauth"]["accessToken"]
231
+
232
+ body: dict[str, Any] = {
233
+ "model": model,
234
+ "max_tokens": max_tokens,
235
+ "messages": messages,
236
+ }
237
+ if system:
238
+ body["system"] = system
239
+
240
+ req = urllib.request.Request(
241
+ ANTHROPIC_API,
242
+ data=json.dumps(body).encode(),
243
+ headers={
244
+ "Authorization": f"Bearer {token}",
245
+ "anthropic-version": ANTHROPIC_VERSION,
246
+ "anthropic-beta": ANTHROPIC_BETA,
247
+ "content-type": "application/json",
248
+ },
249
+ method="POST",
250
+ )
251
+ try:
252
+ with urllib.request.urlopen(req, timeout=timeout) as r:
253
+ data = json.loads(r.read())
254
+ quota = parse_quota_headers(model, dict(r.getheaders()))
255
+ _update_cache(quota)
256
+ return MaxResponse(
257
+ content=data["content"][0]["text"],
258
+ model_requested=model,
259
+ model_served=data.get("model", model),
260
+ input_tokens=data["usage"]["input_tokens"],
261
+ output_tokens=data["usage"]["output_tokens"],
262
+ quota=quota,
263
+ )
264
+ except urllib.error.HTTPError as e:
265
+ err_body = e.read().decode()
266
+ headers = dict(e.headers)
267
+ quota = parse_quota_headers(model, headers)
268
+ # Override: 429 always means rate_limited regardless of header contents
269
+ quota.status = "rate_limited" if e.code == 429 else "error"
270
+ quota.last_error = f"HTTP {e.code}: {err_body[:200]}"
271
+ # If 429 but no reset header, set a safe cooldown (5 min) so pick_max_model skips
272
+ if e.code == 429 and quota.reset_at <= time.time():
273
+ quota.reset_at = int(time.time() + 300)
274
+ _update_cache(quota)
275
+ if e.code == 429:
276
+ raise MaxUnavailable(model, quota.reset_at, err_body)
277
+ if e.code == 401:
278
+ raise MaxAuthError(f"Max auth failed ({e.code}) — relogin needed")
279
+ raise MaxUnavailable(model, 0, f"HTTP {e.code}: {err_body[:200]}")
280
+
281
+
282
+ def _update_cache(quota: QuotaState) -> None:
283
+ cache = load_quota_cache()
284
+ cache[quota.model] = quota
285
+ save_quota_cache(cache)
286
+
287
+
288
+ # ----------------------------------------------------------------------
289
+ # Tier selection
290
+ # ----------------------------------------------------------------------
291
+ MAX_TIER_ORDER = [MODEL_OPUS, MODEL_SONNET, MODEL_HAIKU]
292
+
293
+
294
+ def pick_max_model(prefer: str = MODEL_OPUS) -> Optional[str]:
295
+ """Pick best available Max-plan model.
296
+
297
+ Strategy:
298
+ 1. If cache status=allowed AND fresh (< TTL) → use it immediately
299
+ 2. If cache stale (> TTL) → eligible to re-probe (real probe will confirm)
300
+ 3. If cache rate_limited:
301
+ - If reset_at > 0 AND reset_at still in future → NOT eligible (honor cooldown)
302
+ - Only eligible when reset_at passed + cache went stale
303
+ 4. Walk Opus → Sonnet → Haiku; use first eligible
304
+
305
+ Returns model name or None if all rate-limited within cooldown.
306
+ """
307
+ cache = load_quota_cache()
308
+ now = time.time()
309
+
310
+ def eligible(model: str) -> bool:
311
+ q = cache.get(model)
312
+ if not q:
313
+ return True # unknown → worth one probe
314
+ # Fresh + allowed
315
+ if q.status == "allowed" and now - q.last_checked <= QUOTA_CACHE_TTL:
316
+ return True
317
+ # Rate-limited + still within cooldown window → skip
318
+ if q.status == "rate_limited" and q.reset_at > now:
319
+ return False
320
+ # Stale (either status) + no active cooldown → re-probe OK
321
+ if now - q.last_checked > QUOTA_CACHE_TTL:
322
+ return True
323
+ # Rate-limited but reset_at is 0 or in past → try again cautiously
324
+ if q.status == "rate_limited" and q.reset_at <= now:
325
+ return now - q.last_checked > 30 # wait 30s between retries
326
+ return False
327
+
328
+ order = [prefer] + [m for m in MAX_TIER_ORDER if m != prefer]
329
+ for model in order:
330
+ if eligible(model):
331
+ return model
332
+ return None
333
+
334
+
335
+ def probe_and_refresh_cache() -> dict[str, QuotaState]:
336
+ """Send minimal probes to each tier to refresh cache. Called every 5 min."""
337
+ out: dict[str, QuotaState] = {}
338
+ for model in MAX_TIER_ORDER:
339
+ try:
340
+ resp = call_max(model, [{"role": "user", "content": "."}], max_tokens=5)
341
+ out[model] = resp.quota
342
+ except MaxUnavailable as e:
343
+ # already cached in _update_cache
344
+ cache = load_quota_cache()
345
+ out[model] = cache.get(model, QuotaState(model=model, status="rate_limited",
346
+ reset_at=e.reset_at))
347
+ except MaxAuthError:
348
+ raise
349
+ return out
350
+
351
+
352
+ if __name__ == "__main__":
353
+ # CLI self-test
354
+ import sys
355
+ if len(sys.argv) > 1 and sys.argv[1] == "probe":
356
+ for model, q in probe_and_refresh_cache().items():
357
+ print(f"{model}: {q.status} util5h={q.utilization_5h:.2f} "
358
+ f"reset_in={q.seconds_until_reset}s")
359
+ elif len(sys.argv) > 1 and sys.argv[1] == "pick":
360
+ print(pick_max_model() or "NONE_AVAILABLE")
361
+ else:
362
+ # quick call
363
+ m = pick_max_model() or MODEL_HAIKU
364
+ r = call_max(m, [{"role": "user", "content": sys.argv[1] if len(sys.argv) > 1 else "hi"}], max_tokens=50)
365
+ print(f"[{r.model_served}] {r.content[:200]}")
bin/lib/openrouter_client.py ADDED
@@ -0,0 +1,195 @@
1
+ """OpenRouter client — free-first then paid tiers.
2
+
3
+ Tiers (per Ashira 2026-04-19):
4
+ FREE: qwen, gpt-oss, llama, nemotron, glm
5
+ CHEAP: deepseek-v3.2, grok-4.1-fast
6
+ PREMIUM: gpt-5.4, claude-haiku-4.5, claude-sonnet-4.6, claude-opus-4.7
7
+
8
+ Per-model cooldown tracked in ~/.surrogate/yolo/or-cooldowns.json to avoid
9
+ hammering rate-limited free models.
10
+ """
11
+
12
+ from __future__ import annotations
13
+
14
+ import json
15
+ import os
16
+ import time
17
+ import urllib.error
18
+ import urllib.request
19
+ from dataclasses import dataclass, field
20
+ from pathlib import Path
21
+ from typing import Optional
22
+
23
+ OR_URL = "https://openrouter.ai/api/v1/chat/completions"
24
+ COOLDOWN_PATH = Path.home() / ".surrogate" / "yolo" / "or-cooldowns.json"
25
+
26
+ FREE_MODELS = [
27
+ "qwen/qwen3-coder:free",
28
+ "openai/gpt-oss-120b:free",
29
+ "meta-llama/llama-3.3-70b-instruct:free",
30
+ "nvidia/nemotron-3-super-120b-a12b:free",
31
+ "z-ai/glm-4.5-air:free",
32
+ ]
33
+
34
+ CHEAP_MODELS = [
35
+ "deepseek/deepseek-v3.2",
36
+ "x-ai/grok-4.1-fast",
37
+ ]
38
+
39
+ PREMIUM_MODELS = [
40
+ "openai/gpt-5.4",
41
+ "anthropic/claude-haiku-4.5",
42
+ "anthropic/claude-sonnet-4.6",
43
+ "x-ai/grok-4.20",
44
+ "anthropic/claude-opus-4.7",
45
+ ]
46
+
47
+ DEFAULT_COOLDOWN_SECONDS = 60 # after 429, wait 60s before retrying this model
48
+
49
+
50
+ class ORUnavailable(Exception):
51
+ def __init__(self, model: str, code: int, body: str):
52
+ self.model = model
53
+ self.code = code
54
+ self.body = body
55
+ super().__init__(f"OR {model}: {code} {body[:200]}")
56
+
57
+
58
+ @dataclass
59
+ class ORResponse:
60
+ content: str
61
+ model_requested: str
62
+ model_served: str
63
+ input_tokens: int = 0
64
+ output_tokens: int = 0
65
+
66
+
67
+ def _load_cooldowns() -> dict[str, float]:
68
+ if not COOLDOWN_PATH.exists():
69
+ return {}
70
+ try:
71
+ return json.loads(COOLDOWN_PATH.read_text())
72
+ except (json.JSONDecodeError, OSError):
73
+ return {}
74
+
75
+
76
+ def _save_cooldowns(c: dict[str, float]) -> None:
77
+ COOLDOWN_PATH.parent.mkdir(parents=True, exist_ok=True)
78
+ COOLDOWN_PATH.write_text(json.dumps(c))
79
+
80
+
81
+ def is_on_cooldown(model: str) -> bool:
82
+ c = _load_cooldowns()
83
+ return c.get(model, 0) > time.time()
84
+
85
+
86
+ def mark_cooldown(model: str, seconds: int = DEFAULT_COOLDOWN_SECONDS) -> None:
87
+ c = _load_cooldowns()
88
+ c[model] = time.time() + seconds
89
+ # Prune expired entries
90
+ c = {k: v for k, v in c.items() if v > time.time()}
91
+ _save_cooldowns(c)
92
+
93
+
94
+ def call_openrouter(
95
+ model: str,
96
+ messages: list[dict],
97
+ max_tokens: int = 4000,
98
+ system: Optional[str] = None,
99
+ timeout: int = 120,
100
+ ) -> ORResponse:
101
+ """Call OpenRouter directly. Raises ORUnavailable on error."""
102
+ api_key = os.environ.get("OPENROUTER_API_KEY", "")
103
+ if not api_key:
104
+ # Try loading from .env (accepts both `KEY=val` and `export KEY=val` formats)
105
+ env_file = Path.home() / ".surrogate" / ".env"
106
+ if env_file.exists():
107
+ for line in env_file.read_text().splitlines():
108
+ s = line.strip()
109
+ if s.startswith("export "):
110
+ s = s[len("export "):].lstrip()
111
+ if s.startswith("OPENROUTER_API_KEY="):
112
+ api_key = s.split("=", 1)[1].strip().strip('"').strip("'")
113
+ break
114
+ if not api_key:
115
+ raise ORUnavailable(model, 0, "OPENROUTER_API_KEY not set")
116
+
117
+ body_msgs = list(messages)
118
+ if system:
119
+ body_msgs = [{"role": "system", "content": system}] + body_msgs
120
+
121
+ body = json.dumps({
122
+ "model": model,
123
+ "max_tokens": max_tokens,
124
+ "messages": body_msgs,
125
+ }).encode()
126
+
127
+ req = urllib.request.Request(
128
+ OR_URL,
129
+ data=body,
130
+ headers={
131
+ "Authorization": f"Bearer {api_key}",
132
+ "HTTP-Referer": "https://github.com/Ashira/axentx",
133
+ "X-Title": "axentx-smart-dispatcher",
134
+ "content-type": "application/json",
135
+ },
136
+ method="POST",
137
+ )
138
+ try:
139
+ with urllib.request.urlopen(req, timeout=timeout) as r:
140
+ data = json.loads(r.read())
141
+ if "choices" not in data:
142
+ raise ORUnavailable(model, 0, str(data)[:200])
143
+ choice = data["choices"][0]
144
+ content = choice["message"]["content"]
145
+ usage = data.get("usage", {})
146
+ return ORResponse(
147
+ content=content,
148
+ model_requested=model,
149
+ model_served=data.get("model", model),
150
+ input_tokens=usage.get("prompt_tokens", 0),
151
+ output_tokens=usage.get("completion_tokens", 0),
152
+ )
153
+ except urllib.error.HTTPError as e:
154
+ body = e.read().decode()
155
+ # 429 or 503 → mark cooldown
156
+ if e.code in (429, 503, 502):
157
+ mark_cooldown(model)
158
+ raise ORUnavailable(model, e.code, body)
159
+ except Exception as e: # network errors
160
+ raise ORUnavailable(model, 0, str(e))
161
+
162
+
163
+ def pick_free() -> Optional[str]:
164
+ """First free model not on cooldown."""
165
+ for m in FREE_MODELS:
166
+ if not is_on_cooldown(m):
167
+ return m
168
+ return None
169
+
170
+
171
+ def pick_cheap() -> Optional[str]:
172
+ for m in CHEAP_MODELS:
173
+ if not is_on_cooldown(m):
174
+ return m
175
+ return None
176
+
177
+
178
+ def pick_premium() -> Optional[str]:
179
+ for m in PREMIUM_MODELS:
180
+ if not is_on_cooldown(m):
181
+ return m
182
+ return None
183
+
184
+
185
+ if __name__ == "__main__":
186
+ import sys
187
+ if len(sys.argv) > 1 and sys.argv[1] == "pick":
188
+ print(f"free: {pick_free()}")
189
+ print(f"cheap: {pick_cheap()}")
190
+ print(f"premium: {pick_premium()}")
191
+ else:
192
+ m = pick_free() or pick_cheap() or pick_premium()
193
+ q = sys.argv[1] if len(sys.argv) > 1 else "say OK"
194
+ r = call_openrouter(m, [{"role": "user", "content": q}], max_tokens=30)
195
+ print(f"[{r.model_served}] {r.content[:100]}")
bin/lib/prompt_cache.py ADDED
@@ -0,0 +1,17 @@
1
+ """Anthropic prompt caching helper — adds cache_control to messages so repeated
2
+ system prompts / long contexts cost 10% of full price.
3
+ Usage: import this in any bridge that calls Anthropic API directly.
4
+ """
5
+ def add_cache_control(messages, threshold=2048):
6
+ """Add cache_control to the longest system message if it's over threshold chars.
7
+ Anthropic cache: https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching
8
+ Requires anthropic-beta: prompt-caching-2024-07-31 header."""
9
+ if not messages: return messages
10
+ for m in messages:
11
+ if m.get('role') == 'system' and isinstance(m.get('content'), str):
12
+ if len(m['content']) >= threshold:
13
+ # Convert to structured content with cache marker
14
+ m['content'] = [{'type': 'text', 'text': m['content'],
15
+ 'cache_control': {'type': 'ephemeral'}}]
16
+ break
17
+ return messages
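A sketch of what the helper does to a long system message, and the beta header the caller still has to send (per the docstring above):

from prompt_cache import add_cache_control

messages = [
    {"role": "system", "content": "x" * 4096},  # long, repeated system prompt
    {"role": "user", "content": "summarise today's log"},
]
messages = add_cache_control(messages)
# messages[0]["content"] is now a one-element list:
#   [{"type": "text", "text": "xx...", "cache_control": {"type": "ephemeral"}}]

# The request itself must still opt in:
headers = {"anthropic-beta": "prompt-caching-2024-07-31"}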
bin/lib/review_agent.py ADDED
@@ -0,0 +1,328 @@
1
+ """Review agent — tier-gated + consensus + ground-truth.
2
+
3
+ Replaces the simple review() in smart_dispatcher.py. Rules:
4
+
5
+ 1. Reviewer rank >= Writer rank (strict)
6
+ 2. Reviewer provider != Writer provider (cross-provider)
7
+ 3. For `critical=True` tasks: Reviewer rank >= Writer rank + 1, and 2-of-3 consensus
8
+ 4. If no eligible reviewer is available RIGHT NOW → block (queue-wait),
9
+ retry when cache refreshes. DO NOT downgrade to lower tier.
10
+ 5. Ground-truth check runs alongside reviewer opinion:
11
+ code has blocking compile/parse failure → hard-fail regardless of reviewer
12
+ """
13
+
14
+ from __future__ import annotations
15
+
16
+ import json
17
+ import re
18
+ import sys
19
+ import time
20
+ from pathlib import Path
21
+ from typing import Optional
22
+
23
+ sys.path.insert(0, str(Path(__file__).parent))
24
+
25
+ from ground_truth import check as gt_check
26
+ from max_client import MAX_TIER_ORDER, MaxUnavailable, call_max, pick_max_model
27
+ from openrouter_client import (
28
+ CHEAP_MODELS,
29
+ FREE_MODELS,
30
+ PREMIUM_MODELS,
31
+ ORUnavailable,
32
+ call_openrouter,
33
+ is_on_cooldown,
34
+ )
35
+ from tier_rank import _provider_family, is_eligible_reviewer, pick_reviewer_from, rank
36
+
37
+
38
+ REVIEWER_SYSTEM = """You are a strict code review agent.
39
+
40
+ Your job:
41
+ 1. Check if the work fully addresses the task
42
+ 2. Check for correctness (syntax, logic, hallucinations)
43
+ 3. Check for completeness (edge cases, error handling)
44
+ 4. Rate severity of issues (low | med | high)
45
+
46
+ Output JSON only (no markdown, no prose):
47
+ {
48
+ "verdict": "pass" | "needs_revision",
49
+ "score": 0-10,
50
+ "issues": [{"severity":"low|med|high","desc":"..."}],
51
+ "suggestions": ["...", "..."],
52
+ "reasoning": "1-2 sentences"
53
+ }
54
+
55
+ Rules:
56
+ - Any "high" severity issue → always "needs_revision"
57
+ - If you detect hallucinated APIs/functions → "needs_revision" with severity=high
58
+ - Be rigorous — pass only when genuinely good
59
+ """
60
+
61
+
62
+ class NoEligibleReviewer(Exception):
63
+ """No reviewer currently available at required tier. Queue-wait."""
64
+
65
+
66
+ def _available_reviewers() -> list[str]:
67
+ """Enumerate all currently available reviewer candidates.
68
+
69
+ Max plan tiers (check quota) + OR tiers (check cooldowns).
70
+ """
71
+ cands: list[str] = []
72
+
73
+ # Max tiers (use pick_max_model to respect cache)
74
+ # We collect all three; caller picks based on tier
75
+ for m in MAX_TIER_ORDER:
76
+ # only include if not currently rate-limited long-term
77
+ from max_client import load_quota_cache
78
+ q = load_quota_cache().get(m)
79
+ if not q or q.status == "allowed" or q.seconds_until_reset < 60:
80
+ cands.append(m)
81
+
82
+ # OR tiers
83
+ for m in PREMIUM_MODELS + CHEAP_MODELS + FREE_MODELS:
84
+ if not is_on_cooldown(m):
85
+ cands.append(m)
86
+ return cands
87
+
88
+
89
+ def _call_model_for_review(model: str, prompt: str, system: str) -> tuple[str, str]:
90
+ """Route to Max or OR depending on model name. Returns (text, served_model_id)."""
91
+ if model in MAX_TIER_ORDER:
92
+ r = call_max(model, [{"role": "user", "content": prompt}],
93
+ max_tokens=1500, system=system, timeout=120)
94
+ return r.content, r.model_served
95
+ r = call_openrouter(model, [{"role": "user", "content": prompt}],
96
+ max_tokens=1500, system=system, timeout=120)
97
+ return r.content, r.model_served
98
+
99
+
100
+ def _parse_json_verdict(text: str) -> dict:
101
+ text = text.strip()
102
+ if text.startswith("```"):
103
+ text = text.split("```", 2)[1] if "```" in text[3:] else text[3:]
104
+ text = text.lstrip("json").lstrip()
105
+ if "```" in text:
106
+ text = text.rsplit("```", 1)[0]
107
+ try:
108
+ return json.loads(text)
109
+ except json.JSONDecodeError:
110
+ m = re.search(r"\{.*\}", text, re.DOTALL)
111
+ if m:
112
+ try:
113
+ return json.loads(m.group(0))
114
+ except json.JSONDecodeError:
115
+ pass
116
+ return {"verdict": "needs_revision", "reasoning": "review parse failed",
117
+ "raw": text[:500], "score": 0, "issues": [], "suggestions": []}
118
+
119
+
120
+ def review_once(
121
+ task_prompt: str,
122
+ work_product: str,
123
+ writer_model: str,
124
+ critical: bool = False,
125
+ queue_wait_max_seconds: int = 600,
126
+ poll_interval: int = 15,
127
+ ) -> dict:
128
+ """Single-reviewer review with tier enforcement.
129
+
130
+ Blocks (queue-wait) up to queue_wait_max_seconds if no eligible reviewer.
131
+ Raises NoEligibleReviewer after timeout.
132
+ """
133
+ deadline = time.time() + queue_wait_max_seconds
134
+
135
+ reviewer: Optional[str] = None
136
+ waits = 0
137
+ while time.time() < deadline:
138
+ cands = _available_reviewers()
139
+ reviewer = pick_reviewer_from(cands, writer_model, critical=critical)
140
+ if reviewer:
141
+ break
142
+ waits += 1
143
+ time.sleep(poll_interval)
144
+
145
+ if not reviewer:
146
+ raise NoEligibleReviewer(
147
+ f"no reviewer with rank>={rank(writer_model) + (1 if critical else 0)} "
148
+ f"and provider!={_provider_family(writer_model)} after {queue_wait_max_seconds}s"
149
+ )
150
+
151
+ review_prompt = f"""# TASK
152
+ {task_prompt}
153
+
154
+ # WORK PRODUCT
155
+ {work_product}
156
+
157
+ # YOUR REVIEW (valid JSON only):"""
158
+
159
+ try:
160
+ text, served = _call_model_for_review(reviewer, review_prompt, REVIEWER_SYSTEM)
161
+ except (MaxUnavailable, ORUnavailable) as e:
162
+ # Reviewer itself errored — retry with fresh pool
163
+ return {"verdict": "needs_revision", "reasoning": f"reviewer call failed: {e}",
164
+ "reviewer_model": reviewer, "score": 0,
165
+ "transport_error": True}
166
+
167
+ parsed = _parse_json_verdict(text)
168
+ parsed["reviewer_model"] = served
169
+ parsed["reviewer_provider_family"] = _provider_family(served)
170
+ parsed["reviewer_rank"] = rank(served)
171
+ parsed["writer_rank"] = rank(writer_model)
172
+ parsed["wait_cycles"] = waits
173
+ return parsed
174
+
175
+
176
+ def review_with_consensus(
177
+ task_prompt: str,
178
+ work_product: str,
179
+ writer_model: str,
180
+ num_reviewers: int = 3,
181
+ required_agree: int = 2,
182
+ critical: bool = True,
183
+ queue_wait_max_seconds: int = 600,
184
+ ) -> dict:
185
+ """Multi-reviewer consensus review. Used for critical tasks.
186
+
187
+ Picks N reviewers from DIFFERENT provider families (+ cross-provider from writer).
188
+ Verdict = pass if required_agree reviewers say "pass".
189
+ """
190
+ deadline = time.time() + queue_wait_max_seconds
191
+ reviewers: list[str] = []
192
+ used_families: set[str] = {_provider_family(writer_model)}
193
+
194
+ # Collect N reviewers from N distinct families
195
+ while len(reviewers) < num_reviewers and time.time() < deadline:
196
+ cands = _available_reviewers()
197
+ # Filter: eligible + family not yet used
198
+ new_picks: list[str] = []
199
+ for c in cands:
200
+ fam = _provider_family(c)
201
+ if fam in used_families:
202
+ continue
203
+ ok, _ = is_eligible_reviewer(writer_model, c, critical=critical)
204
+ if ok:
205
+ new_picks.append(c)
206
+ # Pick highest rank per family
207
+ by_family: dict[str, tuple[int, str]] = {}
208
+ for c in new_picks:
209
+ fam = _provider_family(c)
210
+ r = rank(c)
211
+ if fam not in by_family or by_family[fam][0] < r:
212
+ by_family[fam] = (r, c)
213
+ for fam, (_, model) in sorted(by_family.items(), key=lambda x: -x[1][0]):
214
+ if len(reviewers) >= num_reviewers:
215
+ break
216
+ reviewers.append(model)
217
+ used_families.add(fam)
218
+ if len(reviewers) < num_reviewers:
219
+ time.sleep(15)
220
+
221
+ if len(reviewers) < required_agree:
222
+ raise NoEligibleReviewer(
223
+ f"consensus needs {required_agree} distinct-family reviewers, got {len(reviewers)}"
224
+ )
225
+
226
+ # Fire reviews
227
+ individual_verdicts: list[dict] = []
228
+ for rv in reviewers:
229
+ try:
230
+ # Call this specific reviewer directly rather than through review_once(),
231
+ # which would re-pick its own top-ranked candidate and burn a duplicate
232
+ # review call; each reviewer selected above must cast its own vote.
233
+ # (Eligibility vs. the writer was already enforced when building `reviewers`.)
234
+ text, served = _call_model_for_review(
235
+ rv,
236
+ f"# TASK\n{task_prompt}\n\n# WORK PRODUCT\n{work_product}\n\n# YOUR REVIEW (JSON):",
237
+ REVIEWER_SYSTEM,
238
+ )
239
+ parsed = _parse_json_verdict(text)
240
+ parsed["reviewer_model"] = served
241
+ parsed["reviewer_rank"] = rank(served)
242
+ parsed["reviewer_provider_family"] = _provider_family(served)
243
+ individual_verdicts.append(parsed)
244
+ except (MaxUnavailable, ORUnavailable) as e:
245
+ individual_verdicts.append(
246
+ {"verdict": "needs_revision", "reasoning": f"reviewer error: {e}",
247
+ "reviewer_model": rv, "transport_error": True}
248
+ )
249
+
250
+ passes = sum(1 for v in individual_verdicts if v.get("verdict") == "pass")
251
+ consensus_verdict = "pass" if passes >= required_agree else "needs_revision"
252
+
253
+ # Aggregate issues from ALL reviewers (even if majority passes)
254
+ all_issues: list[dict] = []
255
+ all_suggestions: list[str] = []
256
+ for v in individual_verdicts:
257
+ all_issues.extend(v.get("issues", []) or [])
258
+ all_suggestions.extend(v.get("suggestions", []) or [])
259
+
260
+ return {
261
+ "verdict": consensus_verdict,
262
+ "consensus_pass_count": passes,
263
+ "consensus_required": required_agree,
264
+ "individual_verdicts": individual_verdicts,
265
+ "issues": all_issues,
266
+ "suggestions": all_suggestions,
267
+ "reviewers": [v.get("reviewer_model") for v in individual_verdicts],
268
+ "writer_rank": rank(writer_model),
269
+ "reasoning": f"consensus {passes}/{len(individual_verdicts)} pass (required {required_agree})",
270
+ }
271
+
272
+
273
+ def review_full(
274
+ task_prompt: str,
275
+ work_product: str,
276
+ writer_model: str,
277
+ critical: bool = False,
278
+ use_consensus: bool = False,
279
+ ) -> dict:
280
+ """Full review = reviewer opinion + ground-truth check.
281
+
282
+ Ground-truth BLOCKING failure → hard fail regardless of reviewer.
283
+ """
284
+ # 1. Ground-truth
285
+ gt = gt_check(work_product)
286
+
287
+ # 2. Reviewer opinion
288
+ if use_consensus:
289
+ reviewer = review_with_consensus(
290
+ task_prompt, work_product, writer_model,
291
+ num_reviewers=3, required_agree=2, critical=critical,
292
+ )
293
+ else:
294
+ reviewer = review_once(task_prompt, work_product, writer_model, critical=critical)
295
+
296
+ # 3. Combine
297
+ final_verdict = reviewer.get("verdict", "needs_revision")
298
+ if gt.get("blocking_failure"):
299
+ final_verdict = "needs_revision"
300
+
301
+ return {
302
+ "verdict": final_verdict,
303
+ "reviewer": reviewer,
304
+ "ground_truth": gt,
305
+ "override_by_ground_truth": gt.get("blocking_failure", False),
306
+ }
307
+
308
+
309
+ if __name__ == "__main__":
310
+ import sys
311
+ if len(sys.argv) < 3:
312
+ print("usage: review_agent.py <task-prompt> <work-product-file>")
313
+ sys.exit(1)
314
+ task = sys.argv[1]
315
+ work = Path(sys.argv[2]).read_text()
316
+ writer = sys.argv[3] if len(sys.argv) > 3 else "claude-haiku-4-5-20251001"
317
+ critical = "--critical" in sys.argv
318
+ consensus = "--consensus" in sys.argv
319
+ r = review_full(task, work, writer, critical=critical, use_consensus=consensus)
320
+ print(json.dumps({
321
+ "verdict": r["verdict"],
322
+ "ground_truth_verdict": r["ground_truth"]["verdict"],
323
+ "ground_truth_blocking": r["ground_truth"]["blocking_failure"],
324
+ "override_by_ground_truth": r["override_by_ground_truth"],
325
+ "reviewer_model": r["reviewer"].get("reviewer_model"),
326
+ "reviewer_rank": r["reviewer"].get("reviewer_rank"),
327
+ "reviewer_verdict": r["reviewer"].get("verdict"),
328
+ }, indent=2))
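A sketch of the combined gate above: reviewer opinion plus ground truth, with the queue-wait exception surfaced to the caller. The writer model ID is one of the Max IDs defined in max_client; the task and work product are illustrative only:

from review_agent import NoEligibleReviewer, review_full

task = "Write a bash one-liner that counts *.jsonl files under ~/data"
work = "```bash\nfind ~/data -name '*.jsonl' | wc -l\n```"

try:
    verdict = review_full(task, work, writer_model="claude-haiku-4-5-20251001")
except NoEligibleReviewer as exc:
    # No reviewer of sufficient rank right now: queue the task, never downgrade.
    raise SystemExit(f"blocked: {exc}")

print(verdict["verdict"])                   # "pass" or "needs_revision"
print(verdict["override_by_ground_truth"])  # True when a compile failure forced the fail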
bin/lib/smart_dispatcher.py ADDED
@@ -0,0 +1,420 @@
1
+ """Smart dispatcher — Max plan → OR free → OR paid with checkpoint + review.
2
+
3
+ Tier priority (per Ashira 2026-04-19):
4
+ 1. Max Opus 4.x (leverage flat-rate first)
5
+ 2. Max Sonnet 4.x (same plan, same pool typically)
6
+ 3. Max Haiku 4.x (cheapest Max tier)
7
+ 4. OR FREE models (qwen / gpt-oss / llama / nemotron / glm)
8
+ 5. OR CHEAP paid (deepseek / grok-fast)
9
+ 6. OR PREMIUM paid (gpt-5 / claude-opus / claude-sonnet via OR)
10
+
11
+ Continuous re-check: every 5 min probe Max tiers — if Opus/Sonnet come back
12
+ available, subsequent calls return to them (honor Max plan flat-rate).
13
+
14
+ Review retry: INFINITE per Ashira — runs revisions until reviewer passes.
15
+ """
16
+
17
+ from __future__ import annotations
18
+
19
+ import datetime as dt
20
+ import json
21
+ import sys
22
+ import time
23
+ from pathlib import Path
24
+ from typing import Callable, Optional
25
+
26
+ sys.path.insert(0, str(Path(__file__).parent))
27
+
28
+ from checkpoint import Checkpoint
29
+ from codebase_scanner import as_context_prompt, scan
30
+ from max_client import (
31
+ MAX_TIER_ORDER,
32
+ MODEL_HAIKU,
33
+ MODEL_OPUS,
34
+ MODEL_SONNET,
35
+ MaxAuthError,
36
+ MaxUnavailable,
37
+ call_max,
38
+ pick_max_model,
39
+ probe_and_refresh_cache,
40
+ )
41
+ from openrouter_client import (
42
+ CHEAP_MODELS,
43
+ FREE_MODELS,
44
+ PREMIUM_MODELS,
45
+ ORResponse,
46
+ ORUnavailable,
47
+ call_openrouter,
48
+ is_on_cooldown,
49
+ )
50
+ from review_agent import NoEligibleReviewer, review_full
51
+
52
+
53
+ LAST_MAX_PROBE: list[float] = [0.0]
54
+ MAX_PROBE_INTERVAL = 300 # 5 min
55
+
56
+
57
+ class DispatchResult:
58
+ def __init__(self, text: str, provider: str, model: str, input_tokens: int = 0, output_tokens: int = 0):
59
+ self.text = text
60
+ self.provider = provider
61
+ self.model = model
62
+ self.input_tokens = input_tokens
63
+ self.output_tokens = output_tokens
64
+
65
+
66
+ def _tier_iter() -> list[tuple[str, list[str]]]:
67
+ """Ordered tiers to try in strict priority."""
68
+ return [
69
+ ("max", MAX_TIER_ORDER),
70
+ ("or_free", FREE_MODELS),
71
+ ("or_cheap", CHEAP_MODELS),
72
+ ("or_premium", PREMIUM_MODELS),
73
+ ]
74
+
75
+
76
+ def _maybe_probe_max() -> None:
77
+ """Every 5 min, send minimal probes to each Max tier to refresh cache."""
78
+ if time.time() - LAST_MAX_PROBE[0] > MAX_PROBE_INTERVAL:
79
+ try:
80
+ probe_and_refresh_cache()
81
+ LAST_MAX_PROBE[0] = time.time()
82
+ except MaxAuthError:
83
+ pass # handled at call time
84
+
85
+
86
+ def dispatch(
87
+ prompt: str,
88
+ system: Optional[str] = None,
89
+ task_id: Optional[str] = None,
90
+ max_tokens: int = 4096,
91
+ checkpoint: Optional[Checkpoint] = None,
92
+ prefer_max: bool = True,
93
+ exclude_providers: set[str] | None = None,
94
+ on_attempt: Optional[Callable[[str, str], None]] = None,
95
+ ) -> DispatchResult:
96
+ """Try tiers in order until one succeeds. Logs to checkpoint.
97
+
98
+ Args:
99
+ prompt: user message
100
+ system: system prompt (optional)
101
+ task_id: for logging
102
+ max_tokens: output cap
103
+ checkpoint: Checkpoint instance for event logging
104
+ prefer_max: try Max first (True) — set False for review agent (cross-provider)
105
+ exclude_providers: skip these providers (e.g. {"max"} to force OR)
106
+ on_attempt: callback(provider, model) called per attempt (for debugging)
107
+
108
+ Returns DispatchResult or raises if ALL tiers exhausted.
109
+ """
110
+ exclude = exclude_providers or set()
111
+ messages = [{"role": "user", "content": prompt}]
112
+ _maybe_probe_max()
113
+
114
+ tiers = _tier_iter()
115
+ if not prefer_max:
116
+ tiers = [t for t in tiers if t[0] != "max"]
117
+
118
+ errors: list[str] = []
119
+
120
+ for tier_name, models in tiers:
121
+ if tier_name in exclude:
122
+ continue
123
+
124
+ if tier_name == "max":
125
+ m = pick_max_model()
126
+ if m is None:
127
+ errors.append("max: all tiers rate-limited")
128
+ continue
129
+ if on_attempt:
130
+ on_attempt("max", m)
131
+ if checkpoint:
132
+ checkpoint.append("provider_selected", provider="max", model=m)
133
+ try:
134
+ r = call_max(m, messages, max_tokens=max_tokens, system=system)
135
+ if checkpoint:
136
+ checkpoint.append("provider_success", provider="max", model=m,
137
+ content_preview=r.content[:200],
138
+ input_tokens=r.input_tokens,
139
+ output_tokens=r.output_tokens)
140
+ return DispatchResult(r.content, "max", m, r.input_tokens, r.output_tokens)
141
+ except MaxUnavailable as e:
142
+ errors.append(f"max:{m} 429 (reset {e.reset_at})")
143
+ if checkpoint:
144
+ checkpoint.append("provider_failed", provider="max", model=m,
145
+ reason=f"rate_limit reset_at={e.reset_at}")
146
+ continue
147
+ except MaxAuthError as e:
148
+ errors.append(f"max auth: {e}")
149
+ if checkpoint:
150
+ checkpoint.append("provider_failed", provider="max", reason=f"auth: {e}")
151
+ # Max totally broken — skip tier but keep going with OR
152
+ continue
153
+ else:
154
+ # OR tier
155
+ for m in models:
156
+ if is_on_cooldown(m):
157
+ continue
158
+ if on_attempt:
159
+ on_attempt(tier_name, m)
160
+ if checkpoint:
161
+ checkpoint.append("provider_selected", provider=tier_name, model=m)
162
+ try:
163
+ r = call_openrouter(m, messages, max_tokens=max_tokens, system=system)
164
+ if checkpoint:
165
+ checkpoint.append("provider_success", provider=tier_name, model=m,
166
+ content_preview=r.content[:200],
167
+ input_tokens=r.input_tokens,
168
+ output_tokens=r.output_tokens)
169
+ return DispatchResult(r.content, tier_name, m, r.input_tokens, r.output_tokens)
170
+ except ORUnavailable as e:
171
+ errors.append(f"{tier_name}:{m} {e.code}")
172
+ if checkpoint:
173
+ checkpoint.append("provider_failed", provider=tier_name, model=m,
174
+ reason=f"{e.code}: {e.body[:100]}")
175
+ continue
176
+
177
+ # All tiers exhausted
178
+ raise RuntimeError(f"all providers exhausted: {errors}")
179
+
180
+
181
+ # ----------------------------------------------------------------------
182
+ # Review agent (cross-provider debate)
183
+ # ----------------------------------------------------------------------
184
+ REVIEWER_SYSTEM = """You are a strict code review agent. You review another AI's work for a given task.
185
+ Your job:
186
+ 1. Check if the work fully addresses the task
187
+ 2. Check for correctness (syntax, logic, hallucinations)
188
+ 3. Check for completeness (edge cases, error handling)
189
+ 4. Rate severity of issues
190
+
191
+ Output JSON only, no prose:
192
+ {
193
+ "verdict": "pass" | "needs_revision",
194
+ "score": 0-10,
195
+ "issues": [{"severity":"low|med|high","desc":"..."}],
196
+ "suggestions": ["...", "..."],
197
+ "reasoning": "1-2 sentences"
198
+ }
199
+
200
+ If no issues, "pass". If ANY "high" severity issue → always "needs_revision"."""
201
+
202
+
203
+ def review(
204
+ task_prompt: str,
205
+ work_product: str,
206
+ writer_provider: str,
207
+ checkpoint: Optional[Checkpoint] = None,
208
+ ) -> dict:
209
+ """Send work for cross-provider review. Uses different provider than writer.
210
+
211
+ Returns:
212
+ {"verdict": "pass|needs_revision", "score": int, "issues": [...],
213
+ "suggestions": [...], "reasoning": "...", "reviewer_model": "..."}
214
+ """
215
+ # Cross-provider: if writer was Max/Anthropic → reviewer from OR non-Anthropic
216
+ exclude = set()
217
+ if writer_provider == "max":
218
+ exclude.add("max") # reviewer uses OR
219
+
220
+ review_prompt = f"""# TASK ORIGINAL
221
+ {task_prompt}
222
+
223
+ # WORK PRODUCT TO REVIEW
224
+ {work_product}
225
+
226
+ # YOUR REVIEW (JSON only):"""
227
+
228
+ if checkpoint:
229
+ checkpoint.append("review_requested", writer_provider=writer_provider)
230
+
231
+ result = dispatch(
232
+ prompt=review_prompt,
233
+ system=REVIEWER_SYSTEM,
234
+ checkpoint=checkpoint,
235
+ max_tokens=1500,
236
+ exclude_providers=exclude,
237
+ prefer_max=(writer_provider != "max"),
238
+ )
239
+
240
+ # Parse JSON from response
241
+ text = result.text.strip()
242
+ # Strip markdown fence
243
+ if text.startswith("```"):
244
+ text = text.split("```", 2)[1] if "```" in text[3:] else text[3:]
245
+ text = text.lstrip("json").lstrip()
246
+ if "```" in text:
247
+ text = text.rsplit("```", 1)[0]
248
+ try:
249
+ parsed = json.loads(text)
250
+ except json.JSONDecodeError:
251
+ # Look for {...} block
252
+ import re
253
+ m = re.search(r"\{.*\}", text, re.DOTALL)
254
+ if m:
255
+ try:
256
+ parsed = json.loads(m.group(0))
257
+ except json.JSONDecodeError:
258
+ parsed = {"verdict": "needs_revision", "reasoning": "review parse failed",
259
+ "raw": text[:500]}
260
+ else:
261
+ parsed = {"verdict": "needs_revision", "reasoning": "review parse failed",
262
+ "raw": text[:500]}
263
+
264
+ parsed["reviewer_provider"] = result.provider
265
+ parsed["reviewer_model"] = result.model
266
+ if checkpoint:
267
+ checkpoint.append("review_verdict", **parsed)
268
+ return parsed
269
+
270
+
271
+ # ----------------------------------------------------------------------
272
+ # Full orchestration
273
+ # ----------------------------------------------------------------------
274
+ def execute_task(
275
+ task_id: str,
276
+ prompt: str,
277
+ system_base: str = "",
278
+ max_tokens: int = 4096,
279
+ max_review_iterations: int = 0, # 0 = infinite (per Ashira)
280
+ codebase_artifacts: list[str] | None = None,
281
+ critical: bool = False, # True → reviewer rank > writer + consensus 2/3
282
+ use_consensus: bool = False, # True → 2-of-3 reviewers vote
283
+ ) -> dict:
284
+ """End-to-end: scan codebase → dispatch → review → revise until pass.
285
+
286
+ Returns: {"task_id","final_text","iterations","reviewer_verdict",...}
287
+ """
288
+ cp = Checkpoint.open(task_id)
289
+
290
+ # Resume support
291
+ existing_state = cp.resume_state()
292
+ iteration = existing_state["review_iterations"]
293
+ draft = existing_state["draft_text"]
294
+ if existing_state["completed"]:
295
+ return {"task_id": task_id, "status": "already_done",
296
+ "final_text": draft, "iterations": iteration}
297
+
298
+ if not existing_state["started"]:
299
+ cp.append("task_start", prompt=prompt[:500])
300
+
301
+ # Phase 1: codebase review
302
+ report = scan(prompt, codebase_artifacts)
303
+ cp.append("codebase_review",
304
+ artifacts=[f["path"] for f in report["recent_files"][:15]],
305
+ uncommitted_repos=len(report["uncommitted_repos"]),
306
+ semantic_hits=len(report["semantic_hits"]))
307
+ codebase_ctx = as_context_prompt(report, 6000)
308
+ system = (system_base + "\n\n" + codebase_ctx).strip()
309
+ else:
310
+ # Resume: re-scan codebase (may have changed)
311
+ report = scan(prompt, codebase_artifacts)
312
+ cp.append("codebase_review",
313
+ artifacts=[f["path"] for f in report["recent_files"][:15]],
314
+ resumed=True)
315
+ codebase_ctx = as_context_prompt(report, 6000)
316
+ system = (system_base + "\n\n" + codebase_ctx).strip()
317
+ # Include prior draft as context for continuation
318
+ if draft:
319
+ system += f"\n\n## Previous attempt (continue/refine this):\n{draft[:3000]}"
320
+
321
+ # Phase 2: dispatch + review loop
322
+ last_review: dict | None = None
323
+ accumulated_feedback = ""
324
+
325
+ while True:
326
+ iteration += 1
327
+ iter_prompt = prompt
328
+ if accumulated_feedback:
329
+ iter_prompt = f"{prompt}\n\n## Reviewer feedback from prior iteration (address these):\n{accumulated_feedback}"
330
+
331
+ result = dispatch(
332
+ prompt=iter_prompt,
333
+ system=system,
334
+ checkpoint=cp,
335
+ max_tokens=max_tokens,
336
+ )
337
+ draft = result.text
338
+ cp.append("result_draft", text=draft, iteration=iteration,
339
+ provider=result.provider, model=result.model)
340
+
341
+ # Review — tier-enforced + ground-truth via review_agent.review_full
342
+ try:
343
+ full_review = review_full(
344
+ task_prompt=prompt,
345
+ work_product=draft,
346
+ writer_model=result.model,
347
+ critical=critical,
348
+ use_consensus=use_consensus or critical,
349
+ )
350
+ cp.append("review_full",
351
+ verdict=full_review["verdict"],
352
+ reviewer_model=full_review["reviewer"].get("reviewer_model"),
353
+ reviewer_rank=full_review["reviewer"].get("reviewer_rank"),
354
+ writer_rank=full_review["reviewer"].get("writer_rank"),
355
+ ground_truth_verdict=full_review["ground_truth"]["verdict"],
356
+ ground_truth_blocking=full_review["ground_truth"]["blocking_failure"],
357
+ override_by_ground_truth=full_review["override_by_ground_truth"])
358
+ last_review = dict(full_review["reviewer"])
359
+ last_review["verdict"] = full_review["verdict"]
360
+ last_review["ground_truth"] = full_review["ground_truth"]
361
+ except NoEligibleReviewer as e:
362
+ cp.append("review_blocked", reason=str(e))
363
+ # Queue-wait: don't consume iteration, poll + retry
364
+ time.sleep(30)
365
+ iteration -= 1
366
+ continue
367
+
368
+ verdict = last_review.get("verdict", "needs_revision")
369
+ if verdict == "pass":
370
+ cp.append("task_done", iteration=iteration, final_length=len(draft))
371
+ cp.archive()
372
+ return {
373
+ "task_id": task_id,
374
+ "status": "done",
375
+ "final_text": draft,
376
+ "iterations": iteration,
377
+ "last_review": last_review,
378
+ "writer": f"{result.provider}/{result.model}",
379
+ }
380
+
381
+ # needs_revision — assemble feedback
382
+ issues = last_review.get("issues", [])
383
+ suggestions = last_review.get("suggestions", [])
384
+ fb_lines = []
385
+ for i in issues:
386
+ fb_lines.append(f"- [{i.get('severity','?')}] {i.get('desc','')}")
387
+ for s in suggestions:
388
+ fb_lines.append(f"- {s}")
389
+ accumulated_feedback = "\n".join(fb_lines) if fb_lines else last_review.get("reasoning", "")
390
+ cp.append("revision_requested", iteration=iteration,
391
+ feedback=accumulated_feedback[:500])
392
+
393
+ # Safety: if max_review_iterations > 0, enforce it. 0 = infinite.
394
+ if max_review_iterations > 0 and iteration >= max_review_iterations:
395
+ cp.append("task_failed", reason=f"max_iterations_{max_review_iterations}")
396
+ cp.archive()
397
+ return {
398
+ "task_id": task_id,
399
+ "status": "failed_max_iter",
400
+ "final_text": draft,
401
+ "iterations": iteration,
402
+ "last_review": last_review,
403
+ }
404
+
405
+
406
+ if __name__ == "__main__":
407
+ import uuid
408
+ if len(sys.argv) < 2:
409
+ print("usage: smart_dispatcher.py <prompt>")
410
+ sys.exit(1)
411
+ task_id = "adhoc-" + uuid.uuid4().hex[:8]
412
+ prompt = " ".join(sys.argv[1:])
413
+ r = execute_task(task_id, prompt, max_tokens=500)
414
+ print(json.dumps({
415
+ "task_id": r["task_id"],
416
+ "status": r["status"],
417
+ "iterations": r["iterations"],
418
+ "writer": r.get("writer"),
419
+ "preview": r["final_text"][:400],
420
+ }, indent=2))
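An ad-hoc run goes through the same scan → dispatch → review loop sketched above (the __main__ block generates a throwaway adhoc-<hex> task id and caps output at 500 tokens; call execute_task() directly when a stable task_id is needed for checkpoint resume). A minimal invocation, assuming the ~/.surrogate/bin layout from this commit:

  python3 ~/.surrogate/bin/lib/smart_dispatcher.py "Write a bash healthcheck for the local Ollama endpoint"
  # prints JSON with task_id, status, iterations, writer, preview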
bin/lib/tier_rank.py ADDED
@@ -0,0 +1,192 @@
1
+ """Model tier rank — enforces "reviewer >= writer" quality rule.
2
+
3
+ Rank scale (1-10, approximate SWE-Bench Verified + LMArena Q1 2026):
4
+ 10 Claude Opus 4.7, GPT-5.4
5
+ 9 Claude Sonnet 4.6, GPT-5.4-pro, Grok 4.20, Gemini 3.1 Pro
6
+ 8 Claude Opus 4.6, DeepSeek V3.2 (coding strong)
7
+ 7 Claude Haiku 4.5, Grok 4.1 Fast, Qwen 3.6 35B-MoE
8
+ 6 Llama 3.3 70B, Mistral Large 3, Kimi K2.5, Qwen 3.5 Coder 32B
9
+ 5 Nemotron 120B, GLM 4.5 Air, Qwen 3.5 Coder 14B
10
+ 4 GPT-OSS 120B, Gemma 4 31B
11
+ 3 GPT-OSS 20B, Llama 3.3 8B, small local
12
+
13
+ Policy (per Ashira 2026-04-19):
14
+ - Reviewer tier MUST be >= writer tier.
15
+ - For code/IaC/security tasks, prefer reviewer tier > writer by 1.
16
+ - If no eligible reviewer available → queue-wait (DO NOT downgrade writer).
17
+ """
18
+
19
+ from __future__ import annotations
20
+
21
+ TIER_RANK: dict[str, int] = {
22
+ # === 10: frontier ===
23
+ "anthropic/claude-opus-4.7": 10,
24
+ "openai/gpt-5.4": 10,
25
+ "openrouter/anthropic/claude-opus-4.7": 10,
26
+ "openrouter/openai/gpt-5.4": 10,
27
+
28
+ # === 9: premium ===
29
+ "anthropic/claude-sonnet-4.6": 9,
30
+ "openai/gpt-5.4-pro": 9,
31
+ "x-ai/grok-4.20": 9,
32
+ "google/gemini-3.1-pro": 9,
33
+ "openrouter/anthropic/claude-sonnet-4.6": 9,
34
+ "openrouter/x-ai/grok-4.20": 9,
35
+ # Max-plan native (OAuth)
36
+ "claude-opus-4-20250514": 9, # Opus 4 (Max plan native)
37
+ "claude-sonnet-4-20250514": 9, # Sonnet 4 (Max plan native)
38
+
39
+ # === 8: strong ===
40
+ "anthropic/claude-opus-4.6": 8,
41
+ "deepseek/deepseek-v3.2": 8,
42
+ "openrouter/deepseek/deepseek-v3.2": 8,
43
+
44
+ # === 7: capable ===
45
+ "anthropic/claude-haiku-4.5": 7,
46
+ "x-ai/grok-4.1-fast": 7,
47
+ "openrouter/anthropic/claude-haiku-4.5": 7,
48
+ "openrouter/x-ai/grok-4.1-fast": 7,
49
+ "claude-haiku-4-5-20251001": 7, # Haiku 4.5 (Max plan native)
50
+ "qwen/qwen3.6-35b-a3b": 7,
51
+ "openrouter/qwen/qwen3.6-35b-a3b": 7,
52
+
53
+ # === 6: mid ===
54
+ "meta-llama/llama-3.3-70b-instruct": 6,
55
+ "qwen/qwen3-next-80b-a3b-instruct": 6,
56
+ "qwen/qwen3-coder": 6,
57
+ "moonshotai/kimi-k2.5": 6,
58
+ "mistral-large-3": 6,
59
+
60
+ # === 5: weak-mid ===
61
+ "nvidia/nemotron-3-super-120b-a12b": 5,
62
+ "z-ai/glm-4.5-air": 5,
63
+
64
+ # === 4: small ===
65
+ "openai/gpt-oss-120b": 4,
66
+ "google/gemma-4-31b-it": 4,
67
+
68
+ # === 3: tiny / free ===
69
+ "openai/gpt-oss-20b": 3,
70
+ "meta-llama/llama-3.3-8b-instruct": 3,
71
+ }
72
+
73
+
74
+ def rank(model: str) -> int:
75
+ """Return rank 1-10, defaulting to 5 for unknown models."""
76
+ if not model:
77
+ return 5
78
+ # Strip :free suffix
79
+ base = model.replace(":free", "").strip("/")
80
+ if base in TIER_RANK:
81
+ return TIER_RANK[base]
82
+ # Try progressively stripping path components
83
+ for prefix in ("openrouter/", ""):
84
+ for candidate in [prefix + base, base.replace(prefix, "")]:
85
+ if candidate in TIER_RANK:
86
+ return TIER_RANK[candidate]
87
+ # Partial match (last-resort — for unknown variants of known families)
88
+ lower = base.lower()
89
+ if "opus-4.7" in lower or "opus-4-7" in lower: return 10
90
+ if "gpt-5.4" in lower and "mini" not in lower and "nano" not in lower: return 10
91
+ if "sonnet-4.6" in lower or "sonnet-4-6" in lower: return 9
92
+ if "opus-4" in lower or "opus_4" in lower: return 8
93
+ if "grok-4.2" in lower: return 9
94
+ if "gemini-3" in lower and "flash" not in lower: return 9
95
+ if "haiku-4" in lower: return 7
96
+ if "deepseek-v3" in lower: return 8
97
+ if "grok-4.1" in lower or "grok-fast" in lower: return 7
98
+ if "qwen3.6" in lower: return 7
99
+ if "llama-3.3-70" in lower: return 6
100
+ if "nemotron" in lower: return 5
101
+ if "glm-4.5" in lower: return 5
102
+ if "gpt-oss-120" in lower: return 4
103
+ if "gemma-4-31" in lower: return 4
104
+ if "gpt-oss-20" in lower: return 3
105
+ return 5
106
+
107
+
108
+ def is_eligible_reviewer(writer_model: str, reviewer_model: str,
109
+ critical: bool = False,
110
+ cross_provider_required: bool = True) -> tuple[bool, str]:
111
+ """Check if reviewer qualifies.
112
+
113
+ Rules:
114
+ 1. rank(reviewer) >= rank(writer) [always]
115
+ 2. rank(reviewer) >= rank(writer) + 1 [when critical]
116
+ 3. reviewer provider != writer provider [when cross_provider_required]
117
+
118
+ Returns (ok, reason).
119
+ """
120
+ wr = rank(writer_model)
121
+ rr = rank(reviewer_model)
122
+ min_rank = wr + 1 if critical else wr
123
+
124
+ if rr < min_rank:
125
+ return False, f"reviewer rank {rr} < required {min_rank} (writer={wr})"
126
+
127
+ if cross_provider_required:
128
+ wp = _provider_family(writer_model)
129
+ rp = _provider_family(reviewer_model)
130
+ if wp == rp and wp != "unknown":
131
+ return False, f"same provider family '{wp}' — need cross-provider"
132
+
133
+ return True, f"ok: rank {rr} >= {min_rank}, cross-provider satisfied"
134
+
135
+
136
+ def _provider_family(model: str) -> str:
137
+ """Group models by maker for cross-provider check."""
138
+ m = model.lower()
139
+ if "claude" in m or "anthropic" in m:
140
+ return "anthropic"
141
+ if "gpt-" in m or "openai" in m or "gpt_" in m:
142
+ return "openai"
143
+ if "gemini" in m or "gemma" in m:
144
+ return "google"
145
+ if "grok" in m or "x-ai" in m:
146
+ return "xai"
147
+ if "deepseek" in m:
148
+ return "deepseek"
149
+ if "qwen" in m:
150
+ return "qwen"
151
+ if "llama" in m or "meta" in m:
152
+ return "meta"
153
+ if "kimi" in m or "moonshot" in m:
154
+ return "moonshot"
155
+ if "mistral" in m:
156
+ return "mistral"
157
+ if "nemotron" in m or "nvidia" in m:
158
+ return "nvidia"
159
+ if "glm" in m or "z-ai" in m:
160
+ return "zai"
161
+ return "unknown"
162
+
163
+
164
+ def pick_reviewer_from(candidates: list[str], writer_model: str,
165
+ critical: bool = False) -> str | None:
166
+ """Pick highest-rank eligible reviewer from a list of available models."""
167
+ scored: list[tuple[int, str]] = []
168
+ for c in candidates:
169
+ ok, _ = is_eligible_reviewer(writer_model, c, critical=critical)
170
+ if ok:
171
+ scored.append((rank(c), c))
172
+ if not scored:
173
+ return None
174
+ scored.sort(key=lambda x: -x[0])
175
+ return scored[0][1]
176
+
177
+
178
+ if __name__ == "__main__":
179
+ import sys
180
+ if len(sys.argv) >= 3:
181
+ w, r = sys.argv[1], sys.argv[2]
182
+ crit = "--critical" in sys.argv
183
+ ok, reason = is_eligible_reviewer(w, r, critical=crit)
184
+ print(f"writer={w} rank={rank(w)}")
185
+ print(f"reviewer={r} rank={rank(r)}")
186
+ print(f"eligible={ok}: {reason}")
187
+ else:
188
+ for m in ["claude-opus-4-20250514", "claude-sonnet-4-20250514",
189
+ "claude-haiku-4-5-20251001", "openai/gpt-5.4",
190
+ "deepseek/deepseek-v3.2", "openai/gpt-oss-120b:free",
191
+ "qwen/qwen3-coder:free", "meta-llama/llama-3.3-70b-instruct:free"]:
192
+ print(f" rank({m}) = {rank(m)} [{_provider_family(m)}]")
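The __main__ block doubles as a CLI for checking the policy by hand. Two illustrative checks (model ids come from the TIER_RANK table above; outcomes follow from the rules in is_eligible_reviewer):

  # rank 9 >= rank 7, but same provider family, so not eligible as a cross-provider reviewer
  python3 ~/.surrogate/bin/lib/tier_rank.py anthropic/claude-haiku-4.5 anthropic/claude-sonnet-4.6

  # critical task: reviewer must out-rank the writer; rank 10 vs 8 passes, providers differ
  python3 ~/.surrogate/bin/lib/tier_rank.py deepseek/deepseek-v3.2 openai/gpt-5.4 --critical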
bin/notify-discord.sh CHANGED
@@ -10,7 +10,7 @@
10
  # Examples:
11
  # notify-discord.sh success "Task done" "p42 completed in 180s"
12
  # notify-discord.sh error "Daemon crashed" "qwen-coder exit 1"
13
- # tail -50 ~/.claude/logs/scrape.log | notify-discord.sh scrape "Scrape report"
14
  set -u
15
  set -a; source "$HOME/.hermes/.env" 2>/dev/null; set +a
16
 
 
10
  # Examples:
11
  # notify-discord.sh success "Task done" "p42 completed in 180s"
12
  # notify-discord.sh error "Daemon crashed" "qwen-coder exit 1"
13
+ # tail -50 ~/.surrogate/logs/scrape.log | notify-discord.sh scrape "Scrape report"
14
  set -u
15
  set -a; source "$HOME/.hermes/.env" 2>/dev/null; set +a
16
 
bin/nvidia-bridge.sh ADDED
@@ -0,0 +1,59 @@
1
+ #!/usr/bin/env bash
2
+ # NVIDIA NIM bridge — OpenAI-compat via integrate.api.nvidia.com
3
+ # Free tier: ~1000 req/day, 50+ models (Llama, DeepSeek, Nemotron, Qwen, etc.)
4
+ set -u
5
+ MODEL="meta/llama-3.3-70b-instruct"
6
+ MAX_TOKENS=2000
7
+ TEMP=0.3
8
+ PROMPT=""
9
+
10
+ while [[ $# -gt 0 ]]; do
11
+ case "$1" in
12
+ --model)
13
+ case "$2" in
14
+ llama|l70) MODEL="meta/llama-3.3-70b-instruct" ;;
15
+ nemotron) MODEL="nvidia/nemotron-4-340b-instruct" ;;
16
+ nemotron-nano) MODEL="nvidia/nemotron-3-nano-9b-v1" ;;
17
+ deepseek|r1) MODEL="deepseek-ai/deepseek-r1" ;;
18
+ qwen|coder) MODEL="qwen/qwen2.5-coder-32b-instruct" ;;
19
+ mistral) MODEL="mistralai/mistral-large-2-instruct" ;;
20
+ *) MODEL="$2" ;;
21
+ esac; shift 2 ;;
22
+ --max-tokens) MAX_TOKENS="$2"; shift 2 ;;
23
+ *) PROMPT="$*"; break ;;
24
+ esac
25
+ done
26
+ [[ -z "$PROMPT" ]] && [[ ! -t 0 ]] && PROMPT=$(cat)
27
+ [[ -z "$PROMPT" ]] && { echo "nvidia-bridge: no prompt" >&2; exit 2; }
28
+
29
+ LOG="$HOME/.surrogate/logs/nvidia-bridge.log"
30
+ mkdir -p "$(dirname "$LOG")"
31
+ set -a; source "$HOME/.hermes/.env"; set +a
32
+ echo "[$(date '+%H:%M:%S')] model=$MODEL len=${#PROMPT}" >> "$LOG"
33
+
34
+ RESPONSE=$(python3 -c "
35
+ import os
36
+ exec(open(os.path.expanduser('~/.surrogate/bin/lib/dns_fallback.py')).read())
37
+ exec(open(os.path.expanduser('~/.surrogate/bin/lib/bridge_retry.py')).read())
38
+ import json, sys
39
+ body = {
40
+ 'model': '$MODEL',
41
+ 'messages': [{'role':'user','content': sys.stdin.read()}],
42
+ 'max_tokens': $MAX_TOKENS, 'temperature': $TEMP,
43
+ 'stream': False,
44
+ }
45
+ try:
46
+ d = request_with_retry(
47
+ 'https://integrate.api.nvidia.com/v1/chat/completions',
48
+ data=json.dumps(body).encode(),
49
+ headers={'Content-Type':'application/json', 'User-Agent':'hermes-agent/1.0', 'Authorization':'Bearer '+os.environ.get('NVIDIA_API_KEY','')},
50
+ timeout=120, max_retries=4, base_delay=3.0, open_seconds=120,
51
+ )
52
+ print(d.get('choices',[{}])[0].get('message',{}).get('content',''))
53
+ except Exception as e:
54
+ print(f'nvidia-bridge error: {e}', file=sys.stderr); sys.exit(1)
55
+ " <<< "$PROMPT")
56
+ RC=$?
57
+ echo "[$(date '+%H:%M:%S')] rc=$RC bytes=${#RESPONSE}" >> "$LOG"
58
+ [[ $RC -ne 0 ]] && exit $RC
59
+ echo "$RESPONSE"
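Usage sketch, assuming NVIDIA_API_KEY is set in ~/.hermes/.env and using the aliases from the case block above:

  # default model (meta/llama-3.3-70b-instruct)
  ~/.surrogate/bin/nvidia-bridge.sh "Explain Kyverno validate vs mutate policies in five bullets"

  # pipe stdin and switch to the DeepSeek-R1 alias with a smaller token budget
  git diff | ~/.surrogate/bin/nvidia-bridge.sh --model deepseek --max-tokens 800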
bin/perf-watchdog.sh CHANGED
@@ -5,15 +5,15 @@
5
  # - load avg 1min (kill if > 10, warn if > 7)
6
  # - memory free pages (warn if < 30k, emergency < 15k)
7
  # - swap I/O rate (emergency if spiking)
8
- # - disk space on ~/.claude/state (warn if < 2GB)
9
  # - scrape process count (cap at 30, kill oldest if exceeded)
10
  #
11
  # Actions:
12
  # - WARN: log + throttle (pause new burst triggers via state file)
13
  # - EMERGENCY: kill all scrape processes, set pause flag for 10 min
14
  set -u
15
- LOG="$HOME/.claude/logs/perf-watchdog.log"
16
- PAUSE_FLAG="$HOME/.claude/state/scrape-paused"
17
  mkdir -p "$(dirname "$LOG")" "$(dirname "$PAUSE_FLAG")"
18
 
19
  # Thresholds
 
5
  # - load avg 1min (kill if > 10, warn if > 7)
6
  # - memory free pages (warn if < 30k, emergency < 15k)
7
  # - swap I/O rate (emergency if spiking)
8
+ # - disk space on ~/.surrogate/state (warn if < 2GB)
9
  # - scrape process count (cap at 30, kill oldest if exceeded)
10
  #
11
  # Actions:
12
  # - WARN: log + throttle (pause new burst triggers via state file)
13
  # - EMERGENCY: kill all scrape processes, set pause flag for 10 min
14
  set -u
15
+ LOG="$HOME/.surrogate/logs/perf-watchdog.log"
16
+ PAUSE_FLAG="$HOME/.surrogate/state/scrape-paused"
17
  mkdir -p "$(dirname "$LOG")" "$(dirname "$PAUSE_FLAG")"
18
 
19
  # Thresholds
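Operationally the pause flag is the main touch point. A minimal check-and-clear sketch using the LOG and PAUSE_FLAG paths defined above:

  # is scraping currently throttled?
  [[ -f ~/.surrogate/state/scrape-paused ]] && echo "scrape paused" || echo "scrape active"
  tail -20 ~/.surrogate/logs/perf-watchdog.log   # recent WARN / EMERGENCY actions
  rm -f ~/.surrogate/state/scrape-paused         # clear manually once load recovers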
bin/push-training-to-hf.sh CHANGED
@@ -6,7 +6,7 @@ set -a; source "$HOME/.hermes/.env" 2>/dev/null; set +a
6
 
7
  SRC="$HOME/.surrogate/training-pairs.jsonl"
8
  OFFSET_FILE="$HOME/.surrogate/.training-push-offset"
9
- LOG="$HOME/.claude/logs/training-push.log"
10
  mkdir -p "$(dirname "$LOG")"
11
 
12
  [[ ! -f "$SRC" ]] && { echo "[$(date +%H:%M:%S)] no source $SRC" | tee -a "$LOG"; exit 0; }
 
6
 
7
  SRC="$HOME/.surrogate/training-pairs.jsonl"
8
  OFFSET_FILE="$HOME/.surrogate/.training-push-offset"
9
+ LOG="$HOME/.surrogate/logs/training-push.log"
10
  mkdir -p "$(dirname "$LOG")"
11
 
12
  [[ ! -f "$SRC" ]] && { echo "[$(date +%H:%M:%S)] no source $SRC" | tee -a "$LOG"; exit 0; }
bin/qwen-coder-daemon.sh CHANGED
@@ -4,7 +4,7 @@
4
  # Pulls priority → invokes qwen-coder-worker.sh with pre-selected priority (env var).
5
  set -u
6
 
7
- LOG="$HOME/.claude/logs/qwen-coder-daemon.log"
8
  mkdir -p "$(dirname "$LOG")"
9
 
10
  # Resolve Redis: Unix socket → TCP fallback. Build a redis-cli arg array reused below.
@@ -45,7 +45,7 @@ while true; do
45
  # can't race with other workers / stale file locks.
46
  START=$(date +%s)
47
  HERMES_PRIO_ID="$PRIO_ID" \
48
- "$HOME/.claude/bin/qwen-coder-worker.sh" 2>&1 | tail -3 >> "$LOG"
49
  DUR=$(( $(date +%s) - START ))
50
  echo "[$(date '+%H:%M:%S')] $PRIO_ID done in ${DUR}s" >> "$LOG"
51
 
 
4
  # Pulls priority → invokes qwen-coder-worker.sh with pre-selected priority (env var).
5
  set -u
6
 
7
+ LOG="$HOME/.surrogate/logs/qwen-coder-daemon.log"
8
  mkdir -p "$(dirname "$LOG")"
9
 
10
  # Resolve Redis: Unix socket → TCP fallback. Build a redis-cli arg array reused below.
 
45
  # can't race with other workers / stale file locks.
46
  START=$(date +%s)
47
  HERMES_PRIO_ID="$PRIO_ID" \
48
+ "$HOME/.surrogate/bin/qwen-coder-worker.sh" 2>&1 | tail -3 >> "$LOG"
49
  DUR=$(( $(date +%s) - START ))
50
  echo "[$(date '+%H:%M:%S')] $PRIO_ID done in ${DUR}s" >> "$LOG"
51
 
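For debugging a single priority outside the loop, the worker honours the same HERMES_PRIO_ID contract the daemon uses (p42 below is an illustrative id, not a real queue entry):

  HERMES_PRIO_ID=p42 ~/.surrogate/bin/qwen-coder-worker.sh
  tail -5 ~/.surrogate/logs/qwen-coder-daemon.log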
bin/qwen-coder-worker.sh CHANGED
@@ -7,7 +7,7 @@
7
  # Philosophy: cheap + fast iteration — reviewer catches bad outputs.
8
  set -u
9
 
10
- LOG="$HOME/.claude/logs/qwen-coder-worker.log"
11
  OUT_DIR="$HOME/.hermes/workspace/qwen-coder"
12
  SHARED="$HOME/.hermes/workspace/swarm-shared"
13
  mkdir -p "$(dirname "$LOG")" "$OUT_DIR"
@@ -58,8 +58,8 @@ MAP_FILE="$SHARED/repo-maps/${PRIO_PROJECT}.md"
58
  # RAG: fetch real code examples from THIS project's actual codebase via FTS
59
  # Grounds the model in real APIs/imports/patterns instead of hallucinating
60
  RAG_EXAMPLES=""
61
- if [[ -x "$HOME/.claude/bin/ask-sqlite.py" ]]; then
62
- RAG_EXAMPLES=$(python3 "$HOME/.claude/bin/ask-sqlite.py" \
63
  "$PRIO_PROJECT $PRIO_TITLE" 2>/dev/null | head -c 2500)
64
  fi
65
 
 
7
  # Philosophy: cheap + fast iteration — reviewer catches bad outputs.
8
  set -u
9
 
10
+ LOG="$HOME/.surrogate/logs/qwen-coder-worker.log"
11
  OUT_DIR="$HOME/.hermes/workspace/qwen-coder"
12
  SHARED="$HOME/.hermes/workspace/swarm-shared"
13
  mkdir -p "$(dirname "$LOG")" "$OUT_DIR"
 
58
  # RAG: fetch real code examples from THIS project's actual codebase via FTS
59
  # Grounds the model in real APIs/imports/patterns instead of hallucinating
60
  RAG_EXAMPLES=""
61
+ if [[ -x "$HOME/.surrogate/bin/ask-sqlite.py" ]]; then
62
+ RAG_EXAMPLES=$(python3 "$HOME/.surrogate/bin/ask-sqlite.py" \
63
  "$PRIO_PROJECT $PRIO_TITLE" 2>/dev/null | head -c 2500)
64
  fi
65
 
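The RAG grounding step can be reproduced standalone to inspect what the worker will inject (query text is illustrative; truncation mirrors the head -c 2500 above):

  python3 ~/.surrogate/bin/ask-sqlite.py "fastapi auth middleware hermes" 2>/dev/null | head -c 2500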
bin/sambanova-bridge.sh ADDED
@@ -0,0 +1,75 @@
1
+ #!/usr/bin/env bash
2
+ # SambaNova Cloud bridge — fast Llama 3.3 70B/405B + DeepSeek-V3 free tier
3
+ # Endpoint: https://api.sambanova.ai/v1 (OpenAI-compat, ~500 tok/s)
4
+ # Key env: SAMBANOVA_API_KEY
5
+ # Usage: sambanova-bridge.sh [--model MODEL] "<prompt>"
6
+ set -u
7
+ # Default: Llama 3.3 70B — best speed (500 tok/s) × quality tradeoff on SambaNova.
8
+ # Full catalog verified 2026-04: DeepSeek-V3.1/V3.1-cb/V3.2, Llama-4-Maverick,
9
+ # gpt-oss-120b, gemma-3-12b-it, MiniMax-M2.5 (service-tier-locked).
10
+ MODEL="Meta-Llama-3.3-70B-Instruct"
11
+ MAX_TOKENS=2000
12
+ TEMP=0.3
13
+ PROMPT=""
14
+
15
+ while [[ $# -gt 0 ]]; do
16
+ case "$1" in
17
+ --model)
18
+ case "$2" in
19
+ fast|small|gemma|gemma3) MODEL="gemma-3-12b-it" ;;
20
+ llama|llama70|70b) MODEL="Meta-Llama-3.3-70B-Instruct" ;;
21
+ llama4|maverick) MODEL="Llama-4-Maverick-17B-128E-Instruct" ;;
22
+ deepseek|deepseek-v3) MODEL="DeepSeek-V3.1" ;;
23
+ deepseek-latest|v32) MODEL="DeepSeek-V3.2" ;;
24
+ deepseek-cb|cb) MODEL="DeepSeek-V3.1-cb" ;;
25
+ gpt-oss|oss|120b) MODEL="gpt-oss-120b" ;;
26
+ *) MODEL="$2" ;;
27
+ esac; shift 2 ;;
28
+ --max-tokens) MAX_TOKENS="$2"; shift 2 ;;
29
+ --temperature) TEMP="$2"; shift 2 ;;
30
+ *) PROMPT="$*"; break ;;
31
+ esac
32
+ done
33
+ [[ -z "$PROMPT" ]] && [[ ! -t 0 ]] && PROMPT=$(cat)
34
+ [[ -z "$PROMPT" ]] && { echo "sambanova-bridge: no prompt" >&2; exit 2; }
35
+
36
+ LOG="$HOME/.surrogate/logs/sambanova-bridge.log"
37
+ mkdir -p "$(dirname "$LOG")"
38
+ set -a; source "$HOME/.hermes/.env" 2>/dev/null || true; set +a
39
+
40
+ if [[ -z "${SAMBANOVA_API_KEY:-}" ]]; then
41
+ echo "sambanova-bridge: missing SAMBANOVA_API_KEY in ~/.hermes/.env" >&2
42
+ exit 3
43
+ fi
44
+
45
+ echo "[$(date '+%H:%M:%S')] model=$MODEL len=${#PROMPT}" >> "$LOG"
46
+
47
+ RESPONSE=$(python3 -c "
48
+ import os
49
+ exec(open(os.path.expanduser('~/.surrogate/bin/lib/dns_fallback.py')).read())
50
+ exec(open(os.path.expanduser('~/.surrogate/bin/lib/bridge_retry.py')).read())
51
+ import json, sys
52
+ body = {
53
+ 'model': '$MODEL',
54
+ 'messages': [{'role':'user','content': sys.stdin.read()}],
55
+ 'max_tokens': $MAX_TOKENS, 'temperature': $TEMP,
56
+ }
57
+ try:
58
+ d = request_with_retry(
59
+ 'https://api.sambanova.ai/v1/chat/completions',
60
+ data=json.dumps(body).encode(),
61
+ headers={
62
+ 'Content-Type':'application/json',
63
+ 'User-Agent':'hermes-agent/1.0',
64
+ 'Authorization':'Bearer '+os.environ.get('SAMBANOVA_API_KEY',''),
65
+ },
66
+ timeout=120, max_retries=4, base_delay=2.0,
67
+ )
68
+ print(d.get('choices',[{}])[0].get('message',{}).get('content',''))
69
+ except Exception as e:
70
+ print(f'sambanova-bridge error: {e}', file=sys.stderr); sys.exit(1)
71
+ " <<< "$PROMPT")
72
+ RC=$?
73
+ echo "[$(date '+%H:%M:%S')] rc=$RC bytes=${#RESPONSE}" >> "$LOG"
74
+ [[ $RC -ne 0 ]] && exit $RC
75
+ echo "$RESPONSE"
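Usage sketch, assuming SAMBANOVA_API_KEY is set in ~/.hermes/.env; aliases map to the models in the case block above:

  ~/.surrogate/bin/sambanova-bridge.sh "Draft a Terraform module layout for an EKS cluster"
  cat spec.md | ~/.surrogate/bin/sambanova-bridge.sh --model deepseek-latest --max-tokens 1500 --temperature 0.2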
bin/scrape-keyword-tuner.sh CHANGED
@@ -11,7 +11,7 @@
11
  set -uo pipefail
12
  set -a; source "$HOME/.hermes/.env" 2>/dev/null; set +a
13
 
14
- LOG="$HOME/.claude/logs/scrape-keyword-tuner.log"
15
  mkdir -p "$(dirname "$LOG")"
16
 
17
  TOKEN="${GITHUB_TOKEN_POOL%%,*}" # first non-empty
@@ -33,7 +33,7 @@ python3 <<PYEOF >> "$LOG" 2>&1
33
  import os, re, json, sqlite3, time, urllib.request, urllib.error, urllib.parse
34
 
35
  TOKEN = "$TOKEN"
36
- DB = os.path.expanduser("~/.claude/state/scrape-ledger.db")
37
 
38
  def github_count(keywords: str) -> int:
39
  """Return total_count from GitHub Search API (or -1 on error)."""
 
11
  set -uo pipefail
12
  set -a; source "$HOME/.hermes/.env" 2>/dev/null; set +a
13
 
14
+ LOG="$HOME/.surrogate/logs/scrape-keyword-tuner.log"
15
  mkdir -p "$(dirname "$LOG")"
16
 
17
  TOKEN="${GITHUB_TOKEN_POOL%%,*}" # first non-empty
 
33
  import os, re, json, sqlite3, time, urllib.request, urllib.error, urllib.parse
34
 
35
  TOKEN = "$TOKEN"
36
+ DB = os.path.expanduser("~/.surrogate/state/scrape-ledger.db")
37
 
38
  def github_count(keywords: str) -> int:
39
  """Return total_count from GitHub Search API (or -1 on error)."""
bin/scrape-ledger-init.sh ADDED
@@ -0,0 +1,123 @@
1
+ #!/usr/bin/env bash
2
+ # Initialize global scrape ledger — single source of truth for "what's been scraped"
3
+ # All scrapers check ledger before scraping + write after.
4
+ # DB: ~/.surrogate/state/scrape-ledger.db (SQLite WAL for concurrent safety)
5
+ set -u
6
+ DB="$HOME/.surrogate/state/scrape-ledger.db"
7
+ mkdir -p "$(dirname "$DB")"
8
+
9
+ sqlite3 "$DB" <<'SQL'
10
+ PRAGMA journal_mode=WAL;
11
+ PRAGMA synchronous=NORMAL;
12
+
13
+ CREATE TABLE IF NOT EXISTS scraped (
14
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
15
+ source TEXT NOT NULL, -- 'github', 'rss', 'stackoverflow', 'fs', 'crawl4ai'
16
+ identifier TEXT NOT NULL, -- 'owner/repo' or URL or file path hash
17
+ domain TEXT, -- 'security', 'devops', 'ai-ml', 'frontend', etc.
18
+ subdomain TEXT, -- 'cve', 'kyverno', 'observability', etc.
19
+ language TEXT, -- 'python', 'go', 'terraform'
20
+ stars INTEGER DEFAULT 0,
21
+ scraped_at TEXT NOT NULL,
22
+ pairs_written INTEGER DEFAULT 0,
23
+ status TEXT DEFAULT 'ok', -- 'ok', 'err', 'skipped', 'partial'
24
+ notes TEXT
25
+ );
26
+
27
+ CREATE UNIQUE INDEX IF NOT EXISTS idx_scraped_src_id ON scraped(source, identifier);
28
+ CREATE INDEX IF NOT EXISTS idx_scraped_domain ON scraped(domain);
29
+ CREATE INDEX IF NOT EXISTS idx_scraped_ts ON scraped(scraped_at);
30
+
31
+ -- Domain taxonomy — what every enterprise software company deals with
32
+ CREATE TABLE IF NOT EXISTS domain_taxonomy (
33
+ domain TEXT PRIMARY KEY,
34
+ subdomain TEXT,
35
+ search_keywords TEXT,
36
+ priority INTEGER DEFAULT 5, -- 1=critical, 10=nice-to-have
37
+ target_repos INTEGER DEFAULT 100
38
+ );
39
+
40
+ -- Seed taxonomy
41
+ INSERT OR IGNORE INTO domain_taxonomy (domain, subdomain, search_keywords, priority, target_repos) VALUES
42
+ -- CODING (per language)
43
+ ('coding','python-framework','fastapi django flask poetry uv ruff mypy pydantic',1,150),
44
+ ('coding','python-async','asyncio aiohttp httpx anyio trio',1,80),
45
+ ('coding','typescript-framework','nextjs remix astro svelte solid react vue nuxt',1,150),
46
+ ('coding','typescript-tooling','vite tsup esbuild turbopack biome',2,80),
47
+ ('coding','go-ecosystem','gin echo fiber chi gorilla cobra viper',1,120),
48
+ ('coding','rust-ecosystem','tokio axum actix warp rocket serde clap',1,100),
49
+ ('coding','java-kotlin','spring boot ktor micronaut quarkus',2,80),
50
+ ('coding','mobile-native','swiftui jetpack compose react-native flutter',2,100),
51
+ -- SECURITY
52
+ ('security','appsec','owasp top10 cwe sast dast semgrep bandit eslint-security',1,120),
53
+ ('security','cloudsec','prowler scoutsuite cloudcustodian checkov tfsec iam-cli',1,120),
54
+ ('security','container-sec','trivy grype syft kyverno opa falco tetragon',1,100),
55
+ ('security','supply-chain','cosign sigstore slsa sbom cyclonedx in-toto',1,80),
56
+ ('security','secrets','vault sops age gitleaks trufflehog detect-secrets',1,60),
57
+ ('security','identity','keycloak authentik ory hydra dex oidc-provider',2,60),
58
+ ('security','detection','sigma mitre-attack falco-rules wazuh yara sentinelone',1,80),
59
+ ('security','offensive','metasploit nuclei gobuster ffuf burp-extensions',3,40),
60
+ -- OPS / DEVOPS / SRE
61
+ ('ops','devops-ci','github-actions gitlab-ci jenkins dagger buildkit',1,100),
62
+ ('ops','iac','terraform pulumi cdk cloudformation ansible',1,150),
63
+ ('ops','kubernetes','k8s helm kustomize argocd flux crossplane istio linkerd',1,200),
64
+ ('ops','sre','sre-book postmortem slo burn-rate chaos-engineering',1,80),
65
+ ('ops','chaos','chaos-mesh litmus gremlin chaos-toolkit',2,40),
66
+ ('ops','config-mgmt','ansible chef puppet salt',3,40),
67
+ ('observability','metrics','prometheus thanos mimir victoriametrics alertmanager',1,100),
68
+ ('observability','logs','loki elasticsearch opensearch fluentbit vector',1,80),
69
+ ('observability','traces','tempo jaeger zipkin skywalking honeycomb',1,80),
70
+ ('observability','apm','datadog newrelic dynatrace appdynamics instana',2,40),
71
+ ('observability','profiling','pyroscope parca gprofiler py-spy flamegraph',2,40),
72
+ ('observability','otel','opentelemetry-collector otel-sdk semantic-conventions',1,60),
73
+ ('observability','ebpf','cilium tetragon pixie falco inspektor-gadget',1,60),
74
+ -- CLOUD
75
+ ('cloud','aws','aws-cdk aws-samples aws-solutions aws-copilot sam',1,200),
76
+ ('cloud','gcp','gcp-samples terraform-google anthos',1,100),
77
+ ('cloud','azure','azure-samples bicep terraform-azurerm',1,100),
78
+ ('cloud','multicloud','crossplane cluster-api karpenter external-dns',2,60),
79
+ ('cloud','serverless','sam sst cdk serverless-framework workers wrangler',1,100),
80
+ ('finops','finops','kubecost opencost cloudhealth crane infracost',1,60),
81
+ -- AI / ML / AGENTS
82
+ ('ai','llm-serving','vllm tgi ollama llama.cpp exllama sglang',1,100),
83
+ ('ai','llm-training','unsloth axolotl peft trl ms-swift torchtune',1,100),
84
+ ('ai','agents','langgraph crewai autogen mcp-server dspy haystack',1,120),
85
+ ('ai','rag','llamaindex langchain colbert chroma qdrant weaviate',1,100),
86
+ ('ai','ml-frameworks','pytorch-lightning jax equinox flax transformers diffusers',2,80),
87
+ ('ai','ml-ops','mlflow wandb comet kedro zenml',2,60),
88
+ ('ai','eval','lm-evaluation-harness deepeval ragas opik',2,40),
89
+ -- DATA
90
+ ('data','databases','postgres mysql pgvector cockroachdb tidb',1,100),
91
+ ('data','streaming','kafka nats redpanda pulsar flink',1,80),
92
+ ('data','warehouses','clickhouse duckdb snowflake trino presto starrocks',1,80),
93
+ ('data','orchestration','airflow prefect dagster temporal',1,80),
94
+ ('data','formats','parquet iceberg delta-lake hudi avro',2,40),
95
+ ('data','etl','dbt meltano singer airbyte',2,40),
96
+ -- FRONTEND / UX
97
+ ('frontend','components','shadcn-ui radix headlessui mantine chakra',2,80),
98
+ ('frontend','state','zustand jotai redux-toolkit tanstack-query swr',2,60),
99
+ ('frontend','styling','tailwindcss unocss vanilla-extract stitches',2,60),
100
+ ('frontend','animations','framer-motion auto-animate gsap lottie',3,40),
101
+ -- BACKEND
102
+ ('backend','graphql','apollo relay urql hasura postgraphile',2,60),
103
+ ('backend','grpc','grpc-web buf connect-go',2,40),
104
+ ('backend','queues','bullmq sidekiq celery rq',2,60),
105
+ -- ARCHITECTURE
106
+ ('architecture','patterns','hexagonal ddd cqrs event-sourcing saga outbox',1,60),
107
+ ('architecture','messaging','cloudevents asyncapi schema-registry',2,40),
108
+ -- QUALITY / TESTING
109
+ ('quality','unit-test','pytest vitest jest junit5 testify',2,60),
110
+ ('quality','e2e','playwright cypress puppeteer selenium',2,60),
111
+ ('quality','load-test','k6 locust gatling vegeta',2,40),
112
+ ('quality','contract','pact dredd schemathesis',3,30),
113
+ -- COMPLIANCE
114
+ ('compliance','audit','pdpa gdpr soc2 iso27001 pci-dss hipaa',1,60),
115
+ ('compliance','policy-as-code','opa kyverno gatekeeper conftest',1,60),
116
+ -- PRODUCT / BUSINESS
117
+ ('product','analytics','posthog plausible amplitude mixpanel',2,40),
118
+ ('product','feature-flags','unleash flagsmith growthbook launchdarkly',2,40);
119
+
120
+ SELECT 'ledger initialized: ' || COUNT(*) || ' domains' FROM domain_taxonomy;
121
+ SQL
122
+
123
+ echo "✅ Ledger at $DB"
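The check-before / write-after contract scrapers are expected to follow looks roughly like this (the repo identifier, stars, and pair count are illustrative; the unique index on (source, identifier) makes the insert idempotent):

  DB=~/.surrogate/state/scrape-ledger.db
  if [[ "$(sqlite3 "$DB" "SELECT COUNT(*) FROM scraped WHERE source='github' AND identifier='kyverno/kyverno'")" -gt 0 ]]; then
    echo "already scraped"; exit 0
  fi
  # ... scrape and write training pairs ...
  sqlite3 "$DB" "INSERT OR IGNORE INTO scraped
    (source, identifier, domain, subdomain, language, stars, scraped_at, pairs_written, status)
    VALUES ('github','kyverno/kyverno','security','container-sec','go',6000,datetime('now'),42,'ok');"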
bin/skill-synthesis-daemon.sh CHANGED
@@ -9,7 +9,7 @@ set -uo pipefail
9
  set -a; source "$HOME/.hermes/.env" 2>/dev/null; set +a
10
 
11
  SKILLS_DIR="$HOME/.surrogate/skills"
12
- LOG="$HOME/.claude/logs/skill-synthesis.log"
13
  PAIRS="$HOME/.surrogate/training-pairs.jsonl"
14
  mkdir -p "$SKILLS_DIR" "$(dirname "$LOG")"
15
 
 
9
  set -a; source "$HOME/.hermes/.env" 2>/dev/null; set +a
10
 
11
  SKILLS_DIR="$HOME/.surrogate/skills"
12
+ LOG="$HOME/.surrogate/logs/skill-synthesis.log"
13
  PAIRS="$HOME/.surrogate/training-pairs.jsonl"
14
  mkdir -p "$SKILLS_DIR" "$(dirname "$LOG")"
15
 
bin/surrogate CHANGED
@@ -29,7 +29,7 @@ init_surrogate_home() {
29
  },
30
  "agents": ["architect","dev","qa","ops","reviewer"],
31
  "memory": {
32
- "episodesFile": "~/.claude/state/surrogate-memory/episodes.jsonl",
33
  "projectFiles": "~/.surrogate/projects"
34
  }
35
  }
@@ -116,7 +116,7 @@ while [[ $# -gt 0 ]]; do
116
  init) MODE="init-project"; shift ;;
117
  plan)
118
  # surrogate plan set <file> | show | clear
119
- bash ~/.claude/bin/surrogate-daemon.sh plan "$@"
120
  exit 0
121
  ;;
122
  .) shift ;;
@@ -224,7 +224,7 @@ GEMINI = os.environ.get('GEMINI_API_KEY','')
224
  GEMINI2 = os.environ.get('GEMINI_API_KEY_2','')
225
  GH_POOL = [t.strip() for t in os.environ.get('GITHUB_TOKEN_POOL','').split(',') if t.strip()]
226
 
227
- MEM_DIR = Path(os.path.expanduser('~/.claude/state/surrogate-memory'))
228
  MEM_DIR.mkdir(parents=True, exist_ok=True)
229
  EPISODES = MEM_DIR / 'episodes.jsonl'
230
 
@@ -284,7 +284,7 @@ def tool_grep(pattern, path=None, glob='*'):
284
 
285
  def tool_rag_query(query, limit=5):
286
  try:
287
- conn = sqlite3.connect(os.path.expanduser('~/.claude/index.db'))
288
  kw = ' '.join(w for w in re.sub(r'[^a-zA-Z0-9ก-๙ ]',' ',query.lower()).split() if len(w)>2)[:200]
289
  rows = conn.execute("SELECT d.source, d.path, substr(d.response,1,500) FROM docs_fts f JOIN docs d ON d.id=f.rowid WHERE f.docs_fts MATCH ? ORDER BY bm25(docs_fts) LIMIT ?", (kw,limit)).fetchall()
290
  conn.close()
@@ -476,7 +476,7 @@ ${B}Configuration${R}:
476
  ${CY}/cwd${R} <path> change working directory
477
 
478
  ${B}Diagnostics${R}:
479
- ${CY}/memory${R} show ~/.claude/state/surrogate-memory/
480
  ${CY}/cost${R} OpenRouter usage today
481
  ${CY}/cost-all${R} all provider usage breakdown
482
  ${CY}/health${R} check HF endpoint + local CLI status
@@ -605,7 +605,7 @@ repl() {
605
  *) echo "${GY}valid: plan | auto | yolo | default | acceptEdits${R}" ;;
606
  esac
607
  ;;
608
- /memory) ls -lh ~/.claude/state/surrogate-memory/ 2>&1 | head -10 ;;
609
  /undo)
610
  # Restore last checkpoint (git stash if uncommitted changes from last task)
611
  if git -C "$(pwd)" rev-parse --git-dir &>/dev/null; then
@@ -958,7 +958,7 @@ PYEOF
958
  )
959
  [[ -z "$NEXT_TASK" ]] && { echo "${GR}✅ Plan complete — all tasks done!${R}"; break; }
960
  echo "${BCY}${B}▸ Next task:${R} $NEXT_TASK"
961
- bash ~/.claude/bin/surrogate-orchestrate.sh "$NEXT_TASK"
962
  # Mark done in plan
963
  /usr/bin/python3 <<PYEOF
964
  from pathlib import Path
@@ -984,11 +984,11 @@ PYEOF
984
  if ! IFS= read -r line; then echo ""; break; fi
985
  [[ -z "$line" ]] && continue
986
  [[ "$line" == "/exit" || "$line" == "exit" ]] && break
987
- bash ~/.claude/bin/surrogate-orchestrate.sh "$line"
988
  echo ""
989
  done
990
  else
991
- bash ~/.claude/bin/surrogate-orchestrate.sh "$task"
992
  fi
993
  }
994
 
@@ -999,13 +999,13 @@ plan_mode() {
999
  echo -en "${B}${YE}▶ plan >${R} "
1000
  read -r task
1001
  fi
1002
- bash ~/.claude/bin/surrogate-orchestrate.sh --mode plan "$task"
1003
  }
1004
 
1005
  # ═══ Monitor mode (watch cloud/logs, auto-fix) ═══
1006
  monitor_mode() {
1007
  echo "${B}${MA}▶ MONITOR MODE${R}"
1008
- echo "${D} Watching ~/.claude/logs/, ~/.hermes/workspace/healer/, system load.${R}"
1009
  echo "${D} Ctrl+C to stop.${R}"
1010
  echo ""
1011
  ITER=0
@@ -1027,15 +1027,15 @@ monitor_mode() {
1027
  ls -t ~/.hermes/workspace/healer/*.md 2>/dev/null | head -3 | awk '{print " " $0}' | xargs -I{} basename {} 2>/dev/null | sed 's/^/ /'
1028
  # Training + graph
1029
  PAIRS=$(wc -l ~/axentx/surrogate/data/training-jsonl/*.jsonl 2>/dev/null | tail -1 | awk '{print $1}')
1030
- REPOS=$(sqlite3 ~/.claude/state/scrape-ledger.db "SELECT COUNT(*) FROM scraped" 2>/dev/null)
1031
  echo "${B}data${R} pairs=$PAIRS repos=$REPOS"
1032
  # Recent errors in logs (auto-heal trigger)
1033
- ERR_COUNT=$(tail -200 ~/.claude/logs/*.log 2>/dev/null | grep -cE "ERROR|Fatal|CRITICAL|429|403|500" || echo 0)
1034
  echo "${B}errors${R} last 200 log lines: $ERR_COUNT"
1035
  # If critical → spawn agent to investigate
1036
  if [[ $ERR_COUNT -gt 50 ]]; then
1037
  echo "${RE}⚠ elevated errors — dispatching investigator agent${R}"
1038
- (run_agent "เช็ค ~/.claude/logs/ หา pattern error ที่ recur บ่อย และเสนอ fix list (ห้ามแก้เอง รายงานอย่างเดียว)" 2>&1 | /usr/bin/head -20) &
1039
  fi
1040
  sleep 30
1041
  done
@@ -1045,9 +1045,9 @@ monitor_mode() {
1045
  show_status() {
1046
  banner
1047
  echo ""
1048
- REPOS=$(sqlite3 ~/.claude/state/scrape-ledger.db "SELECT COUNT(*) FROM scraped" 2>/dev/null || echo "?")
1049
  PAIRS=$(wc -l ~/axentx/surrogate/data/training-jsonl/*.jsonl 2>/dev/null | tail -1 | awk '{print $1}' || echo "?")
1050
- EP=$(wc -l ~/.claude/state/surrogate-memory/episodes.jsonl 2>/dev/null | awk '{print $1}' || echo "0")
1051
  PLAN_FILE="$SURROGATE_HOME/active-plan.md"
1052
  echo "${B}▸ Session${R}"
1053
  echo " cwd: ${GR}$(pwd)${R}"
@@ -1069,8 +1069,8 @@ show_status() {
1069
  show_agents() {
1070
  banner
1071
  echo ""
1072
- echo "${B}▸ Available agents (~/.claude/agents/)${R}"
1073
- ls ~/.claude/agents/*.md 2>/dev/null | /usr/bin/sed 's|.*/||;s|.md$||' | sed 's/^/ /'
1074
  }
1075
 
1076
  # ═══ Dispatch ═══
@@ -1086,7 +1086,7 @@ case "$MODE" in
1086
  if [[ -n "$PROMPT" ]]; then plan_mode "$PROMPT"
1087
  else
1088
  # No task — show plan status
1089
- bash ~/.claude/bin/surrogate-daemon.sh plan show
1090
  fi
1091
  ;;
1092
  print)
 
29
  },
30
  "agents": ["architect","dev","qa","ops","reviewer"],
31
  "memory": {
32
+ "episodesFile": "~/.surrogate/state/episodes.jsonl",
33
  "projectFiles": "~/.surrogate/projects"
34
  }
35
  }
 
116
  init) MODE="init-project"; shift ;;
117
  plan)
118
  # surrogate plan set <file> | show | clear
119
+ bash ~/.surrogate/bin/surrogate-daemon.sh plan "$@"
120
  exit 0
121
  ;;
122
  .) shift ;;
 
224
  GEMINI2 = os.environ.get('GEMINI_API_KEY_2','')
225
  GH_POOL = [t.strip() for t in os.environ.get('GITHUB_TOKEN_POOL','').split(',') if t.strip()]
226
 
227
+ MEM_DIR = Path(os.path.expanduser('~/.surrogate/state'))
228
  MEM_DIR.mkdir(parents=True, exist_ok=True)
229
  EPISODES = MEM_DIR / 'episodes.jsonl'
230
 
 
284
 
285
  def tool_rag_query(query, limit=5):
286
  try:
287
+ conn = sqlite3.connect(os.path.expanduser('~/.surrogate/index.db'))
288
  kw = ' '.join(w for w in re.sub(r'[^a-zA-Z0-9ก-๙ ]',' ',query.lower()).split() if len(w)>2)[:200]
289
  rows = conn.execute("SELECT d.source, d.path, substr(d.response,1,500) FROM docs_fts f JOIN docs d ON d.id=f.rowid WHERE f.docs_fts MATCH ? ORDER BY bm25(docs_fts) LIMIT ?", (kw,limit)).fetchall()
290
  conn.close()
 
476
  ${CY}/cwd${R} <path> change working directory
477
 
478
  ${B}Diagnostics${R}:
479
+ ${CY}/memory${R} show ~/.surrogate/state/
480
  ${CY}/cost${R} OpenRouter usage today
481
  ${CY}/cost-all${R} all provider usage breakdown
482
  ${CY}/health${R} check HF endpoint + local CLI status
 
605
  *) echo "${GY}valid: plan | auto | yolo | default | acceptEdits${R}" ;;
606
  esac
607
  ;;
608
+ /memory) ls -lh ~/.surrogate/state/ 2>&1 | head -10 ;;
609
  /undo)
610
  # Restore last checkpoint (git stash if uncommitted changes from last task)
611
  if git -C "$(pwd)" rev-parse --git-dir &>/dev/null; then
 
958
  )
959
  [[ -z "$NEXT_TASK" ]] && { echo "${GR}✅ Plan complete — all tasks done!${R}"; break; }
960
  echo "${BCY}${B}▸ Next task:${R} $NEXT_TASK"
961
+ bash ~/.surrogate/bin/surrogate-orchestrate.sh "$NEXT_TASK"
962
  # Mark done in plan
963
  /usr/bin/python3 <<PYEOF
964
  from pathlib import Path
 
984
  if ! IFS= read -r line; then echo ""; break; fi
985
  [[ -z "$line" ]] && continue
986
  [[ "$line" == "/exit" || "$line" == "exit" ]] && break
987
+ bash ~/.surrogate/bin/surrogate-orchestrate.sh "$line"
988
  echo ""
989
  done
990
  else
991
+ bash ~/.surrogate/bin/surrogate-orchestrate.sh "$task"
992
  fi
993
  }
994
 
 
999
  echo -en "${B}${YE}▶ plan >${R} "
1000
  read -r task
1001
  fi
1002
+ bash ~/.surrogate/bin/surrogate-orchestrate.sh --mode plan "$task"
1003
  }
1004
 
1005
  # ═══ Monitor mode (watch cloud/logs, auto-fix) ═══
1006
  monitor_mode() {
1007
  echo "${B}${MA}▶ MONITOR MODE${R}"
1008
+ echo "${D} Watching ~/.surrogate/logs/, ~/.hermes/workspace/healer/, system load.${R}"
1009
  echo "${D} Ctrl+C to stop.${R}"
1010
  echo ""
1011
  ITER=0
 
1027
  ls -t ~/.hermes/workspace/healer/*.md 2>/dev/null | head -3 | awk '{print " " $0}' | xargs -I{} basename {} 2>/dev/null | sed 's/^/ /'
1028
  # Training + graph
1029
  PAIRS=$(wc -l ~/axentx/surrogate/data/training-jsonl/*.jsonl 2>/dev/null | tail -1 | awk '{print $1}')
1030
+ REPOS=$(sqlite3 ~/.surrogate/state/scrape-ledger.db "SELECT COUNT(*) FROM scraped" 2>/dev/null)
1031
  echo "${B}data${R} pairs=$PAIRS repos=$REPOS"
1032
  # Recent errors in logs (auto-heal trigger)
1033
+ ERR_COUNT=$(tail -200 ~/.surrogate/logs/*.log 2>/dev/null | grep -cE "ERROR|Fatal|CRITICAL|429|403|500" || echo 0)
1034
  echo "${B}errors${R} last 200 log lines: $ERR_COUNT"
1035
  # If critical → spawn agent to investigate
1036
  if [[ $ERR_COUNT -gt 50 ]]; then
1037
  echo "${RE}⚠ elevated errors — dispatching investigator agent${R}"
1038
+ (run_agent "เช็ค ~/.surrogate/logs/ หา pattern error ที่ recur บ่อย และเสนอ fix list (ห้ามแก้เอง รายงานอย่างเดียว)" 2>&1 | /usr/bin/head -20) &
1039
  fi
1040
  sleep 30
1041
  done
 
1045
  show_status() {
1046
  banner
1047
  echo ""
1048
+ REPOS=$(sqlite3 ~/.surrogate/state/scrape-ledger.db "SELECT COUNT(*) FROM scraped" 2>/dev/null || echo "?")
1049
  PAIRS=$(wc -l ~/axentx/surrogate/data/training-jsonl/*.jsonl 2>/dev/null | tail -1 | awk '{print $1}' || echo "?")
1050
+ EP=$(wc -l ~/.surrogate/state/episodes.jsonl 2>/dev/null | awk '{print $1}' || echo "0")
1051
  PLAN_FILE="$SURROGATE_HOME/active-plan.md"
1052
  echo "${B}▸ Session${R}"
1053
  echo " cwd: ${GR}$(pwd)${R}"
 
1069
  show_agents() {
1070
  banner
1071
  echo ""
1072
+ echo "${B}▸ Available agents (~/.surrogate/agents/)${R}"
1073
+ ls ~/.surrogate/agents/*.md 2>/dev/null | /usr/bin/sed 's|.*/||;s|.md$||' | sed 's/^/ /'
1074
  }
1075
 
1076
  # ═══ Dispatch ═══
 
1086
  if [[ -n "$PROMPT" ]]; then plan_mode "$PROMPT"
1087
  else
1088
  # No task — show plan status
1089
+ bash ~/.surrogate/bin/surrogate-daemon.sh plan show
1090
  fi
1091
  ;;
1092
  print)
bin/surrogate-agent.sh CHANGED
@@ -33,7 +33,7 @@ while [[ $# -gt 0 ]]; do
33
  done
34
  [[ -z "$TASK" ]] && { echo "usage: $0 [--max-steps N] [--model M] <task>" >&2; exit 2; }
35
 
36
- MEM_DIR="$HOME/.claude/state/surrogate-memory"
37
  mkdir -p "$MEM_DIR"
38
 
39
  export AGENT_TASK="$TASK"
@@ -49,7 +49,7 @@ TASK = os.environ['AGENT_TASK']
49
  MAX_STEPS = int(os.environ['AGENT_MAX_STEPS'])
50
  MODEL_OVERRIDE = os.environ.get('AGENT_MODEL_OVERRIDE', '')
51
  OPENROUTER = os.environ.get('OPENROUTER_API_KEY', '')
52
- MEM_DIR = Path(os.path.expanduser('~/.claude/state/surrogate-memory'))
53
  EPISODES = MEM_DIR / 'episodes.jsonl'
54
  PATTERNS = MEM_DIR / 'patterns.jsonl'
55
  SYS_PROMPT = ''
@@ -148,7 +148,7 @@ def tool_rag_query(query, limit=5, source_filter=None):
148
  import subprocess as _sp
149
  try:
150
  # 1. BM25 via SQLite FTS
151
- conn = sqlite3.connect(os.path.expanduser('~/.claude/index.db'))
152
  kw = ' '.join(w for w in re.sub(r'[^a-zA-Z0-9ก-๙ ]', ' ', query.lower()).split() if len(w) > 2)[:200]
153
  q = "SELECT d.source, d.path, substr(d.response, 1, 500), d.id FROM docs_fts f JOIN docs d ON d.id=f.rowid WHERE f.docs_fts MATCH ?"
154
  params = [kw]
@@ -166,9 +166,9 @@ def tool_rag_query(query, limit=5, source_filter=None):
166
  dense_docs = []
167
  if len(query) > 10:
168
  try:
169
- cmd = f"""~/.claude/state/crawler-venv/bin/python -c "
170
  import chromadb, json, sys
171
- client = chromadb.PersistentClient(path='/Users/Ashira/.claude/code-vector-db')
172
  cols = client.list_collections()
173
  if cols:
174
  r = cols[0].query(query_texts=['{query[:200].replace(chr(39),chr(92)+chr(39))}'], n_results={max(limit*3,20)})
@@ -206,7 +206,7 @@ def tool_rag_code(query, limit=5):
206
  """Query code knowledge — routed through SQLite FTS (no Chroma load, crash-safe).
207
  Searches `code` + `code-vector` + `code-deep:*` sources in index.db via BM25."""
208
  try:
209
- conn = sqlite3.connect(os.path.expanduser('~/.claude/index.db'))
210
  kw = ' '.join(w for w in re.sub(r'[^a-zA-Z0-9ก-๙ ]', ' ', query.lower()).split() if len(w) > 2)[:200]
211
  rows = conn.execute("""
212
  SELECT d.source, d.path, substr(d.response, 1, 500)
@@ -222,7 +222,7 @@ def tool_rag_code(query, limit=5):
222
 
223
  def tool_web_fetch(url, timeout=45):
224
  try:
225
- cmd = f"""$HOME/.claude/state/crawler-venv/bin/python -c "
226
  import asyncio
227
  from crawl4ai import AsyncWebCrawler, BrowserConfig, CrawlerRunConfig
228
  async def f():
@@ -254,7 +254,7 @@ def tool_task(prompt, max_steps=5):
254
  sub_id = uuid.uuid4().hex[:8]
255
  print(f" ↳ [sub-agent {sub_id}] spawning: {prompt[:80]}", flush=True)
256
  try:
257
- cmd = ['bash', os.path.expanduser('~/.claude/bin/surrogate-agent.sh'),
258
  '--max-steps', str(max_steps), prompt]
259
  r = subprocess.run(cmd, capture_output=True, text=True, timeout=600)
260
  return {'sub_id': sub_id, 'output': r.stdout[-4000:], 'rc': r.returncode}
@@ -274,7 +274,7 @@ def tool_orchestrate(subtasks, pattern='parallel', max_steps=5):
274
  def run_one(prompt):
275
  try:
276
  r = subprocess.run(
277
- ['bash', os.path.expanduser('~/.claude/bin/surrogate-agent.sh'),
278
  '--max-steps', str(max_steps), prompt],
279
  capture_output=True, text=True, timeout=600
280
  )
@@ -426,7 +426,7 @@ TOOLS = {
426
  def check_budget():
427
  """Return True if under daily budget ($2/day default). Caller aborts if False."""
428
  import time as _t
429
- cache = Path(os.path.expanduser('~/.claude/state/openrouter-budget-cache.json'))
430
  # Cache balance check for 5 min (reduce API calls)
431
  try:
432
  if cache.exists() and _t.time() - cache.stat().st_mtime < 300:
@@ -439,7 +439,7 @@ def check_budget():
439
  cache.parent.mkdir(parents=True, exist_ok=True)
440
  cache.write_text(json.dumps({'usage': d.get('usage',0), 'ts': _t.time()}))
441
  # Check today's marker
442
- today_f = Path(os.path.expanduser('~/.claude/state/openrouter-today-start.txt'))
443
  today_str = datetime.now().strftime('%Y-%m-%d')
444
  if not today_f.exists() or today_f.read_text().split(':')[0] != today_str:
445
  today_f.parent.mkdir(parents=True, exist_ok=True)
 
33
  done
34
  [[ -z "$TASK" ]] && { echo "usage: $0 [--max-steps N] [--model M] <task>" >&2; exit 2; }
35
 
36
+ MEM_DIR="$HOME/.surrogate/state/surrogate-memory"
37
  mkdir -p "$MEM_DIR"
38
 
39
  export AGENT_TASK="$TASK"
 
49
  MAX_STEPS = int(os.environ['AGENT_MAX_STEPS'])
50
  MODEL_OVERRIDE = os.environ.get('AGENT_MODEL_OVERRIDE', '')
51
  OPENROUTER = os.environ.get('OPENROUTER_API_KEY', '')
52
+ MEM_DIR = Path(os.path.expanduser('~/.surrogate/state/surrogate-memory'))
53
  EPISODES = MEM_DIR / 'episodes.jsonl'
54
  PATTERNS = MEM_DIR / 'patterns.jsonl'
55
  SYS_PROMPT = ''
 
148
  import subprocess as _sp
149
  try:
150
  # 1. BM25 via SQLite FTS
151
+ conn = sqlite3.connect(os.path.expanduser('~/.surrogate/index.db'))
152
  kw = ' '.join(w for w in re.sub(r'[^a-zA-Z0-9ก-๙ ]', ' ', query.lower()).split() if len(w) > 2)[:200]
153
  q = "SELECT d.source, d.path, substr(d.response, 1, 500), d.id FROM docs_fts f JOIN docs d ON d.id=f.rowid WHERE f.docs_fts MATCH ?"
154
  params = [kw]
 
166
  dense_docs = []
167
  if len(query) > 10:
168
  try:
169
+ cmd = f"""~/.surrogate/state/crawler-venv/bin/python -c "
170
  import chromadb, json, sys
171
+ client = chromadb.PersistentClient(path='$HOME/.surrogate/code-vector-db')
172
  cols = client.list_collections()
173
  if cols:
174
  r = cols[0].query(query_texts=['{query[:200].replace(chr(39),chr(92)+chr(39))}'], n_results={max(limit*3,20)})
 
206
  """Query code knowledge — routed through SQLite FTS (no Chroma load, crash-safe).
207
  Searches `code` + `code-vector` + `code-deep:*` sources in index.db via BM25."""
208
  try:
209
+ conn = sqlite3.connect(os.path.expanduser('~/.surrogate/index.db'))
210
  kw = ' '.join(w for w in re.sub(r'[^a-zA-Z0-9ก-๙ ]', ' ', query.lower()).split() if len(w) > 2)[:200]
211
  rows = conn.execute("""
212
  SELECT d.source, d.path, substr(d.response, 1, 500)
 
222
 
223
  def tool_web_fetch(url, timeout=45):
224
  try:
225
+ cmd = f"""$HOME/.surrogate/state/crawler-venv/bin/python -c "
226
  import asyncio
227
  from crawl4ai import AsyncWebCrawler, BrowserConfig, CrawlerRunConfig
228
  async def f():
 
254
  sub_id = uuid.uuid4().hex[:8]
255
  print(f" ↳ [sub-agent {sub_id}] spawning: {prompt[:80]}", flush=True)
256
  try:
257
+ cmd = ['bash', os.path.expanduser('~/.surrogate/bin/surrogate-agent.sh'),
258
  '--max-steps', str(max_steps), prompt]
259
  r = subprocess.run(cmd, capture_output=True, text=True, timeout=600)
260
  return {'sub_id': sub_id, 'output': r.stdout[-4000:], 'rc': r.returncode}
 
274
  def run_one(prompt):
275
  try:
276
  r = subprocess.run(
277
+ ['bash', os.path.expanduser('~/.surrogate/bin/surrogate-agent.sh'),
278
  '--max-steps', str(max_steps), prompt],
279
  capture_output=True, text=True, timeout=600
280
  )
 
426
  def check_budget():
427
  """Return True if under daily budget ($2/day default). Caller aborts if False."""
428
  import time as _t
429
+ cache = Path(os.path.expanduser('~/.surrogate/state/openrouter-budget-cache.json'))
430
  # Cache balance check for 5 min (reduce API calls)
431
  try:
432
  if cache.exists() and _t.time() - cache.stat().st_mtime < 300:
 
439
  cache.parent.mkdir(parents=True, exist_ok=True)
440
  cache.write_text(json.dumps({'usage': d.get('usage',0), 'ts': _t.time()}))
441
  # Check today's marker
442
+ today_f = Path(os.path.expanduser('~/.surrogate/state/openrouter-today-start.txt'))
443
  today_str = datetime.now().strftime('%Y-%m-%d')
444
  if not today_f.exists() or today_f.read_text().split(':')[0] != today_str:
445
  today_f.parent.mkdir(parents=True, exist_ok=True)
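Invocation matches how surrogate-daemon.sh drives it (usage string above: [--max-steps N] [--model M] <task>); each run appends an episode under the migrated memory dir:

  ~/.surrogate/bin/surrogate-agent.sh --max-steps 6 "Summarize recurring errors in ~/.surrogate/logs/ and propose fixes"
  tail -2 ~/.surrogate/state/surrogate-memory/episodes.jsonl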
bin/surrogate-bridge.sh ADDED
@@ -0,0 +1,52 @@
1
+ #!/usr/bin/env bash
2
+ # Surrogate-1 bridge — local Ollama endpoint for the Ashira-personalized model.
3
+ # Currently uses base Qwen2.5-Coder-7B + Thai/DevSecOps SYSTEM prompt as placeholder.
4
+ # After LoRA training on RunPod, rebuild Ollama model with merged adapter.
5
+ # Model URL: http://localhost:11434 (Ollama)
6
+ set -u
7
+ MODEL="surrogate-1"
8
+ MAX_TOKENS=2000
9
+ TEMP=0.3
10
+ PROMPT=""
11
+
12
+ while [[ $# -gt 0 ]]; do
13
+ case "$1" in
14
+ --model) MODEL="$2"; shift 2 ;;
15
+ --max-tokens) MAX_TOKENS="$2"; shift 2 ;;
16
+ *) PROMPT="$*"; break ;;
17
+ esac
18
+ done
19
+ [[ -z "$PROMPT" ]] && [[ ! -t 0 ]] && PROMPT=$(cat)
20
+ [[ -z "$PROMPT" ]] && { echo "surrogate-bridge: no prompt" >&2; exit 2; }
21
+
22
+ LOG="$HOME/.surrogate/logs/surrogate-bridge.log"
23
+ mkdir -p "$(dirname "$LOG")"
24
+ echo "[$(date '+%H:%M:%S')] model=$MODEL len=${#PROMPT}" >> "$LOG"
25
+
26
+ # Ollama OpenAI-compat endpoint
27
+ RESPONSE=$(python3 -c "
28
+ import json, sys, urllib.request, urllib.error
29
+
30
+ body = {
31
+ 'model': '$MODEL',
32
+ 'messages': [{'role':'user','content': sys.stdin.read()}],
33
+ 'max_tokens': $MAX_TOKENS,
34
+ 'temperature': $TEMP,
35
+ 'stream': False,
36
+ }
37
+ req = urllib.request.Request(
38
+ 'http://localhost:11434/v1/chat/completions',
39
+ data=json.dumps(body).encode(),
40
+ headers={'Content-Type':'application/json','Authorization':'Bearer ollama'}
41
+ )
42
+ try:
43
+ with urllib.request.urlopen(req, timeout=180) as r:
44
+ d = json.load(r)
45
+ print(d.get('choices',[{}])[0].get('message',{}).get('content',''))
46
+ except Exception as e:
47
+ print(f'surrogate-bridge error: {e}', file=sys.stderr); sys.exit(1)
48
+ " <<< "$PROMPT")
49
+ RC=$?
50
+ echo "[$(date '+%H:%M:%S')] rc=$RC bytes=${#RESPONSE}" >> "$LOG"
51
+ [[ $RC -ne 0 ]] && exit $RC
52
+ echo "$RESPONSE"
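The bridge assumes Ollama is already serving a model tagged surrogate-1. A placeholder build along the lines the header describes (the Modelfile contents below are illustrative, not the trained adapter), followed by a test call:

  printf 'FROM qwen2.5-coder:7b\nSYSTEM "You are Surrogate-1, a Thai/DevSecOps coding assistant."\nPARAMETER temperature 0.3\n' > /tmp/Modelfile
  ollama create surrogate-1 -f /tmp/Modelfile
  ~/.surrogate/bin/surrogate-bridge.sh "Write a k6 smoke test for /healthz"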
bin/surrogate-consolidate.sh ADDED
@@ -0,0 +1,163 @@
1
+ #!/usr/bin/env bash
2
+ # Episode consolidation — nightly summarize episodes → patterns → Graphiti + DPO training data
3
+ #
4
+ # Input: ~/.surrogate/state/surrogate-memory/episodes.jsonl
5
+ # Output:
6
+ # 1. ~/.surrogate/state/surrogate-memory/patterns.jsonl (learned patterns)
7
+ # 2. ~/.surrogate/index.db (source='surrogate-episodes') — pattern ingested for RAG
8
+ # 3. ~/axentx/surrogate/data/training-jsonl/dpo-pairs.jsonl (user+reply for future LoRA)
9
+ # 4. FalkorDB graph (episodic → semantic bitemporal edges)
10
+ set -u
11
+ set -a; source "$HOME/.hermes/.env" 2>/dev/null; set +a
12
+
13
+ MEM="$HOME/.surrogate/state/surrogate-memory"
14
+ LOG="$HOME/.surrogate/logs/surrogate-consolidate.log"
15
+ CHECKPOINT="$MEM/consolidate.checkpoint"
16
+ mkdir -p "$(dirname "$LOG")" "$MEM"
17
+
18
+ /usr/bin/python3 <<'PYEOF' 2>>"$LOG"
19
+ import json, os, sqlite3, urllib.request, hashlib, subprocess
20
+ from datetime import datetime
21
+ from pathlib import Path
22
+
23
+ MEM = Path(os.path.expanduser('~/.surrogate/state/surrogate-memory'))
24
+ EP = MEM / 'episodes.jsonl'
25
+ PAT = MEM / 'patterns.jsonl'
26
+ CKPT = MEM / 'consolidate.checkpoint'
27
+ DPO = Path(os.path.expanduser('~/axentx/surrogate/data/training-jsonl/dpo-pairs.jsonl'))
28
+ DPO.parent.mkdir(parents=True, exist_ok=True)
29
+
30
+ OR_KEY = os.environ.get('OPENROUTER_API_KEY','')
31
+
32
+ # Checkpoint: last consolidated line #
33
+ last_line = 0
34
+ if CKPT.exists():
35
+ try: last_line = int(CKPT.read_text().strip())
36
+ except: last_line = 0
37
+
38
+ if not EP.exists():
39
+ print("[consolidate] no episodes yet")
40
+ exit()
41
+
42
+ lines = EP.read_text(errors='replace').splitlines()
43
+ new_lines = lines[last_line:]
44
+ if not new_lines:
45
+ print(f"[consolidate] no new since line {last_line}")
46
+ exit()
47
+
48
+ print(f"[consolidate] processing {len(new_lines)} new episodes")
49
+
50
+ episodes = []
51
+ for line in new_lines:
52
+ try: episodes.append(json.loads(line))
53
+ except: continue
54
+
55
+ # Step 1: Append to DPO training data (for future RunPod LoRA)
56
+ with open(DPO, 'a') as f:
57
+ for ep in episodes:
58
+ if not ep.get('task') or not ep.get('final'): continue
59
+ if '[error' in ep.get('final','') or '[timeout' in ep.get('final',''): continue
60
+ pair = {
61
+ 'instruction': ep['task'][:500],
62
+ 'input': '',
63
+ 'output': ep['final'][:3000],
64
+ 'source': 'surrogate-episode',
65
+ 'timestamp': ep.get('ts', datetime.utcnow().isoformat()),
66
+ }
67
+ f.write(json.dumps(pair, ensure_ascii=False) + '\n')
68
+
69
+ # Step 2: Summarize batches → pattern (every 10 episodes)
70
+ def summarize_batch(batch):
71
+ if not OR_KEY: return None
72
+ prompt = "Below are recent Surrogate agent episodes (task + final answer). Extract 2-3 concise reusable patterns — what kind of tasks + what approaches worked. Output as bullet list. Thai OK.\n\n"
73
+ for i, ep in enumerate(batch):
74
+ prompt += f"--- Episode {i+1} ---\nTask: {ep.get('task','')[:300]}\nAnswer: {ep.get('final','')[:500]}\n\n"
75
+ body = {
76
+ 'model': 'google/gemini-2.5-flash', # cheap, good summarizer
77
+ 'messages': [{'role':'user','content': prompt[:15000]}],
78
+ 'temperature': 0.2, 'max_tokens': 600,
79
+ }
80
+ try:
81
+ req = urllib.request.Request(
82
+ 'https://openrouter.ai/api/v1/chat/completions',
83
+ data=json.dumps(body).encode(),
84
+ headers={'Content-Type':'application/json','Authorization':f'Bearer {OR_KEY}',
85
+ 'HTTP-Referer':'https://axentx.ai','X-Title':'Surrogate-Consolidate'}
86
+ )
87
+ with urllib.request.urlopen(req, timeout=60) as r:
88
+ d = json.load(r)
89
+ return d['choices'][0]['message']['content']
90
+ except Exception as e:
91
+ print(f"[consolidate] llm err: {e}")
92
+ return None
93
+
94
+ # Batch into groups of 10
95
+ patterns_added = 0
96
+ for batch_start in range(0, len(episodes), 10):
97
+ batch = episodes[batch_start:batch_start+10]
98
+ summary = summarize_batch(batch)
99
+ if not summary: continue
100
+ pattern = {
101
+ 'ts': datetime.utcnow().isoformat(),
102
+ 'episodes_range': [batch_start, batch_start+len(batch)-1],
103
+ 'pattern_summary': summary[:2000],
104
+ 'n_episodes': len(batch),
105
+ }
106
+ with open(PAT, 'a') as f:
107
+ f.write(json.dumps(pattern, ensure_ascii=False) + '\n')
108
+ patterns_added += 1
109
+
110
+ # Step 3: Ingest patterns into index.db so future RAG finds them
111
+ conn = sqlite3.connect(os.path.expanduser('~/.surrogate/index.db'))
112
+ conn.execute('PRAGMA journal_mode=WAL')
113
+ cur = conn.cursor()
114
+ if PAT.exists():
115
+ for line in PAT.read_text().splitlines()[-50:]:
116
+ try: p = json.loads(line)
117
+ except: continue
118
+ cur.execute(
119
+ "INSERT OR IGNORE INTO docs (source, project, path, topic, instruction, response, ts) VALUES (?,?,?,?,?,?,?)",
120
+ ('surrogate-episodes', 'surrogate', 'memory:pattern', 'learned-pattern',
121
+ f"pattern from {p.get('n_episodes','?')} episodes",
122
+ p.get('pattern_summary','')[:2500],
123
+ p.get('ts', datetime.utcnow().isoformat()))
124
+ )
125
+ conn.commit()
126
+ conn.close()
127
+
128
+
129
+ # Step 3b: Write patterns as graph nodes in FalkorDB (fix stagnant graph)
130
+ import subprocess
131
+ sock_r = subprocess.run(['/usr/bin/find','/var/folders','/tmp','-name','redis.socket','-type','s'], capture_output=True, text=True)
132
+ sock = sock_r.stdout.strip().split('\n')[0] if sock_r.stdout else None
133
+ if sock:
134
+ # Each pattern → Pattern node + relationships
135
+ if PAT.exists():
136
+ for line in PAT.read_text().splitlines()[-patterns_added:]:
137
+ try: p = json.loads(line)
138
+ except: continue
139
+ pid = hashlib.md5(p.get('pattern_summary','')[:200].encode()).hexdigest()[:12]
140
+ title = p.get('pattern_summary','')[:100].replace("'", "").replace(chr(10),' ')
141
+ ts = p.get('ts','')
142
+ cypher = f"MERGE (p:Pattern {{id:'{pid}'}}) SET p.title='{title}', p.ts='{ts}', p.n_episodes={p.get('n_episodes',0)}"
143
+ try:
144
+ subprocess.run(['/opt/homebrew/bin/redis-cli','-s',sock,'GRAPH.QUERY','ashira',cypher], capture_output=True, timeout=5)
145
+ except: pass
146
+ # Each episode → Episode node linked to Pattern
147
+ for ep in episodes[-20:]:
148
+ eid = hashlib.md5(ep.get('task','')[:200].encode()).hexdigest()[:12]
149
+ task = ep.get('task','')[:80].replace("'","").replace(chr(10),' ')
150
+ quality = 'success' if '[error' not in ep.get('final','') and '[timeout' not in ep.get('final','') else 'failed'
151
+ cypher = f"MERGE (e:Episode {{id:'{eid}'}}) SET e.task='{task}', e.quality='{quality}', e.ts='{ep.get('ts','')}'"
152
+ try:
153
+ subprocess.run(['/opt/homebrew/bin/redis-cli','-s',sock,'GRAPH.QUERY','ashira',cypher], capture_output=True, timeout=5)
154
+ except: pass
155
+ print('[consolidate] wrote patterns + episodes to FalkorDB')
157
+
158
+ # Update checkpoint
159
+ CKPT.write_text(str(len(lines)))
160
+ print(f"[consolidate] added {patterns_added} patterns from {len(episodes)} episodes. DPO pairs grown.")
161
+ PYEOF
162
+
163
+ echo "[$(date '+%H:%M:%S')] consolidate done" >> "$LOG"
bin/surrogate-daemon.sh CHANGED
@@ -2,7 +2,7 @@
2
  # Surrogate Daemon — continuous autonomous worker
3
  #
4
  # Architecture:
5
- # - Task queue file: ~/.claude/state/surrogate-queue.jsonl (append-only)
6
  # - Workers: N parallel (default 3)
7
  # - Pickup: instant (as soon as worker idle → pull next task)
8
  # - Self-generation: if queue empty, daemon asks itself "what should I work on?"
@@ -18,11 +18,11 @@
18
  set -u
19
  set -a; source "$HOME/.hermes/.env" 2>/dev/null; set +a
20
 
21
- STATE="$HOME/.claude/state/surrogate-daemon"
22
  QUEUE="$STATE/queue.jsonl"
23
  DONE="$STATE/done.jsonl"
24
  PID_FILE="$STATE/daemon.pid"
25
- LOG="$HOME/.claude/logs/surrogate-daemon.log"
26
  WORKERS=1 # default 1 worker (budget-safe). User can --workers 3 for burst
27
  mkdir -p "$STATE" "$(dirname "$LOG")"
28
 
@@ -150,7 +150,7 @@ PYEOF
150
  # Every 30min: consolidation
151
  NOW_MIN=$(date +%M)
152
  if [[ "$NOW_MIN" == "15" ]] || [[ "$NOW_MIN" == "45" ]]; then
153
- "$HOME/.claude/bin/surrogate-consolidate.sh" >> "$LOG" 2>&1 &
154
  fi
155
 
156
  sleep 10
@@ -226,7 +226,7 @@ PYEOF
226
  AUTO_TASK=$(/usr/bin/python3 <<'PYEOF'
227
  import json, os, random
228
  from pathlib import Path
229
- ep = Path(os.path.expanduser('~/.claude/state/surrogate-memory/episodes.jsonl'))
230
  recent_topics = []
231
  if ep.exists():
232
  for line in ep.read_text().splitlines()[-30:]:
@@ -243,7 +243,7 @@ pool = [
243
  # B. Codebase health
244
  "อ่าน ~/axentx/ หา TODO/FIXME across projects → สร้าง fix spec",
245
  "เช็ค axentx test coverage per project → identify weakest → propose tests",
246
- "Scan ~/.claude/bin/ หา script ที่ไม่ถูกใช้ > 7 days → propose archive",
247
  "Review last 10 auto-commits → ตรวจว่า quality OK หรือไม่",
248
  # C. Knowledge quality
249
  "สำรวจ index.db หา duplicate entries → propose dedup",
@@ -305,7 +305,7 @@ PYEOF
305
  START=$(date +%s)
306
 
307
  # Execute via agent
308
- OUTPUT=$("$HOME/.claude/bin/surrogate-agent.sh" --max-steps 6 "$TASK" 2>&1 | tail -50)
309
  END=$(date +%s)
310
  DUR=$((END - START))
311
 
 
2
  # Surrogate Daemon — continuous autonomous worker
3
  #
4
  # Architecture:
5
+ # - Task queue file: ~/.surrogate/state/surrogate-queue.jsonl (append-only)
6
  # - Workers: N parallel (default 3)
7
  # - Pickup: instant (as soon as worker idle → pull next task)
8
  # - Self-generation: if queue empty, daemon asks itself "what should I work on?"
 
18
  set -u
19
  set -a; source "$HOME/.hermes/.env" 2>/dev/null; set +a
20
 
21
+ STATE="$HOME/.surrogate/state/surrogate-daemon"
22
  QUEUE="$STATE/queue.jsonl"
23
  DONE="$STATE/done.jsonl"
24
  PID_FILE="$STATE/daemon.pid"
25
+ LOG="$HOME/.surrogate/logs/surrogate-daemon.log"
26
  WORKERS=1 # default 1 worker (budget-safe). User can --workers 3 for burst
27
  mkdir -p "$STATE" "$(dirname "$LOG")"
28
 
 
150
  # Every 30min: consolidation
151
  NOW_MIN=$(date +%M)
152
  if [[ "$NOW_MIN" == "15" ]] || [[ "$NOW_MIN" == "45" ]]; then
153
+ "$HOME/.surrogate/bin/surrogate-consolidate.sh" >> "$LOG" 2>&1 &
154
  fi
155
 
156
  sleep 10
 
226
  AUTO_TASK=$(/usr/bin/python3 <<'PYEOF'
227
  import json, os, random
228
  from pathlib import Path
229
+ ep = Path(os.path.expanduser('~/.surrogate/state/surrogate-memory/episodes.jsonl'))
230
  recent_topics = []
231
  if ep.exists():
232
  for line in ep.read_text().splitlines()[-30:]:
 
243
  # B. Codebase health
244
  "อ่าน ~/axentx/ หา TODO/FIXME across projects → สร้าง fix spec",
245
  "เช็ค axentx test coverage per project → identify weakest → propose tests",
246
+ "Scan ~/.surrogate/bin/ หา script ที่ไม่ถูกใช้ > 7 days → propose archive",
247
  "Review last 10 auto-commits → ตรวจว่า quality OK หรือไม่",
248
  # C. Knowledge quality
249
  "สำรวจ index.db หา duplicate entries → propose dedup",
 
305
  START=$(date +%s)
306
 
307
  # Execute via agent
308
+ OUTPUT=$("$HOME/.surrogate/bin/surrogate-agent.sh" --max-steps 6 "$TASK" 2>&1 | tail -50)
309
  END=$(date +%s)
310
  DUR=$((END - START))
311
 
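For a manual smoke test of the migrated daemon paths, a single task can be run the same way the worker loop above does. A minimal sketch, assuming ~/.surrogate/bin/surrogate-agent.sh is executable; the task text here is a placeholder, not one of the real queue entries:

# Run one task by hand, mirroring the worker's invocation
"$HOME/.surrogate/bin/surrogate-agent.sh" --max-steps 6 \
  "Summarize TODO items under ~/axentx" 2>&1 | tail -50

# Daemon activity now logs under the new home
tail -n 20 "$HOME/.surrogate/logs/surrogate-daemon.log"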
bin/surrogate-dev-loop.sh CHANGED
@@ -16,7 +16,7 @@
16
  set -u
17
  set -a; source "$HOME/.hermes/.env" 2>/dev/null; set +a
18
 
19
- LOG="$HOME/.claude/logs/surrogate-dev-loop.log"
20
  OUT_DIR="$HOME/.hermes/workspace/local-dev"
21
  mkdir -p "$(dirname "$LOG")" "$OUT_DIR"
22
 
@@ -28,7 +28,7 @@ SEARCH_ROOTS=(
28
  "$HOME/axentx"
29
  "$HOME/develope/DevOps"
30
  "$HOME/develope/AI"
31
- "$HOME/.claude/bin"
32
  )
33
 
34
  # ── Task generators (pick one per cycle, weighted random) ────────────────────
@@ -41,7 +41,7 @@ ROOTS = [
41
  Path.home() / 'axentx',
42
  Path.home() / 'develope/DevOps',
43
  Path.home() / 'develope/AI',
44
- Path.home() / '.claude/bin',
45
  ]
46
  ROOTS = [p for p in ROOTS if p.exists()]
47
 
 
16
  set -u
17
  set -a; source "$HOME/.hermes/.env" 2>/dev/null; set +a
18
 
19
+ LOG="$HOME/.surrogate/logs/surrogate-dev-loop.log"
20
  OUT_DIR="$HOME/.hermes/workspace/local-dev"
21
  mkdir -p "$(dirname "$LOG")" "$OUT_DIR"
22
 
 
28
  "$HOME/axentx"
29
  "$HOME/develope/DevOps"
30
  "$HOME/develope/AI"
31
+ "$HOME/.surrogate/bin"
32
  )
33
 
34
  # ── Task generators (pick one per cycle, weighted random) ────────────────────
 
41
  Path.home() / 'axentx',
42
  Path.home() / 'develope/DevOps',
43
  Path.home() / 'develope/AI',
44
+ Path.home() / '.surrogate/bin',
45
  ]
46
  ROOTS = [p for p in ROOTS if p.exists()]
47
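As a final check on this path migration (a sketch under the assumption that the scripts now live in ~/.surrogate/bin), any leftover references to the old home can be flagged before the daemons are restarted:

# Should print nothing once every script has been migrated
grep -rn '\.claude/' "$HOME/.surrogate/bin" || echo "no stale ~/.claude references"

# New log directory used by surrogate-dev-loop.sh and surrogate-daemon.sh
ls -ld "$HOME/.surrogate/logs"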