Ashira Pitchayapakayakul committed on
Commit 84a1ec7 · 1 Parent(s): a4b21d0

fix: 2 more Mac path leaks + /selftest endpoint to catch this pattern early

USER: "Why do you keep going back to the Mac when we already told you to use HF?"
ROOT PATTERN: Mac mindset → hardcoded /Users/Ashira paths leak into HF code

LEAKS FOUND (verified by grepping HF Space code):
1. bin/dataset-enrich.sh:35
WORK = Path('/Users/Ashira/.hermes/workspace/dataset-enrich')
→ PermissionError on HF Space → every shard crashed
2. bin/ask-sqlite.py:19
AXENTX = Path('/Users/Ashira/axentx')

FIXED:
Both replaced with Path.home() / ... → portable
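
A minimal sketch of the substitution, using only the standard library. The helper name `workspace_dir` and its `home` override are hypothetical and not part of this commit; they only illustrate how building on Path.home() makes the same code resolve under whichever account the process runs as.

```python
from pathlib import Path
from typing import Optional


def workspace_dir(*parts: str, home: Optional[Path] = None) -> Path:
    """Build a path under the current user's home instead of a hardcoded one.

    `home` is an illustrative override for testing; Path.home() is the default.
    """
    base = home if home is not None else Path.home()
    return base.joinpath(*parts)


# The same relative locations as in the commit, resolved for the current user.
WORK = workspace_dir(".hermes", "workspace", "dataset-enrich")
AXENTX = workspace_dir("axentx")
```

On the Mac this resolves under /Users/Ashira as before. On the HF Space it resolves under whatever home directory the container user has, so the later WORK.mkdir(parents=True, exist_ok=True) call targets a location the process should be able to write to.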

PREVENTION: NEW /selftest endpoint:
GET https://axentx-surrogate-1.hf.space/selftest
Returns:
- import_datasets / huggingface_hub / pyarrow / numpy / sqlite3 (must all be true)
- path_~/.surrogate/{bin,state,logs} (must all exist)
- no_mac_paths (greps active .sh files for '/Users/Ashira' → must be empty)
- hf_token_set (reported only; does not affect ok)
ok: True only if all import, path, and no_mac_paths checks pass
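
The no_mac_paths check as described looks only at .sh files, while one of the two leaks fixed here was in a .py file. If Python scripts are deployed next to the shell scripts, a broader scan may be worth considering. The sketch below is one possible variant, not the committed code: `find_hardcoded_paths`, its `suffixes` parameter, and the default needle are assumptions made for illustration.

```python
from pathlib import Path
from typing import Iterable, List


def find_hardcoded_paths(
    root: Path,
    needle: str = "/Users/Ashira",
    suffixes: Iterable[str] = (".sh", ".py"),
) -> List[str]:
    """Return sorted paths (relative to `root`) of files that contain `needle`."""
    wanted = set(suffixes)
    hits: List[str] = []
    if not root.is_dir():
        return hits  # nothing to scan, nothing to report
    for f in root.rglob("*"):
        if not f.is_file() or f.suffix not in wanted:
            continue
        try:
            text = f.read_text(errors="ignore")
        except OSError:
            continue  # unreadable file: skip it rather than fail the whole scan
        if needle in text:
            hits.append(str(f.relative_to(root)))
    return sorted(hits)
```

One caveat with scanning .py files: a checker that stores the needle as a literal string would match its own source if that file lives in the scanned directory, so the checker file may need to be excluded or the needle assembled at runtime.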

This catches Mac-mindset bugs IMMEDIATELY after deploy.
Workflow now: push → wait rebuild → curl /selftest → fix anything red.
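
The "fix anything red" step can be scripted. This is a sketch that assumes the response shape shown in this commit (an `ok` flag plus a `checks` mapping); `failed_checks` and `fetch_selftest` are hypothetical helper names, and the request uses only the standard library.

```python
import json
from typing import Any, Dict, List
from urllib.request import urlopen

# URL taken from the commit message above.
SELFTEST_URL = "https://axentx-surrogate-1.hf.space/selftest"


def failed_checks(report: Dict[str, Any]) -> List[str]:
    """Names of checks whose value is not exactly True, sorted alphabetically."""
    checks = report.get("checks", {})
    return sorted(name for name, passed in checks.items() if passed is not True)


def fetch_selftest(url: str = SELFTEST_URL, timeout: float = 30.0) -> Dict[str, Any]:
    """GET the endpoint and decode its JSON body (raises on HTTP or network errors)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

After a rebuild, `failed_checks(fetch_selftest())` returns the list of red checks. Reading the individual checks rather than only `ok` matters here because, in the diff as shown, hf_token_set is recorded but never flips `ok` to False.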

Lesson learned: don't test on Mac and assume HF works.
HF Space is the only environment that matters.

bin/ask-sqlite.py CHANGED
@@ -16,7 +16,7 @@ DB = str(Path.home() / ".surrogate/index.db")
 OLLAMA = "http://localhost:11434/api/chat"
 DEFAULT_MODEL = "granite4:7b-a1b-h"
 
-AXENTX = Path("/Users/Ashira/axentx")
+AXENTX = Path.home() / "axentx"
 PROJECTS = ["Costinel", "Vanguard", "arkship", "surrogate", "workio"]
 
 
bin/dataset-enrich.sh CHANGED
@@ -32,7 +32,7 @@ from pathlib import Path
 from datasets import load_dataset
 import hashlib, json, time, os
 
-WORK = Path("/Users/Ashira/.hermes/workspace/dataset-enrich")
+WORK = Path.home() / ".hermes/workspace/dataset-enrich"
 WORK.mkdir(parents=True, exist_ok=True)
 api = HfApi()
 
bin/hermes-status-server.py CHANGED
@@ -200,6 +200,50 @@ class ChatRequest(BaseModel):
     timeout_sec: int = 180
 
 
+
+
+@app.get("/selftest")
+def selftest() -> dict:
+    """Verify HF Space environment — catches Mac-mindset bugs early.
+    Tests: critical imports, hardcoded path leaks, key file existence."""
+    results = {"ok": True, "checks": {}}
+
+    # 1. Required imports
+    for mod in ["datasets", "huggingface_hub", "pyarrow", "numpy", "sqlite3"]:
+        try:
+            __import__(mod)
+            results["checks"][f"import_{mod}"] = True
+        except ImportError as e:
+            results["checks"][f"import_{mod}"] = False
+            results["ok"] = False
+
+    # 2. Critical paths exist (HF Space side)
+    for path_str in ["~/.surrogate/bin", "~/.surrogate/state", "~/.surrogate/logs"]:
+        p = Path(os.path.expanduser(path_str))
+        results["checks"][f"path_{path_str}"] = p.exists()
+        if not p.exists():
+            results["ok"] = False
+
+    # 3. No Mac path leaks in active scripts
+    bad_paths = []
+    for f in (HOME / ".surrogate/bin").rglob("*.sh"):
+        try:
+            text_content = f.read_text(errors="ignore")
+            if "/Users/Ashira" in text_content:
+                bad_paths.append(f.name)
+        except Exception:
+            pass
+    results["checks"]["no_mac_paths"] = len(bad_paths) == 0
+    if bad_paths:
+        results["ok"] = False
+        results["mac_path_leaks"] = bad_paths[:10]
+
+    # 4. HF token present
+    results["checks"]["hf_token_set"] = bool(os.environ.get("HF_TOKEN") or os.environ.get("HUGGING_FACE_HUB_TOKEN"))
+
+    return results
+
+
 @app.post("/chat")
 async def chat(req: ChatRequest) -> JSONResponse:
     """Run a prompt through the surrogate CLI inside the container, return result.