Add PRESENT_STATE intent + head-shake turnaround
Right-now questions ("how are you feeling?") were fabricating answers
from autobiographical memory. PRESENT_STATE is a new intent class that
skips retrieval and uses an affect-grounded prompt branch, falling back
to "I'm not sure" when the read is ambiguous.
Head-shake or sharp head-nod (or the manual "Not quite right" button)
hits a /chat/turnaround endpoint that re-plans the last turn:
- PERSONAL turns re-retrieve excluding the rejected bucket and exact
chunk texts, with a score floor so the rephrase doesn't degrade into
off-topic memories
- PRESENT_STATE turns flip the emotional read or admit uncertainty
Head-pose detection is client-side (MediaPipe nose-tip), with a
calibrate button and live Δx/Δy debug readout. Both bubbles stay
visible after turnaround: rejected gets strikethrough + "rephrased"
badge, new one gets "↻ turnaround".
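
The shake detection itself lives in the frontend (TypeScript, not shown in this diff), so the following is only a language-neutral sketch in Python of what a deadband-filtered shake check could look like. The class name `ShakeDetector`, the thresholds, and the frame-count window are all assumptions for illustration, not values taken from the commit; `dx` stands for the nose-tip horizontal offset from the calibrated centre.

```python
from collections import deque


class ShakeDetector:
    """Hypothetical sketch: flag a head-shake from a stream of Δx samples."""

    def __init__(self, deadband=0.03, min_swings=3, window=30):
        self.deadband = deadband      # |dx| below this is treated as jitter
        self.min_swings = min_swings  # side-to-side excursions needed for a shake
        self.window = window          # max age of an excursion, in frames
        self._swings = deque()        # frame indices at which the side changed
        self._frame = 0
        self._last_side = 0           # -1 = left, +1 = right, 0 = none yet

    def update(self, dx):
        """Feed one sample; return True on the frame a shake is detected."""
        self._frame += 1
        side = 0 if abs(dx) < self.deadband else (1 if dx > 0 else -1)
        # Record an excursion only when the nose lands on a new side.
        if side != 0 and side != self._last_side:
            self._swings.append(self._frame)
            self._last_side = side
        # Forget excursions that are too old to belong to the same gesture.
        while self._swings and self._frame - self._swings[0] > self.window:
            self._swings.popleft()
        if len(self._swings) >= self.min_swings:
            # Reset so a single shake fires once.
            self._swings.clear()
            self._last_side = 0
            return True
        return False
```

With the defaults, a right-left-right movement inside 30 frames fires once, while sub-deadband jitter and slow head turns do not. Resetting the state after a hit is one simple way to avoid the self-retrigger loops the commit guards against elsewhere.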
- README.md +8 -4
- backend/api/main.py +167 -1
- backend/config/settings.py +4 -0
- backend/pipeline/intent_kind.py +28 -0
- backend/pipeline/nodes/feedback.py +2 -0
- backend/pipeline/nodes/intent.py +33 -5
- backend/pipeline/nodes/planner.py +76 -1
- backend/pipeline/nodes/retrieval.py +11 -0
- backend/pipeline/state.py +7 -4
- frontend/src/App.css +69 -0
- frontend/src/App.tsx +22 -2
- frontend/src/components/ChatPanel.tsx +152 -3
- frontend/src/components/SensingStatus.tsx +20 -0
- frontend/src/hooks/useSensing.ts +48 -2
- frontend/src/lib/api.ts +18 -1
- frontend/src/lib/sensing.ts +221 -0
- frontend/src/types.ts +23 -0
|
@@ -353,9 +353,12 @@ Heads up: all camera/sensing stuff is in the frontend (MediaPipe JS). Backend ju
|
|
| 353 |
|
| 354 |
### Sensing (frontend)
|
| 355 |
|
| 356 |
-
- [
|
| 357 |
-
- [
|
| 358 |
-
- [
|
|
|
|
|
|
|
|
|
|
| 359 |
- [ ] **[Core]** Smile / positive affect should actually change the wording (more positive lexicon), not just be metadata. Right now it's annotated in the prompt but we never checked if the LLM is doing anything with it; probably need a stronger constraint or example in the prompt
|
| 360 |
- [ ] **[Core]** Air-writing is treated as raw text appended to the query. Spec wants it as a stylistic constraint too: should it bias tone, or stay query-only? Decide and document
|
| 361 |
- [ ] **[Bonus]** Voice + air-writing conflict resolution. Capture short voice (Web Speech API), compare to air-written intent, send a `resolved_intent`
|
|
@@ -363,10 +366,11 @@ Heads up: all camera/sensing stuff is in the frontend (MediaPipe JS). Backend ju
|
|
| 363 |
|
| 364 |
### Intent decomposition
|
| 365 |
|
| 366 |
-
> Current state: regex-splits the partner query on conjunctions/punctuation into fragments, then runs each fragment through a BGE zero-shot classifier (cosine vs.
|
| 367 |
|
| 368 |
- [x] **[Core]** Personal / Contextual / Open-domain dispatch to distinct pools (personal → BGE vector store; contextual → persona memory + relevant in-session turns layered on top; open-domain → stub chunk, LLM answers from its own general knowledge; web search is intentionally out of scope).
|
| 369 |
- [x] intent node latency: split + BGE zero-shot classifier replaces the LLM router. Parallelising sub-query retrieval is still open.
|
|
|
|
| 370 |
|
| 371 |
### Retrieval
|
| 372 |
|
|
|
|
| 353 |
|
| 354 |
### Sensing (frontend)
|
| 355 |
|
| 356 |
+
- [x] **[Core]** Head-nod / sharp tilt / head-shake = "I don't like that". Different from frustrated affect.
|
| 357 |
+
- [x] frontend `HeadPoseTracker` (deadband-filtered shake + sharp-nod-with-recovery), explicit calibrate button, live Δx/Δy debug readout in sidebar
|
| 358 |
+
- [x] dedicated `POST /chat/turnaround` endpoint reuses cached last-state: one extra LLM call, no full pipeline re-run
|
| 359 |
+
- [x] intent-aware turnaround: PERSONAL re-retrieves excluding the rejected bucket *and* exact rejected chunk texts (with `turnaround_min_score` floor; falls back to original chunks rather than degrading); PRESENT_STATE flips emotional read or admits uncertainty
|
| 360 |
+
- [x] UI: rejected bubble gets strikethrough + "rephrased" badge, new bubble appended with "↻ turnaround" badge; both stay visible (you can't unsay something to a partner). Manual "↻ Not quite right" button as fallback
|
| 361 |
+
- [x] guards: `turnaroundConsumedTurnRef` prevents self-retrigger loops; backend `turn_id` returned in `ChatResponse` so frontend doesn't desync on persona switch; stale-turn 409
|
| 362 |
- [ ] **[Core]** Smile / positive affect should actually change the wording (more positive lexicon), not just be metadata. Right now it's annotated in the prompt but we never checked if the LLM is doing anything with it; probably need a stronger constraint or example in the prompt
|
| 363 |
- [ ] **[Core]** Air-writing is treated as raw text appended to the query. Spec wants it as a stylistic constraint too: should it bias tone, or stay query-only? Decide and document
|
| 364 |
- [ ] **[Bonus]** Voice + air-writing conflict resolution. Capture short voice (Web Speech API), compare to air-written intent, send a `resolved_intent`
|
|
|
|
| 366 |
|
| 367 |
### Intent decomposition
|
| 368 |
|
| 369 |
+
> Current state: regex-splits the partner query on conjunctions/punctuation into fragments, then runs each fragment through a BGE zero-shot classifier (cosine vs. seed exemplars per class). No LLM call, no retries. Runs in ~10–30ms per turn. Bucket hints for `PERSONAL` fragments come from a shared keyword helper in [backend/sensing/bucket_keywords.py](backend/sensing/bucket_keywords.py). Earlier versions used an LLM with Pydantic validation + 3 retries, which cost ~100s per turn on Ollama Cloud when the model emitted bad JSON.
|
| 370 |
|
| 371 |
- [x] **[Core]** Personal / Contextual / Open-domain dispatch to distinct pools (personal → BGE vector store; contextual → persona memory + relevant in-session turns layered on top; open-domain → stub chunk, LLM answers from its own general knowledge; web search is intentionally out of scope).
|
| 372 |
- [x] intent node latency: split + BGE zero-shot classifier replaces the LLM router. Parallelising sub-query retrieval is still open.
|
| 373 |
+
- [x] **[Core]** `PRESENT_STATE` intent class: questions about right-now state ("how are you feeling?", "are you tired?") used to fabricate confident answers from autobiographical memory (wrong by category, not just by wording). Now they skip retrieval entirely and the planner uses an affect-grounded prompt branch with explicit fallback to "I'm not sure" when the read is ambiguous. Margin guard demotes narrow PRESENT_STATE wins to PERSONAL (better to over-retrieve than to silently drop persona memories). Air-written supplements are classified the same way as a normal fragment, so a present-tense supplement on a PRESENT_STATE query no longer flips the route to PERSONAL.
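
To make the margin guard concrete, here is a small worked sketch with made-up similarity scores. `guard_present_state` is a hypothetical helper written for this example (the real check sits inside the classifier); the 0.05 default mirrors the `_PRESENT_STATE_MARGIN_MIN` constant introduced in this commit.

```python
from __future__ import annotations


def guard_present_state(scores: dict[str, float], margin_min: float = 0.05) -> str:
    """Pick the best-scoring class, demoting narrow PRESENT_STATE wins to PERSONAL."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_cls, best_score = ranked[0]
    runner_up_score = ranked[1][1] if len(ranked) > 1 else float("-inf")
    # PRESENT_STATE skips retrieval, so only accept it on a decisive win.
    if best_cls == "PRESENT_STATE" and best_score - runner_up_score < margin_min:
        return "PERSONAL"
    return best_cls


# Narrow win (margin 0.02): falls back to PERSONAL so memories are still retrieved.
narrow = guard_present_state({"PRESENT_STATE": 0.62, "PERSONAL": 0.60, "CONTEXTUAL": 0.30})

# Clear win (margin 0.15): PRESENT_STATE is kept and retrieval is skipped.
clear = guard_present_state({"PRESENT_STATE": 0.70, "PERSONAL": 0.55})
```

The asymmetry is deliberate: a wrong PERSONAL label only costs an unnecessary retrieval, while a wrong PRESENT_STATE label drops persona memories with no recovery path.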
|
| 374 |
|
| 375 |
### Retrieval
|
| 376 |
|
|
@@ -2,6 +2,7 @@
|
|
| 2 |
from __future__ import annotations
|
| 3 |
|
| 4 |
import json
|
|
|
|
| 5 |
|
| 6 |
from fastapi import FastAPI, HTTPException
|
| 7 |
from fastapi.middleware.cors import CORSMiddleware
|
|
@@ -15,9 +16,12 @@ from backend.generation.llm_client import ( # active_model used by /debug/confi
|
|
| 15 |
)
|
| 16 |
from backend.guardrails.checks import check_input
|
| 17 |
from backend.pipeline.graph import run_pipeline
|
|
|
|
|
|
|
|
|
|
| 18 |
from backend.pipeline.state import PipelineState
|
| 19 |
from backend.retrieval.bucket_priors import uniform_priors
|
| 20 |
-
from backend.retrieval.vector_store import _get_embedder
|
| 21 |
|
| 22 |
app = FastAPI(
|
| 23 |
title="Multimodal AAC Chatbot API",
|
|
@@ -32,6 +36,7 @@ app.add_middleware(
|
|
| 32 |
allow_headers=["*"],
|
| 33 |
)
|
| 34 |
|
|
|
|
| 35 |
_models_ready = False
|
| 36 |
|
| 37 |
|
|
@@ -64,6 +69,13 @@ class ChatRequest(BaseModel):
|
|
| 64 |
gesture_tag: str | None = None
|
| 65 |
gaze_bucket: str | None = None
|
| 66 |
air_written_text: str | None = None
|
|
|
| 67 |
|
| 68 |
|
| 69 |
class EvalScoresResponse(BaseModel):
|
|
@@ -91,6 +103,7 @@ class ChatResponse(BaseModel):
|
|
| 91 |
latency: dict
|
| 92 |
guardrail_passed: bool
|
| 93 |
eval_scores: EvalScoresResponse | None = None
|
|
|
|
| 94 |
|
| 95 |
|
| 96 |
# ── Helpers ────────────────────────────────────────────────────────────────────
|
|
@@ -145,6 +158,8 @@ def _build_initial_state(req: ChatRequest, session: dict) -> PipelineState:
|
|
| 145 |
gesture_tag=req.gesture_tag,
|
| 146 |
gaze_bucket=req.gaze_bucket,
|
| 147 |
air_written_text=req.air_written_text,
|
|
|
|
|
|
|
| 148 |
raw_query=req.query,
|
| 149 |
intent_route=None,
|
| 150 |
generation_config=None,
|
|
@@ -168,6 +183,58 @@ def _build_initial_state(req: ChatRequest, session: dict) -> PipelineState:
|
|
| 168 |
)
|
| 169 |
|
| 170 |
|
|
|
| 171 |
# ── Routes ─────────────────────────────────────────────────────────────────────
|
| 172 |
|
| 173 |
|
|
@@ -222,6 +289,7 @@ def chat(req: ChatRequest):
|
|
| 222 |
retrieval_mode="none",
|
| 223 |
latency={},
|
| 224 |
guardrail_passed=False,
|
|
|
|
| 225 |
)
|
| 226 |
|
| 227 |
session = _get_or_init_session(req.user_id)
|
|
@@ -232,6 +300,7 @@ def chat(req: ChatRequest):
|
|
| 232 |
# Persist updated session state
|
| 233 |
session["session_history"] = result["session_history"]
|
| 234 |
session["bucket_priors"] = result["bucket_priors"]
|
|
|
|
| 235 |
|
| 236 |
# Compute evaluation metrics
|
| 237 |
affect_emotion = (result.get("affect") or {}).get("emotion", "NEUTRAL")
|
|
@@ -256,4 +325,101 @@ def chat(req: ChatRequest):
|
|
| 256 |
latency=result.get("latency_log") or {},
|
| 257 |
guardrail_passed=result.get("guardrail_passed", True),
|
| 258 |
eval_scores=eval_scores,
|
|
|
| 259 |
)
|
|
|
|
| 2 |
from __future__ import annotations
|
| 3 |
|
| 4 |
import json
|
| 5 |
+
import logging
|
| 6 |
|
| 7 |
from fastapi import FastAPI, HTTPException
|
| 8 |
from fastapi.middleware.cors import CORSMiddleware
|
|
|
|
| 16 |
)
|
| 17 |
from backend.guardrails.checks import check_input
|
| 18 |
from backend.pipeline.graph import run_pipeline
|
| 19 |
+
from backend.pipeline.intent_kind import classify_intent_kind
|
| 20 |
+
from backend.pipeline.nodes import feedback as feedback_node
|
| 21 |
+
from backend.pipeline.nodes import planner as planner_node
|
| 22 |
from backend.pipeline.state import PipelineState
|
| 23 |
from backend.retrieval.bucket_priors import uniform_priors
|
| 24 |
+
from backend.retrieval.vector_store import _get_embedder, retrieve
|
| 25 |
|
| 26 |
app = FastAPI(
|
| 27 |
title="Multimodal AAC Chatbot API",
|
|
|
|
| 36 |
allow_headers=["*"],
|
| 37 |
)
|
| 38 |
|
| 39 |
+
_log = logging.getLogger(__name__)
|
| 40 |
_models_ready = False
|
| 41 |
|
| 42 |
|
|
|
|
| 69 |
gesture_tag: str | None = None
|
| 70 |
gaze_bucket: str | None = None
|
| 71 |
air_written_text: str | None = None
|
| 72 |
+
head_signal: str | None = None # "HEAD_SHAKE"|"HEAD_NOD_DISSATISFIED"
|
| 73 |
+
|
| 74 |
+
|
| 75 |
+
class TurnaroundRequest(BaseModel):
|
| 76 |
+
user_id: str
|
| 77 |
+
turn_id: int | None = None # optional guard against stale turnaround calls
|
| 78 |
+
head_signal: str | None = None
|
| 79 |
|
| 80 |
|
| 81 |
class EvalScoresResponse(BaseModel):
|
|
|
|
| 103 |
latency: dict
|
| 104 |
guardrail_passed: bool
|
| 105 |
eval_scores: EvalScoresResponse | None = None
|
| 106 |
+
turn_id: int
|
| 107 |
|
| 108 |
|
| 109 |
# ── Helpers ────────────────────────────────────────────────────────────────────
|
|
|
|
| 158 |
gesture_tag=req.gesture_tag,
|
| 159 |
gaze_bucket=req.gaze_bucket,
|
| 160 |
air_written_text=req.air_written_text,
|
| 161 |
+
head_signal=req.head_signal,
|
| 162 |
+
turnaround_triggered=False,
|
| 163 |
raw_query=req.query,
|
| 164 |
intent_route=None,
|
| 165 |
generation_config=None,
|
|
|
|
| 183 |
)
|
| 184 |
|
| 185 |
|
| 186 |
+
def _re_retrieve_excluding(
|
| 187 |
+
query: str,
|
| 188 |
+
user_id: str,
|
| 189 |
+
rejected_chunks: list[dict],
|
| 190 |
+
) -> list[dict] | None:
|
| 191 |
+
"""Pull fresh chunks for a turnaround, excluding the bucket and exact texts
|
| 192 |
+
of the rejected chunks.
|
| 193 |
+
|
| 194 |
+
Returns:
|
| 195 |
+
- list of chunks (passing min-score floor) when re-retrieval improved
|
| 196 |
+
on the rejected set
|
| 197 |
+
- None when re-retrieval should not be used (no signal, all dropped by
|
| 198 |
+
dedupe, or all below score floor); caller should keep original chunks
|
| 199 |
+
"""
|
| 200 |
+
if not rejected_chunks:
|
| 201 |
+
return None
|
| 202 |
+
rejected_bucket = rejected_chunks[0].get("bucket")
|
| 203 |
+
rejected_texts = {c.get("text") for c in rejected_chunks if c.get("text")}
|
| 204 |
+
if not rejected_bucket:
|
| 205 |
+
return None
|
| 206 |
+
|
| 207 |
+
try:
|
| 208 |
+
# Pull a wider net (top_k * 2) so dedupe + bucket-exclusion still leaves
|
| 209 |
+
# enough candidates to fill rerank_k.
|
| 210 |
+
fresh = retrieve(
|
| 211 |
+
query=query,
|
| 212 |
+
user_id=user_id,
|
| 213 |
+
top_k=settings.retrieval_top_k * 2,
|
| 214 |
+
rerank_k=settings.retrieval_top_k * 2,
|
| 215 |
+
bucket_filter=None,
|
| 216 |
+
)
|
| 217 |
+
except Exception as exc:
|
| 218 |
+
_log.warning("turnaround re-retrieval failed: %r", exc)
|
| 219 |
+
return None
|
| 220 |
+
|
| 221 |
+
filtered = [
|
| 222 |
+
c
|
| 223 |
+
for c in fresh
|
| 224 |
+
if c.get("bucket") != rejected_bucket
|
| 225 |
+
and c.get("text") not in rejected_texts
|
| 226 |
+
and float(c.get("score", 0.0)) >= settings.turnaround_min_score
|
| 227 |
+
]
|
| 228 |
+
if not filtered:
|
| 229 |
+
_log.info(
|
| 230 |
+
"turnaround re-retrieval found no chunks above score floor %.2f; "
|
| 231 |
+
"keeping original chunks",
|
| 232 |
+
settings.turnaround_min_score,
|
| 233 |
+
)
|
| 234 |
+
return None
|
| 235 |
+
return filtered[: settings.retrieval_rerank_k]
|
| 236 |
+
|
| 237 |
+
|
| 238 |
# ── Routes ─────────────────────────────────────────────────────────────────────
|
| 239 |
|
| 240 |
|
|
|
|
| 289 |
retrieval_mode="none",
|
| 290 |
latency={},
|
| 291 |
guardrail_passed=False,
|
| 292 |
+
turn_id=0,
|
| 293 |
)
|
| 294 |
|
| 295 |
session = _get_or_init_session(req.user_id)
|
|
|
|
| 300 |
# Persist updated session state
|
| 301 |
session["session_history"] = result["session_history"]
|
| 302 |
session["bucket_priors"] = result["bucket_priors"]
|
| 303 |
+
session["last_state"] = result
|
| 304 |
|
| 305 |
# Compute evaluation metrics
|
| 306 |
affect_emotion = (result.get("affect") or {}).get("emotion", "NEUTRAL")
|
|
|
|
| 325 |
latency=result.get("latency_log") or {},
|
| 326 |
guardrail_passed=result.get("guardrail_passed", True),
|
| 327 |
eval_scores=eval_scores,
|
| 328 |
+
turn_id=result["turn_id"],
|
| 329 |
+
)
|
| 330 |
+
|
| 331 |
+
|
| 332 |
+
@app.post("/chat/turnaround", response_model=ChatResponse)
|
| 333 |
+
def chat_turnaround(req: TurnaroundRequest):
|
| 334 |
+
if req.user_id not in _sessions:
|
| 335 |
+
raise HTTPException(status_code=404, detail="no active session")
|
| 336 |
+
|
| 337 |
+
session = _sessions[req.user_id]
|
| 338 |
+
last: PipelineState | None = session.get("last_state")
|
| 339 |
+
if last is None:
|
| 340 |
+
raise HTTPException(status_code=409, detail="no prior turn to rephrase")
|
| 341 |
+
|
| 342 |
+
if req.turn_id is not None and req.turn_id != last["turn_id"]:
|
| 343 |
+
raise HTTPException(status_code=409, detail="stale turn_id")
|
| 344 |
+
|
| 345 |
+
# feedback.run will re-append (partner, aac_user) for this turn, so strip
|
| 346 |
+
# both of those tail entries to avoid duplicating the partner line. The
|
| 347 |
+
# rejected aac_user text is also excluded from the re-plan context this way.
|
| 348 |
+
trimmed_history = list(last.get("session_history") or [])
|
| 349 |
+
if trimmed_history and trimmed_history[-1].get("role") == "aac_user":
|
| 350 |
+
trimmed_history.pop()
|
| 351 |
+
if trimmed_history and trimmed_history[-1].get("role") == "partner":
|
| 352 |
+
trimmed_history.pop()
|
| 353 |
+
|
| 354 |
+
intent_kind = classify_intent_kind(last.get("intent_route"))
|
| 355 |
+
|
| 356 |
+
gen_cfg = dict(last.get("generation_config") or {})
|
| 357 |
+
if intent_kind == "present_state":
|
| 358 |
+
gen_cfg["persona_mod"] = "present_state_retry"
|
| 359 |
+
gen_cfg["tone_tag"] = "[TONE:HONEST_UNCERTAIN]"
|
| 360 |
+
else:
|
| 361 |
+
gen_cfg["persona_mod"] = "reverse_stance"
|
| 362 |
+
gen_cfg.setdefault("tone_tag", "[TONE:CLARIFYING_REPHRASE]")
|
| 363 |
+
|
| 364 |
+
replan_state: PipelineState = dict(last) # type: ignore[assignment]
|
| 365 |
+
replan_state["session_history"] = trimmed_history
|
| 366 |
+
replan_state["generation_config"] = gen_cfg
|
| 367 |
+
replan_state["head_signal"] = req.head_signal or last.get("head_signal")
|
| 368 |
+
replan_state["turnaround_triggered"] = True
|
| 369 |
+
replan_state["latency_log"] = {
|
| 370 |
+
"t_sensing": 0.0,
|
| 371 |
+
"t_intent": 0.0,
|
| 372 |
+
"t_retrieval": 0.0,
|
| 373 |
+
"t_generation": 0.0,
|
| 374 |
+
"t_total": 0.0,
|
| 375 |
+
}
|
| 376 |
+
|
| 377 |
+
# For PERSONAL turnarounds, pull fresh chunks excluding the bucket and
|
| 378 |
+
# exact texts of the rejected response; same chunks would just produce
|
| 379 |
+
# the same wrong answer. _re_retrieve_excluding returns None when the
|
| 380 |
+
# fresh batch is no better than what we already had, in which case we
|
| 381 |
+
# keep the original chunks rather than degrade to lower-relevance ones.
|
| 382 |
+
if intent_kind == "memory":
|
| 383 |
+
fresh_chunks = _re_retrieve_excluding(
|
| 384 |
+
query=last["raw_query"],
|
| 385 |
+
user_id=last["user_id"],
|
| 386 |
+
rejected_chunks=last.get("retrieved_chunks") or [],
|
| 387 |
+
)
|
| 388 |
+
if fresh_chunks is not None:
|
| 389 |
+
replan_state["retrieved_chunks"] = fresh_chunks
|
| 390 |
+
replan_state["retrieval_mode_used"] = "turnaround_rebucket"
|
| 391 |
+
|
| 392 |
+
planner_update = planner_node.run_primary(replan_state)
|
| 393 |
+
replan_state.update(planner_update) # type: ignore[typeddict-item]
|
| 394 |
+
|
| 395 |
+
feedback_update = feedback_node.run(replan_state)
|
| 396 |
+
replan_state.update(feedback_update) # type: ignore[typeddict-item]
|
| 397 |
+
|
| 398 |
+
session["session_history"] = replan_state["session_history"]
|
| 399 |
+
session["bucket_priors"] = replan_state["bucket_priors"]
|
| 400 |
+
session["last_state"] = replan_state
|
| 401 |
+
|
| 402 |
+
affect_emotion = (replan_state.get("affect") or {}).get("emotion", "NEUTRAL")
|
| 403 |
+
eval_scores = compute_evals(
|
| 404 |
+
response=replan_state["selected_response"] or "",
|
| 405 |
+
chunks=replan_state.get("retrieved_chunks") or [],
|
| 406 |
+
latency_log=replan_state.get("latency_log") or {},
|
| 407 |
+
affect=affect_emotion,
|
| 408 |
+
gesture_tag=replan_state.get("gesture_tag"),
|
| 409 |
+
gaze_bucket=replan_state.get("gaze_bucket"),
|
| 410 |
+
slo_target=settings.slo_target_s,
|
| 411 |
+
)
|
| 412 |
+
|
| 413 |
+
return ChatResponse(
|
| 414 |
+
user_id=req.user_id,
|
| 415 |
+
query=replan_state["raw_query"],
|
| 416 |
+
response=replan_state["selected_response"] or "",
|
| 417 |
+
affect=affect_emotion,
|
| 418 |
+
llm_tier=replan_state.get("llm_tier_used", "unknown"),
|
| 419 |
+
llm_model=replan_state.get("llm_model_used", "unknown"),
|
| 420 |
+
retrieval_mode=replan_state.get("retrieval_mode_used", "unknown"),
|
| 421 |
+
latency=replan_state.get("latency_log") or {},
|
| 422 |
+
guardrail_passed=replan_state.get("guardrail_passed", True),
|
| 423 |
+
eval_scores=eval_scores,
|
| 424 |
+
turn_id=replan_state["turn_id"],
|
| 425 |
)
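
The history trimming at the top of `chat_turnaround` is easier to see on a concrete list. This is a standalone sketch: `trim_last_exchange` is a hypothetical name, and the `text` key on each history entry is an assumption (only the `role` key is visible in the diff).

```python
from __future__ import annotations


def trim_last_exchange(history: list[dict]) -> list[dict]:
    """Return a copy of history without the trailing (partner, aac_user) pair."""
    trimmed = list(history)
    # Drop the rejected reply first, then the partner line that prompted it,
    # because the feedback step appends both again for the rephrased turn.
    if trimmed and trimmed[-1].get("role") == "aac_user":
        trimmed.pop()
    if trimmed and trimmed[-1].get("role") == "partner":
        trimmed.pop()
    return trimmed


history = [
    {"role": "partner", "text": "What did you do yesterday?"},
    {"role": "aac_user", "text": "I went to the park."},
    {"role": "partner", "text": "How are you feeling?"},
    {"role": "aac_user", "text": "I feel great."},  # the rejected reply
]
trimmed = trim_last_exchange(history)
```

Working on a copy keeps the cached last state intact until the rephrased turn has been produced.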
|
|
@@ -20,6 +20,10 @@ class Settings(BaseSettings):
|
|
| 20 |
retrieval_top_k: int = 5
|
| 21 |
retrieval_rerank_k: int = 3
|
| 22 |
retrieval_fast_k: int = 2 # used when affect == FRUSTRATED
|
|
|
|
|
|
|
|
|
|
|
|
|
| 23 |
|
| 24 |
# LLM tiers: both hit Ollama Cloud via OpenAI-compatible endpoint.
|
| 25 |
# Same model on both tiers for now; swap one when a larger cloud model
|
|
|
|
| 20 |
retrieval_top_k: int = 5
|
| 21 |
retrieval_rerank_k: int = 3
|
| 22 |
retrieval_fast_k: int = 2 # used when affect == FRUSTRATED
|
| 23 |
+
# Minimum cosine score for a chunk to be used in turnaround re-retrieval.
|
| 24 |
+
# Below this, we'd rather fall back to original chunks than serve clearly
|
| 25 |
+
# off-topic memories just to "look different."
|
| 26 |
+
turnaround_min_score: float = 0.45
|
| 27 |
|
| 28 |
# LLM tiers: both hit Ollama Cloud via OpenAI-compatible endpoint.
|
| 29 |
# Same model on both tiers for now; swap one when a larger cloud model
|
|
@@ -0,0 +1,28 @@
|
|
|
| 1 |
+
"""Shared intent-kind classification, used by retrieval, planner, and the
|
| 2 |
+
turnaround endpoint to decide whether a turn is memory-grounded or about
|
| 3 |
+
the user's present state.
|
| 4 |
+
|
| 5 |
+
Centralized here so all three callers stay in lock-step. If you add a new
|
| 6 |
+
intent type that should bypass retrieval, this is the only file to update.
|
| 7 |
+
"""
|
| 8 |
+
|
| 9 |
+
|
| 10 |
+
def classify_intent_kind(route: dict | None) -> str:
|
| 11 |
+
"""Return "present_state" if every sub-intent is PRESENT_STATE; "memory" otherwise.
|
| 12 |
+
|
| 13 |
+
Mixed routes fall through to "memory"; the memory path is the safer default
|
| 14 |
+
because it still allows the model to use chunks if any are present.
|
| 15 |
+
"""
|
| 16 |
+
if not route:
|
| 17 |
+
return "memory"
|
| 18 |
+
sub_intents = route.get("sub_intents") or []
|
| 19 |
+
if not sub_intents:
|
| 20 |
+
return "memory"
|
| 21 |
+
if all(si.get("type") == "PRESENT_STATE" for si in sub_intents):
|
| 22 |
+
return "present_state"
|
| 23 |
+
return "memory"
|
| 24 |
+
|
| 25 |
+
|
| 26 |
+
def is_present_state_only(route: dict | None) -> bool:
|
| 27 |
+
"""Convenience wrapper for retrieval node β returns True iff route is purely PRESENT_STATE."""
|
| 28 |
+
return classify_intent_kind(route) == "present_state"
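
A few example routes make the all-or-nothing rule concrete. The function below is a condensed restatement of `classify_intent_kind` so the snippet runs on its own; the route dicts are made-up inputs that assume the `sub_intents` / `type` shape used in this file.

```python
from __future__ import annotations


def classify_intent_kind(route: dict | None) -> str:
    """Condensed copy: "present_state" only if every sub-intent is PRESENT_STATE."""
    sub_intents = (route or {}).get("sub_intents") or []
    if sub_intents and all(si.get("type") == "PRESENT_STATE" for si in sub_intents):
        return "present_state"
    return "memory"


pure = {"sub_intents": [{"type": "PRESENT_STATE"}, {"type": "PRESENT_STATE"}]}
mixed = {"sub_intents": [{"type": "PRESENT_STATE"}, {"type": "PERSONAL"}]}
empty = {"sub_intents": []}

kinds = [classify_intent_kind(r) for r in (pure, mixed, empty, None)]
```

Only the pure route bypasses retrieval; mixed, empty, and missing routes all take the memory path.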
|
|
@@ -43,6 +43,8 @@ def _log_to_jsonl(state: PipelineState, run_id: str) -> None:
|
|
| 43 |
"llm_tier": state.get("llm_tier_used", "unknown"),
|
| 44 |
"retrieval_mode": state.get("retrieval_mode_used", "unknown"),
|
| 45 |
"affect": affect,
|
|
|
|
|
|
|
| 46 |
"guardrail_passed": state.get("guardrail_passed", True),
|
| 47 |
"num_chunks": len(chunks),
|
| 48 |
"num_personal": sum(
|
|
|
|
| 43 |
"llm_tier": state.get("llm_tier_used", "unknown"),
|
| 44 |
"retrieval_mode": state.get("retrieval_mode_used", "unknown"),
|
| 45 |
"affect": affect,
|
| 46 |
+
"head_signal": state.get("head_signal"),
|
| 47 |
+
"turnaround_triggered": state.get("turnaround_triggered", False),
|
| 48 |
"guardrail_passed": state.get("guardrail_passed", True),
|
| 49 |
"num_chunks": len(chunks),
|
| 50 |
"num_personal": sum(
|
|
@@ -19,11 +19,20 @@ from backend.sensing.bucket_keywords import infer_bucket
|
|
| 19 |
|
| 20 |
_CLASS_EXEMPLARS: dict[str, list[str]] = {
|
| 21 |
"PERSONAL": [
|
| 22 |
-
"how are you today",
|
| 23 |
"what is your favourite food",
|
| 24 |
"tell me about your family",
|
| 25 |
-
"what do you do
|
| 26 |
-
"did you
|
|
|
| 27 |
],
|
| 28 |
"CONTEXTUAL": [
|
| 29 |
"what did you just say",
|
|
@@ -47,6 +56,10 @@ _CLASSIFIER_THRESHOLD = (
|
|
| 47 |
_CONTEXTUAL_MARGIN_MIN = (
|
| 48 |
0.08 # CONTEXTUAL must beat runner-up by at least this; it over-matches without it
|
| 49 |
)
|
|
|
|
|
|
|
|
|
|
|
|
|
| 50 |
_MIN_FRAGMENT_WORDS = 3
|
| 51 |
_MAX_FRAGMENTS = 4
|
| 52 |
|
|
@@ -150,6 +163,15 @@ def _classify(fragment: str) -> str:
|
|
| 150 |
if margin < _CONTEXTUAL_MARGIN_MIN or not has_discourse_marker:
|
| 151 |
return "PERSONAL"
|
| 152 |
|
|
|
| 153 |
return best_cls
|
| 154 |
|
| 155 |
|
|
@@ -193,11 +215,17 @@ def run(state: PipelineState) -> dict:
|
|
| 193 |
|
| 194 |
air_written = state.get("air_written_text")
|
| 195 |
if air_written:
|
|
|
|
|
|
|
|
|
|
|
|
|
| 196 |
sub_intents.append(
|
| 197 |
{
|
| 198 |
-
"type":
|
| 199 |
"query": air_written,
|
| 200 |
-
"bucket_hint": infer_bucket(air_written)
|
|
|
|
|
|
|
| 201 |
"priority": priority,
|
| 202 |
}
|
| 203 |
)
|
|
|
|
| 19 |
|
| 20 |
_CLASS_EXEMPLARS: dict[str, list[str]] = {
|
| 21 |
"PERSONAL": [
|
|
|
|
| 22 |
"what is your favourite food",
|
| 23 |
"tell me about your family",
|
| 24 |
+
"what do you do for work",
|
| 25 |
+
"did you grow up around here",
|
| 26 |
+
"what was your childhood like",
|
| 27 |
+
],
|
| 28 |
+
"PRESENT_STATE": [
|
| 29 |
+
"how are you feeling today",
|
| 30 |
+
"are you tired right now",
|
| 31 |
+
"what are you doing at the moment",
|
| 32 |
+
"did you sleep well last night",
|
| 33 |
+
"are you in pain today",
|
| 34 |
+
"how is your day going",
|
| 35 |
+
"are you having a good day",
|
| 36 |
],
|
| 37 |
"CONTEXTUAL": [
|
| 38 |
"what did you just say",
|
|
|
|
| 56 |
_CONTEXTUAL_MARGIN_MIN = (
|
| 57 |
0.08 # CONTEXTUAL must beat runner-up by at least this; it over-matches without it
|
| 58 |
)
|
| 59 |
+
_PRESENT_STATE_MARGIN_MIN = (
|
| 60 |
+
0.05 # PRESENT_STATE skips retrieval, so a narrow win against PERSONAL would silently
|
| 61 |
+
# drop persona memories. Require a clear margin before going down that path.
|
| 62 |
+
)
|
| 63 |
_MIN_FRAGMENT_WORDS = 3
|
| 64 |
_MAX_FRAGMENTS = 4
|
| 65 |
|
|
|
|
| 163 |
if margin < _CONTEXTUAL_MARGIN_MIN or not has_discourse_marker:
|
| 164 |
return "PERSONAL"
|
| 165 |
|
| 166 |
+
# PRESENT_STATE skips retrieval entirely, so a narrow win over PERSONAL
|
| 167 |
+
# would silently drop persona memories with no recovery path. Demote to
|
| 168 |
+
# PERSONAL if the win isn't decisive; better to over-retrieve than to
|
| 169 |
+
# answer a personal question with no chunks.
|
| 170 |
+
if best_cls == "PRESENT_STATE":
|
| 171 |
+
margin = best_score - runner_up_score
|
| 172 |
+
if margin < _PRESENT_STATE_MARGIN_MIN:
|
| 173 |
+
return "PERSONAL"
|
| 174 |
+
|
| 175 |
return best_cls
|
| 176 |
|
| 177 |
|
|
|
|
| 215 |
|
| 216 |
air_written = state.get("air_written_text")
|
| 217 |
if air_written:
|
| 218 |
+
# Classify the air-written supplement the same way as a normal fragment
|
| 219 |
+
# so a present-tense supplement ("tired") on a present-state question
|
| 220 |
+
# doesn't silently flip the route to PERSONAL and re-enable retrieval.
|
| 221 |
+
air_cls = _classify(air_written)
|
| 222 |
sub_intents.append(
|
| 223 |
{
|
| 224 |
+
"type": air_cls,
|
| 225 |
"query": air_written,
|
| 226 |
+
"bucket_hint": infer_bucket(air_written)
|
| 227 |
+
if air_cls == "PERSONAL"
|
| 228 |
+
else None,
|
| 229 |
"priority": priority,
|
| 230 |
}
|
| 231 |
)
|
|
@@ -6,6 +6,7 @@ import time
|
|
| 6 |
from backend.config.settings import settings
|
| 7 |
from backend.generation.llm_client import active_model, chat_complete
|
| 8 |
from backend.guardrails.checks import check_output
|
|
|
|
| 9 |
from backend.pipeline.state import PipelineState
|
| 10 |
from backend.sensing.labels import GESTURE_TO_TAG
|
| 11 |
|
|
@@ -53,6 +54,11 @@ def _run(state: PipelineState, tier: str) -> dict:
|
|
| 53 |
)
|
| 54 |
gesture_tag = state.get("gesture_tag")
|
| 55 |
air_written_text = state.get("air_written_text")
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 56 |
messages = _build_messages(
|
| 57 |
profile,
|
| 58 |
chunks,
|
|
@@ -62,6 +68,9 @@ def _run(state: PipelineState, tier: str) -> dict:
|
|
| 62 |
gen_cfg,
|
| 63 |
gesture_tag=gesture_tag,
|
| 64 |
air_written_text=air_written_text,
|
|
|
|
|
|
|
|
|
|
| 65 |
)
|
| 66 |
|
| 67 |
selected = chat_complete(
|
|
@@ -102,6 +111,14 @@ def _resolve_tone_tag(user_id: str, affect: str, default_tag: str) -> str:
|
|
| 102 |
return _PERSONA_TONE_OVERRIDES.get(user_id, {}).get(affect, default_tag)
|
| 103 |
|
| 104 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 105 |
def _build_messages(
|
| 106 |
profile: dict,
|
| 107 |
chunks: list[dict],
|
|
@@ -111,6 +128,9 @@ def _build_messages(
|
|
| 111 |
gen_cfg: dict,
|
| 112 |
gesture_tag: str | None = None,
|
| 113 |
air_written_text: str | None = None,
|
|
|
|
|
|
|
|
|
|
| 114 |
) -> list[dict]:
|
| 115 |
# Split into a stable system message (same per persona, so it gets cached by the
|
| 116 |
# provider) and a turn-specific user message. Anything that changes per
|
|
@@ -126,6 +146,9 @@ def _build_messages(
|
|
| 126 |
gesture_tag,
|
| 127 |
air_written_text,
|
| 128 |
profile["name"],
|
|
|
|
|
|
|
|
|
|
| 129 |
)
|
| 130 |
return [
|
| 131 |
{"role": "system", "content": system_content},
|
|
@@ -178,6 +201,24 @@ _PERSONA_MOD_INSTRUCTIONS = {
|
|
| 178 |
"suppress_humor": "Be direct and supportive. Suppress humor.",
|
| 179 |
"baseline": "Use your natural communication style.",
|
| 180 |
"add_confirmation": "Add a clarifying question or confirmation at the end.",
|
|
|
| 181 |
}
|
| 182 |
|
| 183 |
|
|
@@ -190,6 +231,10 @@ def _build_user(
|
|
| 190 |
gesture_tag: str | None,
|
| 191 |
air_written_text: str | None,
|
| 192 |
persona_name: str,
|
|
|
|
|
|
|
|
|
|
|
|
|
| 193 |
) -> str:
|
| 194 |
personal_chunks = [c for c in chunks if c.get("source", "personal") == "personal"]
|
| 195 |
contextual_chunks = [c for c in chunks if c.get("source") == "contextual"]
|
|
@@ -231,8 +276,38 @@ def _build_user(
|
|
| 231 |
_PERSONA_MOD_INSTRUCTIONS["baseline"],
|
| 232 |
)
|
| 233 |
|
|
|
| 234 |
return f"""\
|
| 235 |
-
{tone_tag}{gesture_line}{air_writing_line}
|
| 236 |
{persona_instruction}
|
| 237 |
|
| 238 |
Personal memories:
|
|
|
|
| 6 |
from backend.config.settings import settings
|
| 7 |
from backend.generation.llm_client import active_model, chat_complete
|
| 8 |
from backend.guardrails.checks import check_output
|
| 9 |
+
from backend.pipeline.intent_kind import classify_intent_kind
|
| 10 |
from backend.pipeline.state import PipelineState
|
| 11 |
from backend.sensing.labels import GESTURE_TO_TAG
|
| 12 |
|
|
|
|
| 54 |
)
|
| 55 |
gesture_tag = state.get("gesture_tag")
|
| 56 |
air_written_text = state.get("air_written_text")
|
| 57 |
+
turnaround_triggered = state.get("turnaround_triggered", False)
|
| 58 |
+
rejected_response: str | None = None
|
| 59 |
+
if turnaround_triggered:
|
| 60 |
+
rejected_response = state.get("selected_response")
|
| 61 |
+
intent_kind = classify_intent_kind(state.get("intent_route"))
|
| 62 |
messages = _build_messages(
|
| 63 |
profile,
|
| 64 |
chunks,
|
|
|
|
| 68 |
gen_cfg,
|
| 69 |
gesture_tag=gesture_tag,
|
| 70 |
air_written_text=air_written_text,
|
| 71 |
+
rejected_response=rejected_response,
|
| 72 |
+
intent_kind=intent_kind,
|
| 73 |
+
affect=affect,
|
| 74 |
)
|
| 75 |
|
| 76 |
selected = chat_complete(
|
|
|
|
| 111 |
return _PERSONA_TONE_OVERRIDES.get(user_id, {}).get(affect, default_tag)
|
| 112 |
|
| 113 |
|
| 114 |
+
_AFFECT_HINTS = {
|
| 115 |
+
"HAPPY": "You currently feel positive - light, content, energetic.",
|
| 116 |
+
"FRUSTRATED": "You currently feel frustrated - tired, irritable, or off.",
|
| 117 |
+
"SURPRISED": "You currently feel surprised or caught off-guard.",
|
| 118 |
+
"NEUTRAL": "Your current state is unclear from the affect signal.",
|
| 119 |
+
}
|
| 120 |
+
|
| 121 |
+
|
| 122 |
def _build_messages(
|
| 123 |
profile: dict,
|
| 124 |
chunks: list[dict],
|
|
|
|
| 128 |
gen_cfg: dict,
|
| 129 |
gesture_tag: str | None = None,
|
| 130 |
air_written_text: str | None = None,
|
| 131 |
+
rejected_response: str | None = None,
|
| 132 |
+
intent_kind: str = "memory",
|
| 133 |
+
affect: str = "NEUTRAL",
|
| 134 |
) -> list[dict]:
|
| 135 |
# Split into a stable system message (same per persona - gets cached by the
|
| 136 |
# provider) and a turn-specific user message. Anything that changes per
|
|
|
|
| 146 |
gesture_tag,
|
| 147 |
air_written_text,
|
| 148 |
profile["name"],
|
| 149 |
+
rejected_response=rejected_response,
|
| 150 |
+
intent_kind=intent_kind,
|
| 151 |
+
affect=affect,
|
| 152 |
)
|
| 153 |
return [
|
| 154 |
{"role": "system", "content": system_content},
|
|
|
|
| 201 |
"suppress_humor": "Be direct and supportive. Suppress humor.",
|
| 202 |
"baseline": "Use your natural communication style.",
|
| 203 |
"add_confirmation": "Add a clarifying question or confirmation at the end.",
|
| 204 |
+
"turnaround": (
|
| 205 |
+
"Your previous reply missed what you actually meant. Rephrase "
|
| 206 |
+
"more directly - change the wording meaningfully, not just "
|
| 207 |
+
"surface tweaks - and end with a one-sentence clarifying "
|
| 208 |
+
"question to confirm you're on the right track."
|
| 209 |
+
),
|
| 210 |
+
"reverse_stance": (
|
| 211 |
+
"Your previous reply was substantively wrong - not poorly worded, "
|
| 212 |
+
"but the wrong content. Take a meaningfully different stance using "
|
| 213 |
+
"the available memories or, if none fit, honestly say you don't "
|
| 214 |
+
"know. Do NOT just reword the previous reply."
|
| 215 |
+
),
|
| 216 |
+
"present_state_retry": (
|
| 217 |
+
"Your previous reply was wrong about your current state. The "
|
| 218 |
+
"affect signal probably misled you. Either flip the emotional "
|
| 219 |
+
"read (if you said 'good', try 'not great') or honestly admit "
|
| 220 |
+
"you're not sure how you feel right now. Do NOT invent details."
|
| 221 |
+
),
|
| 222 |
}
|
| 223 |
|
| 224 |
|
|
|
|
| 231 |
gesture_tag: str | None,
|
| 232 |
air_written_text: str | None,
|
| 233 |
persona_name: str,
|
| 234 |
+
*,
|
| 235 |
+
rejected_response: str | None = None,
|
| 236 |
+
intent_kind: str = "memory",
|
| 237 |
+
affect: str = "NEUTRAL",
|
| 238 |
) -> str:
|
| 239 |
personal_chunks = [c for c in chunks if c.get("source", "personal") == "personal"]
|
| 240 |
contextual_chunks = [c for c in chunks if c.get("source") == "contextual"]
|
|
|
|
| 276 |
_PERSONA_MOD_INSTRUCTIONS["baseline"],
|
| 277 |
)
|
| 278 |
|
| 279 |
+
turnaround_line = ""
|
| 280 |
+
if rejected_response:
|
| 281 |
+
safe_rejected = rejected_response.replace('"', "'").replace("\n", " ")[:300]
|
| 282 |
+
turnaround_line = (
|
| 283 |
+
f"\nYour previous reply (which you need to replace, not repeat): "
|
| 284 |
+
f'"{safe_rejected}"'
|
| 285 |
+
)
|
| 286 |
+
|
| 287 |
+
if intent_kind == "present_state":
|
| 288 |
+
affect_hint = _AFFECT_HINTS.get(affect, _AFFECT_HINTS["NEUTRAL"])
|
| 289 |
+
return f"""\
|
| 290 |
+
{tone_tag}{gesture_line}{air_writing_line}{turnaround_line}
|
| 291 |
+
{persona_instruction}
|
| 292 |
+
|
| 293 |
+
The partner is asking about your present state (right now, today).
|
| 294 |
+
Your autobiographical memories do NOT contain this - do not fabricate details from them.
|
| 295 |
+
|
| 296 |
+
Current affect read: {affect}
|
| 297 |
+
{affect_hint}
|
| 298 |
+
|
| 299 |
+
Recent conversation:
|
| 300 |
+
{history_block}
|
| 301 |
+
|
| 302 |
+
Partner just said: {query}
|
| 303 |
+
|
| 304 |
+
Reply as {persona_name} in 1-2 sentences, first person.
|
| 305 |
+
- Ground the answer in the affect read above and recent conversation only.
|
| 306 |
+
- If the affect read is NEUTRAL or doesn't match what you'd say, it's better to say "I'm not sure" or "honestly, I don't really know right now" than to invent.
|
| 307 |
+
- Do NOT use autobiographical facts (job, family, hobbies) unless the partner asked."""
|
| 308 |
+
|
| 309 |
return f"""\
|
| 310 |
+
{tone_tag}{gesture_line}{air_writing_line}{turnaround_line}
|
| 311 |
{persona_instruction}
|
| 312 |
|
| 313 |
Personal memories:
|
|
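Reviewer note on the planner hunk above: the turnaround line is built by sanitising the rejected reply before quoting it back into the prompt, and the PRESENT_STATE branch falls back to the NEUTRAL hint for any unknown affect label. The sketch below isolates those two steps so they can be unit-tested outside the prompt builder. The names `build_turnaround_line`, `pick_affect_hint` and the two-entry `AFFECT_HINTS` table are hypothetical stand-ins for illustration, not the module's actual API.

```python
from typing import Optional

# Hypothetical, shortened stand-in for the planner's _AFFECT_HINTS table.
AFFECT_HINTS = {
    "HAPPY": "You currently feel positive.",
    "NEUTRAL": "Your current state is unclear from the affect signal.",
}


def build_turnaround_line(rejected_response: Optional[str], max_chars: int = 300) -> str:
    """Quote the rejected reply back into the prompt, or return "" if there is none."""
    if not rejected_response:
        return ""
    # Same sanitising steps as the diff: swap double quotes for single quotes,
    # flatten newlines to spaces, then truncate to max_chars.
    safe = rejected_response.replace('"', "'").replace("\n", " ")[:max_chars]
    return f'\nYour previous reply (which you need to replace, not repeat): "{safe}"'


def pick_affect_hint(affect: str) -> str:
    """Unknown affect labels fall back to the NEUTRAL hint."""
    return AFFECT_HINTS.get(affect, AFFECT_HINTS["NEUTRAL"])
```

Truncating after the quote replacement means the 300-character cap applies to the sanitised text, so a long rejected reply cannot blow up the per-turn user message.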
@@ -4,6 +4,7 @@ from __future__ import annotations
|
|
| 4 |
import time
|
| 5 |
|
| 6 |
from backend.config.settings import settings
|
|
|
|
| 7 |
from backend.pipeline.state import PipelineState, RetrievedChunk, SubIntent
|
| 8 |
from backend.retrieval.contextual import retrieve_from_history
|
| 9 |
from backend.retrieval.vector_store import retrieve
|
|
@@ -16,6 +17,8 @@ _OPEN_DOMAIN_STUB_TEXT = (
|
|
| 16 |
def run_fast(state: PipelineState) -> dict:
|
| 17 |
"""Fast retrieval path for FRUSTRATED affect (k=2, no reranker)."""
|
| 18 |
t0 = time.perf_counter()
|
| 19 |
chunks = _dispatch_all(state, per_intent_k=settings.retrieval_fast_k)
|
| 20 |
return _build_return(state, chunks, "fast", t0)
|
| 21 |
|
|
@@ -23,6 +26,8 @@ def run_fast(state: PipelineState) -> dict:
|
|
| 23 |
def run_full(state: PipelineState) -> dict:
|
| 24 |
"""Full retrieval path: top_k cosine matches narrowed to rerank_k."""
|
| 25 |
t0 = time.perf_counter()
|
| 26 |
chunks = _dispatch_all(state, per_intent_k=settings.retrieval_rerank_k)
|
| 27 |
return _build_return(state, chunks, "full", t0)
|
| 28 |
|
|
@@ -50,6 +55,12 @@ def _dispatch_all(state: PipelineState, per_intent_k: int) -> list[RetrievedChun
|
|
| 50 |
merged.extend(_retrieve_contextual(sub, state, per_intent_k))
|
| 51 |
elif kind == "OPEN_DOMAIN":
|
| 52 |
merged.extend(_retrieve_open_domain(sub))
|
| 53 |
else:
|
| 54 |
merged.extend(_retrieve_personal(sub, state, per_intent_k))
|
| 55 |
|
|
|
|
| 4 |
import time
|
| 5 |
|
| 6 |
from backend.config.settings import settings
|
| 7 |
+
from backend.pipeline.intent_kind import is_present_state_only
|
| 8 |
from backend.pipeline.state import PipelineState, RetrievedChunk, SubIntent
|
| 9 |
from backend.retrieval.contextual import retrieve_from_history
|
| 10 |
from backend.retrieval.vector_store import retrieve
|
|
|
|
| 17 |
def run_fast(state: PipelineState) -> dict:
|
| 18 |
"""Fast retrieval path for FRUSTRATED affect (k=2, no reranker)."""
|
| 19 |
t0 = time.perf_counter()
|
| 20 |
+
if is_present_state_only(state.get("intent_route")):
|
| 21 |
+
return _build_return(state, [], "skipped_present_state", t0)
|
| 22 |
chunks = _dispatch_all(state, per_intent_k=settings.retrieval_fast_k)
|
| 23 |
return _build_return(state, chunks, "fast", t0)
|
| 24 |
|
|
|
|
| 26 |
def run_full(state: PipelineState) -> dict:
|
| 27 |
"""Full retrieval path: top_k cosine matches narrowed to rerank_k."""
|
| 28 |
t0 = time.perf_counter()
|
| 29 |
+
if is_present_state_only(state.get("intent_route")):
|
| 30 |
+
return _build_return(state, [], "skipped_present_state", t0)
|
| 31 |
chunks = _dispatch_all(state, per_intent_k=settings.retrieval_rerank_k)
|
| 32 |
return _build_return(state, chunks, "full", t0)
|
| 33 |
|
|
|
|
| 55 |
merged.extend(_retrieve_contextual(sub, state, per_intent_k))
|
| 56 |
elif kind == "OPEN_DOMAIN":
|
| 57 |
merged.extend(_retrieve_open_domain(sub))
|
| 58 |
+
elif kind == "PRESENT_STATE":
|
| 59 |
+
# PRESENT_STATE is grounded in the affect signal, not memory.
|
| 60 |
+
# In a pure-present-state route the run_fast/run_full early skip
|
| 61 |
+
# already short-circuits us; in a mixed route we just contribute
|
| 62 |
+
# nothing here so the planner doesn't see misleading chunks.
|
| 63 |
+
continue
|
| 64 |
else:
|
| 65 |
merged.extend(_retrieve_personal(sub, state, per_intent_k))
|
| 66 |
|
|
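Reviewer note on the retrieval hunk above: `backend/pipeline/intent_kind.py` is not shown in this part of the diff, so the following is only a guess at what `is_present_state_only` and `classify_intent_kind` might look like. It assumes `intent_route` is a list of SubIntent-style dicts with a `"type"` key; the two return values `"present_state"` and `"memory"` are taken from how the planner uses `intent_kind`.

```python
from typing import List, Optional


def is_present_state_only(intent_route: Optional[List[dict]]) -> bool:
    """True only when the route is non-empty and every sub-intent is PRESENT_STATE."""
    if not intent_route:
        return False
    return all(sub.get("type") == "PRESENT_STATE" for sub in intent_route)


def classify_intent_kind(intent_route: Optional[List[dict]]) -> str:
    """Collapse a route to the coarse kind the planner branches on."""
    return "present_state" if is_present_state_only(intent_route) else "memory"
```

Under this reading, a mixed route (say PERSONAL plus PRESENT_STATE) is classified as `"memory"`, which matches the comment in `_dispatch_all`: retrieval still runs, and the PRESENT_STATE sub-intent simply contributes no chunks.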
@@ -31,7 +31,7 @@ class RetrievedChunk(TypedDict):
|
|
| 31 |
|
| 32 |
|
| 33 |
class SubIntent(TypedDict):
|
| 34 |
-
type: str # "PERSONAL" | "CONTEXTUAL" | "OPEN_DOMAIN"
|
| 35 |
query: str
|
| 36 |
bucket_hint: str | None
|
| 37 |
priority: str # "fast" | "normal"
|
|
@@ -47,9 +47,10 @@ class GenerationConfig(TypedDict):
|
|
| 47 |
max_tokens: int
|
| 48 |
tone_tag: str # e.g. "[TONE:WITTY_SARCASTIC]"
|
| 49 |
retrieval_mode: str # "fast" | "full"
|
| 50 |
-
persona_mod:
|
| 51 |
-
|
| 52 |
-
|
|
|
|
| 53 |
|
| 54 |
|
| 55 |
class LatencyLog(TypedDict):
|
|
@@ -75,6 +76,8 @@ class PipelineState(TypedDict):
|
|
| 75 |
gesture_tag: str | None # e.g. "THUMBS_UP"
|
| 76 |
gaze_bucket: str | None # bucket hinted by gaze fixation
|
| 77 |
air_written_text: str | None # concatenated air-written chars
|
| 78 |
|
| 79 |
# ── L2: Intent decomposition outputs ─────────────────────────────────────
|
| 80 |
raw_query: str # partner's typed/spoken query
|
|
|
|
| 31 |
|
| 32 |
|
| 33 |
class SubIntent(TypedDict):
|
| 34 |
+
type: str # "PERSONAL" | "CONTEXTUAL" | "PRESENT_STATE" | "OPEN_DOMAIN"
|
| 35 |
query: str
|
| 36 |
bucket_hint: str | None
|
| 37 |
priority: str # "fast" | "normal"
|
|
|
|
| 47 |
max_tokens: int
|
| 48 |
tone_tag: str # e.g. "[TONE:WITTY_SARCASTIC]"
|
| 49 |
retrieval_mode: str # "fast" | "full"
|
| 50 |
+
persona_mod: str
|
| 51 |
+
# persona_mod values:
|
| 52 |
+
# "amplify_quirks" | "suppress_humor" | "baseline"
|
| 53 |
+
# | "add_confirmation" | "turnaround"
|
| 54 |
|
| 55 |
|
| 56 |
class LatencyLog(TypedDict):
|
|
|
|
| 76 |
gesture_tag: str | None # e.g. "THUMBS_UP"
|
| 77 |
gaze_bucket: str | None # bucket hinted by gaze fixation
|
| 78 |
air_written_text: str | None # concatenated air-written chars
|
| 79 |
+
head_signal: str | None # "HEAD_SHAKE" | "HEAD_NOD_DISSATISFIED"
|
| 80 |
+
turnaround_triggered: bool # true when re-planned from dissatisfaction signal
|
| 81 |
|
| 82 |
# ── L2: Intent decomposition outputs ─────────────────────────────────────
|
| 83 |
raw_query: str # partner's typed/spoken query
|
|
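Reviewer note on the state hunk above: the new `persona_mod` comment lists five values, while the planner's instruction table in this same commit also defines `"reverse_stance"` and `"present_state_retry"`. If all seven are meant to be valid, `Literal` types would let a type checker catch typos that a plain `str` cannot. The sketch below is one possible shape; `HeadSignal`, `PersonaMod`, `TurnaroundFields` and `is_valid_persona_mod` are hypothetical names, and the seven-value set is an assumption drawn from the two hunks, not confirmed by the author.

```python
from typing import Literal, Optional, TypedDict, get_args

# Values copied from the head_signal comment in PipelineState.
HeadSignal = Literal["HEAD_SHAKE", "HEAD_NOD_DISSATISFIED"]

# Assumed union of the state.py comment and the planner's instruction keys.
PersonaMod = Literal[
    "amplify_quirks",
    "suppress_humor",
    "baseline",
    "add_confirmation",
    "turnaround",
    "reverse_stance",
    "present_state_retry",
]


class TurnaroundFields(TypedDict):
    """Just the two fields this commit adds to PipelineState."""
    head_signal: Optional[HeadSignal]
    turnaround_triggered: bool


def is_valid_persona_mod(value: str) -> bool:
    """Runtime check against the Literal's allowed values."""
    return value in get_args(PersonaMod)
```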
@@ -343,6 +343,75 @@ input[type="text"]:hover {
|
|
| 343 |
font-size: 13px;
|
| 344 |
}
|
| 345 |
|
| 346 |
.eval-panel {
|
| 347 |
margin-top: 10px;
|
| 348 |
border-top: 1px solid var(--border);
|
|
|
|
| 343 |
font-size: 13px;
|
| 344 |
}
|
| 345 |
|
| 346 |
+
.sensing-debug {
|
| 347 |
+
font-family: var(--mono, monospace);
|
| 348 |
+
font-size: 11px;
|
| 349 |
+
opacity: 0.7;
|
| 350 |
+
}
|
| 351 |
+
|
| 352 |
+
.calibrate-btn {
|
| 353 |
+
margin-top: 8px;
|
| 354 |
+
width: 100%;
|
| 355 |
+
padding: 8px 12px;
|
| 356 |
+
font-size: 13px;
|
| 357 |
+
font-family: var(--sans);
|
| 358 |
+
background: transparent;
|
| 359 |
+
color: var(--accent);
|
| 360 |
+
border: 1px solid var(--accent);
|
| 361 |
+
border-radius: 8px;
|
| 362 |
+
cursor: pointer;
|
| 363 |
+
}
|
| 364 |
+
|
| 365 |
+
.calibrate-btn:disabled {
|
| 366 |
+
opacity: 0.45;
|
| 367 |
+
cursor: not-allowed;
|
| 368 |
+
}
|
| 369 |
+
|
| 370 |
+
.calibrate-btn:hover:not(:disabled) {
|
| 371 |
+
background: var(--accent);
|
| 372 |
+
color: #ffffff;
|
| 373 |
+
}
|
| 374 |
+
|
| 375 |
+
.chat-bubble.rephrased p {
|
| 376 |
+
opacity: 0.6;
|
| 377 |
+
text-decoration: line-through;
|
| 378 |
+
}
|
| 379 |
+
|
| 380 |
+
.chat-bubble.turnaround {
|
| 381 |
+
border-left: 3px solid var(--accent);
|
| 382 |
+
}
|
| 383 |
+
|
| 384 |
+
.badge {
|
| 385 |
+
display: inline-block;
|
| 386 |
+
margin-left: 6px;
|
| 387 |
+
padding: 1px 6px;
|
| 388 |
+
font-size: 11px;
|
| 389 |
+
font-weight: 500;
|
| 390 |
+
border-radius: 10px;
|
| 391 |
+
vertical-align: middle;
|
| 392 |
+
}
|
| 393 |
+
|
| 394 |
+
.badge-rephrased {
|
| 395 |
+
background: rgba(0, 0, 0, 0.08);
|
| 396 |
+
color: var(--fail);
|
| 397 |
+
}
|
| 398 |
+
|
| 399 |
+
.badge-turnaround {
|
| 400 |
+
background: var(--accent);
|
| 401 |
+
color: #ffffff;
|
| 402 |
+
}
|
| 403 |
+
|
| 404 |
+
.turnaround-btn {
|
| 405 |
+
background: transparent !important;
|
| 406 |
+
color: var(--accent) !important;
|
| 407 |
+
border: 1px solid var(--accent) !important;
|
| 408 |
+
}
|
| 409 |
+
|
| 410 |
+
.turnaround-btn:hover:not(:disabled) {
|
| 411 |
+
background: var(--accent) !important;
|
| 412 |
+
color: #ffffff !important;
|
| 413 |
+
}
|
| 414 |
+
|
| 415 |
.eval-panel {
|
| 416 |
margin-top: 10px;
|
| 417 |
border-top: 1px solid var(--border);
|
|
@@ -32,8 +32,17 @@ function App() {
|
|
| 32 |
return () => clearInterval(healthPoll.current);
|
| 33 |
}, []);
|
| 34 |
|
| 35 |
-
const {
|
| 36 |
-
|
| 37 |
|
| 38 |
const onFrame = useCallback(
|
| 39 |
(video: HTMLVideoElement, timestamp: number) => {
|
|
@@ -89,6 +98,16 @@ function App() {
|
|
| 89 |
</label>
|
| 90 |
<WebcamSensing videoRef={videoRef} active={active} error={error || initError} />
|
| 91 |
<SensingStatus sensing={sensing} webcamActive={active} />
|
| 92 |
</div>
|
| 93 |
|
| 94 |
<div className="sidebar-section">
|
|
@@ -120,6 +139,7 @@ function App() {
|
|
| 120 |
sensing={sensing}
|
| 121 |
affectOverride={affectOverride}
|
| 122 |
onAirTextConsumed={clearAirWrittenText}
|
|
|
|
| 123 |
messages={messages}
|
| 124 |
setMessages={setMessages}
|
| 125 |
onLatency={setLatency}
|
|
|
|
| 32 |
return () => clearInterval(healthPoll.current);
|
| 33 |
}, []);
|
| 34 |
|
| 35 |
+
const {
|
| 36 |
+
sensing,
|
| 37 |
+
ready,
|
| 38 |
+
initError,
|
| 39 |
+
init,
|
| 40 |
+
processFrame,
|
| 41 |
+
clearAirWrittenText,
|
| 42 |
+
clearHeadSignal,
|
| 43 |
+
calibrateHeadPose,
|
| 44 |
+
resetCalibration,
|
| 45 |
+
} = useSensing();
|
| 46 |
|
| 47 |
const onFrame = useCallback(
|
| 48 |
(video: HTMLVideoElement, timestamp: number) => {
|
|
|
|
| 98 |
</label>
|
| 99 |
<WebcamSensing videoRef={videoRef} active={active} error={error || initError} />
|
| 100 |
<SensingStatus sensing={sensing} webcamActive={active} />
|
| 101 |
+
<button
|
| 102 |
+
type="button"
|
| 103 |
+
className="calibrate-btn"
|
| 104 |
+
disabled={!active}
|
| 105 |
+
onClick={() => calibrateHeadPose()}
|
| 106 |
+
>
|
| 107 |
+
{sensing.headCalibrated
|
| 108 |
+
? "Re-calibrate head pose"
|
| 109 |
+
: "Calibrate head pose"}
|
| 110 |
+
</button>
|
| 111 |
</div>
|
| 112 |
|
| 113 |
<div className="sidebar-section">
|
|
|
|
| 139 |
sensing={sensing}
|
| 140 |
affectOverride={affectOverride}
|
| 141 |
onAirTextConsumed={clearAirWrittenText}
|
| 142 |
+
onHeadSignalConsumed={clearHeadSignal}
|
| 143 |
messages={messages}
|
| 144 |
setMessages={setMessages}
|
| 145 |
onLatency={setLatency}
|
|
@@ -1,6 +1,6 @@
|
|
| 1 |
-
import { useState, useRef, useEffect } from "react";
|
| 2 |
import type { ChatMessage, SensingState, Affect, LatencyLog } from "../types";
|
| 3 |
-
import { sendChat } from "../lib/api";
|
| 4 |
import { EvalPanel } from "./EvalPanel";
|
| 5 |
|
| 6 |
interface Props {
|
|
@@ -9,18 +9,22 @@ interface Props {
|
|
| 9 |
sensing: SensingState;
|
| 10 |
affectOverride: Affect | null;
|
| 11 |
onAirTextConsumed: () => void;
|
|
|
|
| 12 |
messages: ChatMessage[];
|
| 13 |
setMessages: React.Dispatch<React.SetStateAction<ChatMessage[]>>;
|
| 14 |
onLatency: (latency: LatencyLog) => void;
|
| 15 |
backendReady: boolean;
|
| 16 |
}
|
| 17 |
|
|
|
|
|
|
|
| 18 |
export function ChatPanel({
|
| 19 |
userId,
|
| 20 |
personaName,
|
| 21 |
sensing,
|
| 22 |
affectOverride,
|
| 23 |
onAirTextConsumed,
|
|
|
|
| 24 |
messages,
|
| 25 |
setMessages,
|
| 26 |
onLatency,
|
|
@@ -28,12 +32,120 @@ export function ChatPanel({
|
|
| 28 |
}: Props) {
|
| 29 |
const [input, setInput] = useState("");
|
| 30 |
const [loading, setLoading] = useState(false);
|
|
|
|
| 31 |
const bottomRef = useRef<HTMLDivElement>(null);
|
| 32 |
|
| 33 |
useEffect(() => {
|
| 34 |
bottomRef.current?.scrollIntoView({ behavior: "smooth" });
|
| 35 |
}, [messages]);
|
| 36 |
|
| 37 |
async function handleSend() {
|
| 38 |
if (!input.trim() || !userId || !backendReady || loading) return;
|
| 39 |
|
|
@@ -51,8 +163,10 @@ export function ChatPanel({
|
|
| 51 |
gesture_tag: sensing.gestureTag,
|
| 52 |
gaze_bucket: sensing.gazeBucket,
|
| 53 |
air_written_text: airText,
|
|
|
|
| 54 |
});
|
| 55 |
|
|
|
|
| 56 |
setMessages((prev) => [
|
| 57 |
...prev,
|
| 58 |
{
|
|
@@ -61,9 +175,11 @@ export function ChatPanel({
|
|
| 61 |
latency: res.latency,
|
| 62 |
affect: res.affect,
|
| 63 |
evalScores: res.eval_scores,
|
|
|
|
| 64 |
},
|
| 65 |
]);
|
| 66 |
onLatency(res.latency);
|
|
|
|
| 67 |
} catch (e) {
|
| 68 |
setMessages((prev) => [
|
| 69 |
...prev,
|
|
@@ -78,6 +194,13 @@ export function ChatPanel({
|
|
| 78 |
}
|
| 79 |
}
|
| 80 |
|
| 81 |
return (
|
| 82 |
<div className="chat-panel">
|
| 83 |
<div className="chat-header">
|
|
@@ -85,9 +208,20 @@ export function ChatPanel({
|
|
| 85 |
</div>
|
| 86 |
<div className="chat-messages">
|
| 87 |
{messages.map((msg, i) => (
|
| 88 |
-
<div
|
| 89 |
<span className="chat-role">
|
| 90 |
{msg.role === "partner" ? "Partner" : "AAC User"}
|
| 91 |
</span>
|
| 92 |
<p>{msg.content}</p>
|
| 93 |
{msg.role === "aac_user" && msg.evalScores && (
|
|
@@ -101,6 +235,12 @@ export function ChatPanel({
|
|
| 101 |
<p>Generating...</p>
|
| 102 |
</div>
|
| 103 |
)}
|
| 104 |
<div ref={bottomRef} />
|
| 105 |
</div>
|
| 106 |
<div className="chat-input-row">
|
|
@@ -115,6 +255,15 @@ export function ChatPanel({
|
|
| 115 |
<button onClick={handleSend} disabled={!userId || loading || !backendReady || !input.trim()}>
|
| 116 |
Send
|
| 117 |
</button>
|
| 118 |
</div>
|
| 119 |
</div>
|
| 120 |
);
|
|
|
|
| 1 |
+
import { useState, useRef, useEffect, useCallback } from "react";
|
| 2 |
import type { ChatMessage, SensingState, Affect, LatencyLog } from "../types";
|
| 3 |
+
import { sendChat, sendTurnaround } from "../lib/api";
|
| 4 |
import { EvalPanel } from "./EvalPanel";
|
| 5 |
|
| 6 |
interface Props {
|
|
|
|
| 9 |
sensing: SensingState;
|
| 10 |
affectOverride: Affect | null;
|
| 11 |
onAirTextConsumed: () => void;
|
| 12 |
+
onHeadSignalConsumed: () => void;
|
| 13 |
messages: ChatMessage[];
|
| 14 |
setMessages: React.Dispatch<React.SetStateAction<ChatMessage[]>>;
|
| 15 |
onLatency: (latency: LatencyLog) => void;
|
| 16 |
backendReady: boolean;
|
| 17 |
}
|
| 18 |
|
| 19 |
+
const TURNAROUND_WINDOW_MS = 5000;
|
| 20 |
+
|
| 21 |
export function ChatPanel({
|
| 22 |
userId,
|
| 23 |
personaName,
|
| 24 |
sensing,
|
| 25 |
affectOverride,
|
| 26 |
onAirTextConsumed,
|
| 27 |
+
onHeadSignalConsumed,
|
| 28 |
messages,
|
| 29 |
setMessages,
|
| 30 |
onLatency,
|
|
|
|
| 32 |
}: Props) {
|
| 33 |
const [input, setInput] = useState("");
|
| 34 |
const [loading, setLoading] = useState(false);
|
| 35 |
+
const [turnaroundLoading, setTurnaroundLoading] = useState(false);
|
| 36 |
const bottomRef = useRef<HTMLDivElement>(null);
|
| 37 |
+
const lastResponseTsRef = useRef<number>(0);
|
| 38 |
+
const lastTurnIdRef = useRef<number | null>(null);
|
| 39 |
+
// turn_id of the most recent turn that was already turned around - guards
|
| 40 |
+
// against the new turnaround bubble's own head-signal re-firing turnaround
|
| 41 |
+
// on itself.
|
| 42 |
+
const turnaroundConsumedTurnRef = useRef<number | null>(null);
|
| 43 |
|
| 44 |
useEffect(() => {
|
| 45 |
bottomRef.current?.scrollIntoView({ behavior: "smooth" });
|
| 46 |
}, [messages]);
|
| 47 |
|
| 48 |
+
// Reset per-turn state when the persona changes (parent clears `messages`
|
| 49 |
+
// and resets the backend session - the frontend turn counter must follow).
|
| 50 |
+
useEffect(() => {
|
| 51 |
+
lastTurnIdRef.current = null;
|
| 52 |
+
turnaroundConsumedTurnRef.current = null;
|
| 53 |
+
lastResponseTsRef.current = 0;
|
| 54 |
+
}, [userId]);
|
| 55 |
+
|
| 56 |
+
const handleTurnaround = useCallback(
|
| 57 |
+
async (reason: "head" | "manual") => {
|
| 58 |
+
if (!userId || !backendReady || turnaroundLoading || loading) return;
|
| 59 |
+
const targetTurnId = lastTurnIdRef.current;
|
| 60 |
+
if (targetTurnId === null) return;
|
| 61 |
+
if (turnaroundConsumedTurnRef.current === targetTurnId) return;
|
| 62 |
+
|
| 63 |
+
turnaroundConsumedTurnRef.current = targetTurnId;
|
| 64 |
+
setTurnaroundLoading(true);
|
| 65 |
+
try {
|
| 66 |
+
const res = await sendTurnaround({
|
| 67 |
+
user_id: userId,
|
| 68 |
+
turn_id: targetTurnId,
|
| 69 |
+
head_signal: reason === "head" ? sensing.headSignal : null,
|
| 70 |
+
});
|
| 71 |
+
|
| 72 |
+
lastTurnIdRef.current = res.turn_id;
|
| 73 |
+
turnaroundConsumedTurnRef.current = res.turn_id;
|
| 74 |
+
|
| 75 |
+
setMessages((prev) => {
|
| 76 |
+
const next = [...prev];
|
| 77 |
+
for (let i = next.length - 1; i >= 0; i--) {
|
| 78 |
+
if (next[i].role === "aac_user" && !next[i].isTurnaround) {
|
| 79 |
+
next[i] = { ...next[i], rephrased: true };
|
| 80 |
+
break;
|
| 81 |
+
}
|
| 82 |
+
}
|
| 83 |
+
next.push({
|
| 84 |
+
role: "aac_user",
|
| 85 |
+
content: res.response,
|
| 86 |
+
latency: res.latency,
|
| 87 |
+
affect: res.affect,
|
| 88 |
+
evalScores: res.eval_scores,
|
| 89 |
+
turnId: res.turn_id,
|
| 90 |
+
isTurnaround: true,
|
| 91 |
+
});
|
| 92 |
+
return next;
|
| 93 |
+
});
|
| 94 |
+
onLatency(res.latency);
|
| 95 |
+
// Do NOT advance lastResponseTsRef - keep the original turn's window so
|
| 96 |
+
// the user can't head-shake the turnaround itself into another loop.
|
| 97 |
+
} catch (e) {
|
| 98 |
+
setMessages((prev) => [
|
| 99 |
+
...prev,
|
| 100 |
+
{
|
| 101 |
+
role: "aac_user",
|
| 102 |
+
content: `Error rephrasing: ${
|
| 103 |
+
e instanceof Error ? e.message : "request failed"
|
| 104 |
+
}`,
|
| 105 |
+
isTurnaround: true,
|
| 106 |
+
},
|
| 107 |
+
]);
|
| 108 |
+
} finally {
|
| 109 |
+
if (reason === "head") onHeadSignalConsumed();
|
| 110 |
+
setTurnaroundLoading(false);
|
| 111 |
+
}
|
| 112 |
+
},
|
| 113 |
+
[
|
| 114 |
+
userId,
|
| 115 |
+
backendReady,
|
| 116 |
+
turnaroundLoading,
|
| 117 |
+
loading,
|
| 118 |
+
sensing.headSignal,
|
| 119 |
+
setMessages,
|
| 120 |
+
onLatency,
|
| 121 |
+
onHeadSignalConsumed,
|
| 122 |
+
]
|
| 123 |
+
);
|
| 124 |
+
|
| 125 |
+
useEffect(() => {
|
| 126 |
+
if (
|
| 127 |
+
sensing.headSignal !== "HEAD_NOD_DISSATISFIED" &&
|
| 128 |
+
sensing.headSignal !== "HEAD_SHAKE"
|
| 129 |
+
) {
|
| 130 |
+
return;
|
| 131 |
+
}
|
| 132 |
+
const targetTurnId = lastTurnIdRef.current;
|
| 133 |
+
const eligible =
|
| 134 |
+
targetTurnId !== null &&
|
| 135 |
+
turnaroundConsumedTurnRef.current !== targetTurnId &&
|
| 136 |
+
lastResponseTsRef.current > 0 &&
|
| 137 |
+
performance.now() - lastResponseTsRef.current <= TURNAROUND_WINDOW_MS;
|
| 138 |
+
|
| 139 |
+
if (eligible) {
|
| 140 |
+
handleTurnaround("head");
|
| 141 |
+
return;
|
| 142 |
+
}
|
| 143 |
+
// Not eligible - keep the chip visible briefly so the user can see that
|
| 144 |
+
// detection fired, then clear it. (Instant clear made detection invisible.)
|
| 145 |
+
const id = window.setTimeout(() => onHeadSignalConsumed(), 1500);
|
| 146 |
+
return () => window.clearTimeout(id);
|
| 147 |
+
}, [sensing.headSignal, handleTurnaround, onHeadSignalConsumed]);
|
| 148 |
+
|
| 149 |
async function handleSend() {
|
| 150 |
if (!input.trim() || !userId || !backendReady || loading) return;
|
| 151 |
|
|
|
|
| 163 |
gesture_tag: sensing.gestureTag,
|
| 164 |
gaze_bucket: sensing.gazeBucket,
|
| 165 |
air_written_text: airText,
|
| 166 |
+
head_signal: sensing.headSignal,
|
| 167 |
});
|
| 168 |
|
| 169 |
+
lastTurnIdRef.current = res.turn_id;
|
| 170 |
setMessages((prev) => [
|
| 171 |
...prev,
|
| 172 |
{
|
|
|
|
| 175 |
latency: res.latency,
|
| 176 |
affect: res.affect,
|
| 177 |
evalScores: res.eval_scores,
|
| 178 |
+
turnId: res.turn_id,
|
| 179 |
},
|
| 180 |
]);
|
| 181 |
onLatency(res.latency);
|
| 182 |
+
lastResponseTsRef.current = performance.now();
|
| 183 |
} catch (e) {
|
| 184 |
setMessages((prev) => [
|
| 185 |
...prev,
|
|
|
|
| 194 |
}
|
| 195 |
}
|
| 196 |
|
| 197 |
+
const canTurnaround =
|
| 198 |
+
!!userId &&
|
| 199 |
+
backendReady &&
|
| 200 |
+
!loading &&
|
| 201 |
+
!turnaroundLoading &&
|
| 202 |
+
lastTurnIdRef.current !== null;
|
| 203 |
+
|
| 204 |
return (
|
| 205 |
<div className="chat-panel">
|
| 206 |
<div className="chat-header">
|
|
|
|
| 208 |
</div>
|
| 209 |
<div className="chat-messages">
|
| 210 |
{messages.map((msg, i) => (
|
| 211 |
+
<div
|
| 212 |
+
key={i}
|
| 213 |
+
className={`chat-bubble ${msg.role}${
|
| 214 |
+
msg.rephrased ? " rephrased" : ""
|
| 215 |
+
}${msg.isTurnaround ? " turnaround" : ""}`}
|
| 216 |
+
>
|
| 217 |
<span className="chat-role">
|
| 218 |
{msg.role === "partner" ? "Partner" : "AAC User"}
|
| 219 |
+
{msg.rephrased && (
|
| 220 |
+
<span className="badge badge-rephrased"> rephrased</span>
|
| 221 |
+
)}
|
| 222 |
+
{msg.isTurnaround && (
|
| 223 |
+
<span className="badge badge-turnaround"> ↻ turnaround</span>
|
| 224 |
+
)}
|
| 225 |
</span>
|
| 226 |
<p>{msg.content}</p>
|
| 227 |
{msg.role === "aac_user" && msg.evalScores && (
|
|
|
|
| 235 |
<p>Generating...</p>
|
| 236 |
</div>
|
| 237 |
)}
|
| 238 |
+
{turnaroundLoading && (
|
| 239 |
+
<div className="chat-bubble aac_user loading">
|
| 240 |
+
<span className="chat-role">AAC User</span>
|
| 241 |
+
<p>↻ Rephrasing...</p>
|
| 242 |
+
</div>
|
| 243 |
+
)}
|
| 244 |
<div ref={bottomRef} />
|
| 245 |
</div>
|
| 246 |
<div className="chat-input-row">
|
|
|
|
| 255 |
<button onClick={handleSend} disabled={!userId || loading || !backendReady || !input.trim()}>
|
| 256 |
Send
|
| 257 |
</button>
|
| 258 |
+
<button
|
| 259 |
+
type="button"
|
| 260 |
+
className="turnaround-btn"
|
| 261 |
+
onClick={() => handleTurnaround("manual")}
|
| 262 |
+
disabled={!canTurnaround}
|
| 263 |
+
title="Re-plan the last response (also triggered by a head shake / sharp nod)"
|
| 264 |
+
>
|
| 265 |
+
↻ Not quite right
|
| 266 |
+
</button>
|
| 267 |
</div>
|
| 268 |
</div>
|
| 269 |
);
|
|
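Reviewer note on the ChatPanel hunk above: the head-signal effect only fires a turnaround when four conditions hold at once (a known last turn, that turn not already consumed, a recorded response timestamp, and the signal arriving inside `TURNAROUND_WINDOW_MS`). The sketch below restates that gate in Python so the edge cases are easy to test in isolation. It is a model of the TypeScript logic, not code from the repo; `TurnState` and `head_turnaround_eligible` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

TURNAROUND_WINDOW_MS = 5000  # same constant as in ChatPanel.tsx


@dataclass
class TurnState:
    """Mirror of the three refs the component keeps between renders."""
    last_turn_id: Optional[int] = None
    consumed_turn_id: Optional[int] = None
    last_response_ts_ms: float = 0.0


def head_turnaround_eligible(state: TurnState, head_signal: Optional[str], now_ms: float) -> bool:
    """Return True when a head signal should trigger a turnaround request."""
    if head_signal not in ("HEAD_SHAKE", "HEAD_NOD_DISSATISFIED"):
        return False
    return (
        state.last_turn_id is not None
        and state.consumed_turn_id != state.last_turn_id
        and state.last_response_ts_ms > 0
        and now_ms - state.last_response_ts_ms <= TURNAROUND_WINDOW_MS
    )
```

Because the component marks the new turn as consumed and leaves the response timestamp untouched after a turnaround, a second head shake aimed at the rephrased bubble fails the `consumed_turn_id` check. The manual button is gated separately by `canTurnaround`, which does not apply the time window.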
@@ -38,6 +38,26 @@ export function SensingStatus({ sensing, webcamActive }: Props) {
|
|
| 38 |
{sensing.gazeBucket ?? "none"}
|
| 39 |
</span>
|
| 40 |
</div>
|
| 41 |
{sensing.airWrittenText && (
|
| 42 |
<div className="sensing-row">
|
| 43 |
<span className="sensing-label">Air-written</span>
|
|
|
|
| 38 |
{sensing.gazeBucket ?? "none"}
|
| 39 |
</span>
|
| 40 |
</div>
|
| 41 |
+
<div className="sensing-row">
|
| 42 |
+
<span className="sensing-label">Head</span>
|
| 43 |
+
<span className="sensing-value">
|
| 44 |
+
{sensing.headCalibrated
|
| 45 |
+
? sensing.headSignal ?? "steady"
|
| 46 |
+
: "not calibrated"}
|
| 47 |
+
</span>
|
| 48 |
+
</div>
|
| 49 |
+
{sensing.headCalibrated && (
|
| 50 |
+
<div className="sensing-row sensing-debug">
|
| 51 |
+
<span className="sensing-label"> ↳ Δx/Δy</span>
|
| 52 |
+
<span className="sensing-value">
|
| 53 |
+
{sensing.headDebug.dx.toFixed(3)} / {sensing.headDebug.dy.toFixed(3)}
|
| 54 |
+
{" "}
|
| 55 |
+
(peak {sensing.headDebug.maxAbsDx.toFixed(3)}/
|
| 56 |
+
{sensing.headDebug.maxAbsDy.toFixed(3)},{" "}
|
| 57 |
+
x{sensing.headDebug.crossings})
|
| 58 |
+
</span>
|
| 59 |
+
</div>
|
| 60 |
+
)}
|
| 61 |
{sensing.airWrittenText && (
|
| 62 |
<div className="sensing-row">
|
| 63 |
<span className="sensing-label">Air-written</span>
|
|
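Reviewer note on the SensingStatus hunk above: the debug row exposes `dx`, `dy`, peak magnitudes and a `crossings` count, which suggests the tracker detects a shake by counting how often the nose tip swings from one side of the calibrated baseline to the other. `frontend/src/lib/sensing.ts` is not shown in this part of the diff, so the sketch below is only one plausible reading of that idea. The class name, the `threshold` and `min_crossings` values, and the absence of a time window are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HeadShakeSketch:
    threshold: float = 0.03      # hypothetical dead zone around the baseline
    min_crossings: int = 2       # hypothetical number of side switches for a shake
    baseline_x: Optional[float] = None
    crossings: int = 0
    _last_side: int = 0          # -1 = left of baseline, +1 = right, 0 = none yet

    def calibrate(self, nose_x: float) -> None:
        """Store the resting nose position and reset the counters."""
        self.baseline_x = nose_x
        self.crossings = 0
        self._last_side = 0

    def process(self, nose_x: float) -> Optional[str]:
        """Feed one frame; return "HEAD_SHAKE" once enough side switches are seen."""
        if self.baseline_x is None:
            return None  # not calibrated yet
        dx = nose_x - self.baseline_x
        if abs(dx) < self.threshold:
            return None  # inside the dead zone, ignore jitter
        side = 1 if dx > 0 else -1
        if self._last_side != 0 and side != self._last_side:
            self.crossings += 1
        self._last_side = side
        if self.crossings >= self.min_crossings:
            self.crossings = 0
            self._last_side = 0
            return "HEAD_SHAKE"
        return None
```

A real detector would presumably also expire old crossings after a short interval and run the same logic on `dy` for the sharp-nod case; both are left out here to keep the sketch small.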
@@ -11,6 +11,7 @@ import {
|
|
| 11 |
classifyGesture,
|
| 12 |
GazeTracker,
|
| 13 |
AirWriter,
|
|
|
|
| 14 |
} from "../lib/sensing";
|
| 15 |
|
| 16 |
const EMA_ALPHA = 0.3;
|
|
@@ -20,6 +21,9 @@ export function useSensing() {
|
|
| 20 |
const handLandmarkerRef = useRef<HandLandmarker | null>(null);
|
| 21 |
const gazeTrackerRef = useRef(new GazeTracker());
|
| 22 |
const airWriterRef = useRef(new AirWriter());
|
| 23 |
const neutralLCPRef = useRef<number | null>(null);
|
| 24 |
const smoothedRef = useRef({ MAR: 0, EAR: 0.3, BRI: -0.3, LCP: 0 });
|
| 25 |
const initingRef = useRef(false);
|
|
@@ -30,6 +34,9 @@ export function useSensing() {
|
|
| 30 |
gestureTag: null,
|
| 31 |
gazeBucket: null,
|
| 32 |
airWrittenText: "",
|
| 33 |
});
|
| 34 |
|
| 35 |
// Cleanup MediaPipe resources on unmount
|
|
@@ -95,6 +102,7 @@ export function useSensing() {
|
|
| 95 |
|
| 96 |
let affect: SensingState["affect"] = null;
|
| 97 |
let gazeBucket: SensingState["gazeBucket"] = null;
|
|
|
|
| 98 |
|
| 99 |
const faceResult = faceLandmarker.detectForVideo(video, timestamp);
|
| 100 |
if (faceResult.faceLandmarks && faceResult.faceLandmarks.length > 0) {
|
|
@@ -105,6 +113,11 @@ export function useSensing() {
|
|
| 105 |
(landmarks[61].x + landmarks[291].x) / 2;
|
| 106 |
}
|
| 107 |
|
| 108 |
const raw = computeAffectVector(landmarks, neutralLCPRef.current);
|
| 109 |
|
| 110 |
const prev = smoothedRef.current;
|
|
@@ -118,6 +131,8 @@ export function useSensing() {
|
|
| 118 |
|
| 119 |
affect = classifyAffect(smoothed);
|
| 120 |
gazeBucket = gazeTrackerRef.current.process(landmarks);
|
| 121 |
}
|
| 122 |
|
| 123 |
let gestureTag: SensingState["gestureTag"] = null;
|
|
@@ -144,6 +159,9 @@ export function useSensing() {
|
|
| 144 |
airWrittenText: newAirText
|
| 145 |
? prev.airWrittenText + newAirText
|
| 146 |
: prev.airWrittenText,
|
| 147 |
}));
|
| 148 |
},
|
| 149 |
[]
|
|
@@ -153,12 +171,40 @@ export function useSensing() {
|
|
| 153 |
setSensing((prev) => ({ ...prev, airWrittenText: "" }));
|
| 154 |
}, []);
|
| 155 |
|
| 156 |
const resetCalibration = useCallback(() => {
|
| 157 |
neutralLCPRef.current = null;
|
| 158 |
smoothedRef.current = { MAR: 0, EAR: 0.3, BRI: -0.3, LCP: 0 };
|
| 159 |
gazeTrackerRef.current.reset();
|
| 160 |
-
|
| 161 |
}, []);
|
| 162 |
|
| 163 |
-
return {
|
| 164 |
}
|
|
|
|
| 11 |
classifyGesture,
|
| 12 |
GazeTracker,
|
| 13 |
AirWriter,
|
| 14 |
+
HeadPoseTracker,
|
| 15 |
} from "../lib/sensing";
|
| 16 |
|
| 17 |
const EMA_ALPHA = 0.3;
|
|
|
|
| 21 |
const handLandmarkerRef = useRef<HandLandmarker | null>(null);
|
| 22 |
const gazeTrackerRef = useRef(new GazeTracker());
|
| 23 |
const airWriterRef = useRef(new AirWriter());
|
| 24 |
+
const headTrackerRef = useRef(new HeadPoseTracker());
|
| 25 |
+
const calibratePendingRef = useRef(false);
|
| 26 |
+
const headDebugRef = useRef({ dx: 0, dy: 0, maxAbsDx: 0, maxAbsDy: 0, crossings: 0 });
|
| 27 |
const neutralLCPRef = useRef<number | null>(null);
|
| 28 |
const smoothedRef = useRef({ MAR: 0, EAR: 0.3, BRI: -0.3, LCP: 0 });
|
| 29 |
const initingRef = useRef(false);
|
|
|
|
| 34 |
gestureTag: null,
|
| 35 |
gazeBucket: null,
|
| 36 |
airWrittenText: "",
|
| 37 |
+
headSignal: null,
|
| 38 |
+
headCalibrated: false,
|
| 39 |
+
headDebug: { dx: 0, dy: 0, maxAbsDx: 0, maxAbsDy: 0, crossings: 0 },
|
| 40 |
});
|
| 41 |
|
| 42 |
// Cleanup MediaPipe resources on unmount
|
|
|
|
| 102 |
|
| 103 |
let affect: SensingState["affect"] = null;
|
| 104 |
let gazeBucket: SensingState["gazeBucket"] = null;
|
| 105 |
+
let headSignal: SensingState["headSignal"] = null;
|
| 106 |
|
| 107 |
const faceResult = faceLandmarker.detectForVideo(video, timestamp);
|
| 108 |
if (faceResult.faceLandmarks && faceResult.faceLandmarks.length > 0) {
|
|
|
|
| 113 |
(landmarks[61].x + landmarks[291].x) / 2;
|
| 114 |
}
|
| 115 |
|
| 116 |
+
if (calibratePendingRef.current) {
|
| 117 |
+
headTrackerRef.current.calibrate(landmarks);
|
| 118 |
+
calibratePendingRef.current = false;
|
| 119 |
+
}
|
| 120 |
+
|
| 121 |
const raw = computeAffectVector(landmarks, neutralLCPRef.current);
|
| 122 |
|
| 123 |
const prev = smoothedRef.current;
|
|
|
|
| 131 |
|
| 132 |
affect = classifyAffect(smoothed);
|
| 133 |
gazeBucket = gazeTrackerRef.current.process(landmarks);
|
| 134 |
+
headSignal = headTrackerRef.current.process(landmarks);
|
| 135 |
+
headDebugRef.current = headTrackerRef.current.debug;
|
| 136 |
}
|
| 137 |
|
| 138 |
let gestureTag: SensingState["gestureTag"] = null;
|
|
|
|
| 159 |
airWrittenText: newAirText
|
| 160 |
? prev.airWrittenText + newAirText
|
| 161 |
: prev.airWrittenText,
|
| 162 |
+
headSignal: headSignal ?? prev.headSignal,
|
| 163 |
+
headCalibrated: headTrackerRef.current.calibrated,
|
| 164 |
+
headDebug: headDebugRef.current,
|
| 165 |
}));
|
| 166 |
},
|
| 167 |
[]
|
|
|
|
| 171 |
setSensing((prev) => ({ ...prev, airWrittenText: "" }));
|
| 172 |
}, []);
|
| 173 |
|
| 174 |
+
const clearHeadSignal = useCallback(() => {
|
| 175 |
+
setSensing((prev) => ({ ...prev, headSignal: null }));
|
| 176 |
+
}, []);
|
| 177 |
+
|
| 178 |
+
const calibrateHeadPose = useCallback(() => {
|
| 179 |
+
calibratePendingRef.current = true;
|
| 180 |
+
setSensing((prev) => ({ ...prev, headSignal: null }));
|
| 181 |
+
}, []);
|
| 182 |
+
|
| 183 |
const resetCalibration = useCallback(() => {
|
| 184 |
neutralLCPRef.current = null;
|
| 185 |
smoothedRef.current = { MAR: 0, EAR: 0.3, BRI: -0.3, LCP: 0 };
|
| 186 |
gazeTrackerRef.current.reset();
|
| 187 |
+
headTrackerRef.current.reset();
|
| 188 |
+
setSensing({
|
| 189 |
+
affect: null,
|
| 190 |
+
gestureTag: null,
|
| 191 |
+
gazeBucket: null,
|
| 192 |
+
airWrittenText: "",
|
| 193 |
+
headSignal: null,
|
| 194 |
+
headCalibrated: false,
|
| 195 |
+
headDebug: { dx: 0, dy: 0, maxAbsDx: 0, maxAbsDy: 0, crossings: 0 },
|
| 196 |
+
});
|
| 197 |
}, []);
|
| 198 |
|
| 199 |
+
return {
|
| 200 |
+
sensing,
|
| 201 |
+
ready,
|
| 202 |
+
initError,
|
| 203 |
+
init,
|
| 204 |
+
processFrame,
|
| 205 |
+
clearAirWrittenText,
|
| 206 |
+
clearHeadSignal,
|
| 207 |
+
calibrateHeadPose,
|
| 208 |
+
resetCalibration,
|
| 209 |
+
};
|
| 210 |
}
|
|
@@ -1,4 +1,9 @@
|
|
| 1 |
-
import type {
|
| 2 |
|
| 3 |
const API_BASE = "";
|
| 4 |
|
|
@@ -19,6 +24,18 @@ export async function sendChat(req: ChatRequest): Promise<ChatResponse> {
|
|
| 19 |
return res.json();
|
| 20 |
}
|
| 21 |
|
| 22 |
export async function resetSession(userId: string): Promise<void> {
|
| 23 |
const res = await fetch(
|
| 24 |
`${API_BASE}/session/reset?user_id=${encodeURIComponent(userId)}`,
|
|
|
|
| 1 |
+
import type {
|
| 2 |
+
ChatRequest,
|
| 3 |
+
ChatResponse,
|
| 4 |
+
Persona,
|
| 5 |
+
TurnaroundRequest,
|
| 6 |
+
} from "../types";
|
| 7 |
|
| 8 |
const API_BASE = "";
|
| 9 |
|
|
|
|
| 24 |
return res.json();
|
| 25 |
}
|
| 26 |
|
| 27 |
+
export async function sendTurnaround(
|
| 28 |
+
req: TurnaroundRequest
|
| 29 |
+
): Promise<ChatResponse> {
|
| 30 |
+
const res = await fetch(`${API_BASE}/chat/turnaround`, {
|
| 31 |
+
method: "POST",
|
| 32 |
+
headers: { "Content-Type": "application/json" },
|
| 33 |
+
body: JSON.stringify(req),
|
| 34 |
+
});
|
| 35 |
+
if (!res.ok) throw new Error(`API error: ${res.status}`);
|
| 36 |
+
return res.json();
|
| 37 |
+
}
|
| 38 |
+
|
| 39 |
export async function resetSession(userId: string): Promise<void> {
|
| 40 |
const res = await fetch(
|
| 41 |
`${API_BASE}/session/reset?user_id=${encodeURIComponent(userId)}`,
|
|
@@ -189,6 +189,227 @@ export class GazeTracker {
|
|
| 189 |
}
|
| 190 |
}
|
| 191 |
|
| 192 |
// ── Air-writing DTW (ported from backend/sensing/air_writing.py) ─────────────
|
| 193 |
|
| 194 |
const INDEX_TIP = 8;
|
|
|
|
| 189 |
}
|
| 190 |
}
|
| 191 |
|
| 192 |
+
// ── Head-pose tracker (shake / sharp-nod-dissatisfied) ──────────────────────
|
| 193 |
+
|
| 194 |
+
export type HeadSignal = "HEAD_SHAKE" | "HEAD_NOD_DISSATISFIED";
|
| 195 |
+
|
| 196 |
+
const NOSE_TIP = 1;
|
| 197 |
+
|
| 198 |
+
interface NosePoint {
|
| 199 |
+
x: number;
|
| 200 |
+
y: number;
|
| 201 |
+
t: number;
|
| 202 |
+
}
|
| 203 |
+
|
| 204 |
+
export interface HeadDebug {
|
| 205 |
+
dx: number; // current x displacement from neutral
|
| 206 |
+
dy: number; // current y displacement from neutral
|
| 207 |
+
maxAbsDx: number; // peak |dx| within the window
|
| 208 |
+
maxAbsDy: number; // peak |dy| within the window
|
| 209 |
+
crossings: number; // side crossings within the window (deadband-filtered)
|
| 210 |
+
}
|
| 211 |
+
|
| 212 |
+
export class HeadPoseTracker {
|
| 213 |
+
private neutralX: number | null = null;
|
| 214 |
+
private neutralY: number | null = null;
|
| 215 |
+
private history: NosePoint[] = [];
|
| 216 |
+
private lastEmitTs = 0;
|
| 217 |
+
private lastDebug: HeadDebug = {
|
| 218 |
+
dx: 0,
|
| 219 |
+
dy: 0,
|
| 220 |
+
maxAbsDx: 0,
|
| 221 |
+
maxAbsDy: 0,
|
| 222 |
+
crossings: 0,
|
| 223 |
+
};
|
| 224 |
+
|
| 225 |
+
private static WINDOW_MS = 1000;
|
| 226 |
+
private static REFRACTORY_MS = 2000;
|
| 227 |
+
private static SHAKE_AMPLITUDE = 0.015;
|
| 228 |
+
private static SHAKE_MIN_CROSSINGS = 3;
|
| 229 |
+
// Per-frame jitter below this magnitude is ignored when counting side
|
| 230 |
+
// crossings, so micro-fidgets near neutral can't rack up false crossings.
|
| 231 |
+
private static SHAKE_DEADBAND = 0.005;
|
| 232 |
+
private static NOD_DROP = 0.06;
|
| 233 |
+
private static NOD_WINDOW_MS = 600;
|
| 234 |
+
// Reject "nod" when horizontal motion exceeds this; it's a shake/sway.
|
| 235 |
+
private static NOD_MAX_HORIZONTAL = 0.015;
|
| 236 |
+
// Recovery: head must come back to within this of neutral.
|
| 237 |
+
private static NOD_RECOVERY = 0.015;
|
| 238 |
+
// The drop must start from near-neutral (not from a tilted resting pose).
|
| 239 |
+
private static NOD_START_THRESHOLD = 0.015;
|
| 240 |
+
// Minimum frames between drop start and peak; guards against single-frame
|
| 241 |
+
// landmark glitches that look like an instantaneous jerk.
|
| 242 |
+
private static NOD_MIN_DROP_FRAMES = 3;
|
| 243 |
+
// Minimum frames between peak and recovery; same reason, going up.
|
| 244 |
+
private static NOD_MIN_RECOVERY_FRAMES = 2;
|
| 245 |
+
|
| 246 |
+
calibrate(landmarks: { x: number; y: number }[]): void {
|
| 247 |
+
if (!landmarks[NOSE_TIP]) return;
|
| 248 |
+
this.neutralX = landmarks[NOSE_TIP].x;
|
| 249 |
+
this.neutralY = landmarks[NOSE_TIP].y;
|
| 250 |
+
this.history = [];
|
| 251 |
+
this.lastEmitTs = 0;
|
| 252 |
+
}
|
| 253 |
+
|
| 254 |
+
process(landmarks: { x: number; y: number }[]): HeadSignal | null {
|
| 255 |
+
if (!landmarks[NOSE_TIP]) return null;
|
| 256 |
+
if (this.neutralX === null || this.neutralY === null) return null;
|
| 257 |
+
|
| 258 |
+
const now = performance.now();
|
| 259 |
+
const nose = landmarks[NOSE_TIP];
|
| 260 |
+
this.history.push({ x: nose.x, y: nose.y, t: now });
|
| 261 |
+
const cutoff = now - HeadPoseTracker.WINDOW_MS;
|
| 262 |
+
this.history = this.history.filter((p) => p.t >= cutoff);
|
| 263 |
+
|
| 264 |
+
this.updateDebug(nose);
|
| 265 |
+
|
| 266 |
+
if (now - this.lastEmitTs < HeadPoseTracker.REFRACTORY_MS) return null;
|
| 267 |
+
if (this.history.length < 6) return null;
|
| 268 |
+
|
| 269 |
+
const shake = this.detectShake();
|
| 270 |
+
if (shake) {
|
| 271 |
+
this.lastEmitTs = now;
|
| 272 |
+
return shake;
|
| 273 |
+
}
|
| 274 |
+
|
| 275 |
+
const nod = this.detectNod(now);
|
| 276 |
+
if (nod) {
|
| 277 |
+
this.lastEmitTs = now;
|
| 278 |
+
return nod;
|
| 279 |
+
}
|
| 280 |
+
|
| 281 |
+
return null;
|
| 282 |
+
}
|
| 283 |
+
|
| 284 |
+
private updateDebug(nose: { x: number; y: number }): void {
|
| 285 |
+
if (this.neutralX === null || this.neutralY === null) return;
|
| 286 |
+
let maxAbsDx = 0;
|
| 287 |
+
let maxAbsDy = 0;
|
| 288 |
+
let crossings = 0;
|
| 289 |
+
let prevSide = 0;
|
| 290 |
+
for (const p of this.history) {
|
| 291 |
+
const dx = p.x - this.neutralX;
|
| 292 |
+
const dy = p.y - this.neutralY;
|
| 293 |
+
const absDx = Math.abs(dx);
|
| 294 |
+
maxAbsDx = Math.max(maxAbsDx, absDx);
|
| 295 |
+
maxAbsDy = Math.max(maxAbsDy, Math.abs(dy));
|
| 296 |
+
if (absDx < HeadPoseTracker.SHAKE_DEADBAND) continue;
|
| 297 |
+
const side = dx > 0 ? 1 : -1;
|
| 298 |
+
if (prevSide !== 0 && side !== prevSide) crossings += 1;
|
| 299 |
+
prevSide = side;
|
| 300 |
+
}
|
| 301 |
+
this.lastDebug = {
|
| 302 |
+
dx: nose.x - this.neutralX,
|
| 303 |
+
dy: nose.y - this.neutralY,
|
| 304 |
+
maxAbsDx,
|
| 305 |
+
maxAbsDy,
|
| 306 |
+
crossings,
|
| 307 |
+
};
|
| 308 |
+
}
|
| 309 |
+
|
| 310 |
+
get debug(): HeadDebug {
|
| 311 |
+
return this.lastDebug;
|
| 312 |
+
}
|
| 313 |
+
|
| 314 |
+
private detectShake(): HeadSignal | null {
|
| 315 |
+
if (this.neutralX === null) return null;
|
| 316 |
+
let crossings = 0;
|
| 317 |
+
let prevSide = 0;
|
| 318 |
+
let maxAbs = 0;
|
| 319 |
+
for (const p of this.history) {
|
| 320 |
+
const dx = p.x - this.neutralX;
|
| 321 |
+
const absDx = Math.abs(dx);
|
| 322 |
+
maxAbs = Math.max(maxAbs, absDx);
|
| 323 |
+
// Only commit to a side once the displacement clears the deadband;
|
| 324 |
+
// otherwise small landmark jitter near neutral fakes crossings.
|
| 325 |
+
if (absDx < HeadPoseTracker.SHAKE_DEADBAND) continue;
|
| 326 |
+
const side = dx > 0 ? 1 : -1;
|
| 327 |
+
if (prevSide !== 0 && side !== prevSide) crossings += 1;
|
| 328 |
+
prevSide = side;
|
| 329 |
+
}
|
| 330 |
+
if (
|
| 331 |
+
crossings >= HeadPoseTracker.SHAKE_MIN_CROSSINGS &&
|
| 332 |
+
maxAbs >= HeadPoseTracker.SHAKE_AMPLITUDE
|
| 333 |
+
) {
|
| 334 |
+
return "HEAD_SHAKE";
|
| 335 |
+
}
|
| 336 |
+
return null;
|
| 337 |
+
}
|
| 338 |
+
|
| 339 |
+
private detectNod(now: number): HeadSignal | null {
|
| 340 |
+
if (this.neutralX === null || this.neutralY === null) return null;
|
| 341 |
+
const windowStart = now - HeadPoseTracker.NOD_WINDOW_MS;
|
| 342 |
+
const recent = this.history.filter((p) => p.t >= windowStart);
|
| 343 |
+
if (recent.length < 6) return null;
|
| 344 |
+
|
| 345 |
+
// Reject if there's significant horizontal motion; that's a shake/sway.
|
| 346 |
+
let maxAbsDx = 0;
|
| 347 |
+
for (const p of recent) {
|
| 348 |
+
maxAbsDx = Math.max(maxAbsDx, Math.abs(p.x - this.neutralX));
|
| 349 |
+
}
|
| 350 |
+
if (maxAbsDx > HeadPoseTracker.NOD_MAX_HORIZONTAL) return null;
|
| 351 |
+
|
| 352 |
+
// Find the peak (lowest head position) within the window.
|
| 353 |
+
let maxDrop = 0;
|
| 354 |
+
let peakIdx = -1;
|
| 355 |
+
for (let i = 0; i < recent.length; i++) {
|
| 356 |
+
const drop = recent[i].y - this.neutralY;
|
| 357 |
+
if (drop > maxDrop) {
|
| 358 |
+
maxDrop = drop;
|
| 359 |
+
peakIdx = i;
|
| 360 |
+
}
|
| 361 |
+
}
|
| 362 |
+
if (maxDrop < HeadPoseTracker.NOD_DROP || peakIdx < 0) return null;
|
| 363 |
+
|
| 364 |
+
// Find a near-neutral start before the peak: a nod is a deliberate
|
| 365 |
+
// motion *from* neutral, not a recovery from an already-tilted pose.
|
| 366 |
+
let startIdx = -1;
|
| 367 |
+
for (let i = peakIdx - 1; i >= 0; i--) {
|
| 368 |
+
if (
|
| 369 |
+
recent[i].y - this.neutralY <=
|
| 370 |
+
HeadPoseTracker.NOD_START_THRESHOLD
|
| 371 |
+
) {
|
| 372 |
+
startIdx = i;
|
| 373 |
+
break;
|
| 374 |
+
}
|
| 375 |
+
}
|
| 376 |
+
if (
|
| 377 |
+
startIdx < 0 ||
|
| 378 |
+
peakIdx - startIdx < HeadPoseTracker.NOD_MIN_DROP_FRAMES
|
| 379 |
+
) {
|
| 380 |
+
return null;
|
| 381 |
+
}
|
| 382 |
+
|
| 383 |
+
// Recovery: head must return near neutral after the peak.
|
| 384 |
+
let recoveryIdx = -1;
|
| 385 |
+
for (let i = peakIdx + 1; i < recent.length; i++) {
|
| 386 |
+
if (recent[i].y - this.neutralY < HeadPoseTracker.NOD_RECOVERY) {
|
| 387 |
+
recoveryIdx = i;
|
| 388 |
+
break;
|
| 389 |
+
}
|
| 390 |
+
}
|
| 391 |
+
if (
|
| 392 |
+
recoveryIdx < 0 ||
|
| 393 |
+
recoveryIdx - peakIdx < HeadPoseTracker.NOD_MIN_RECOVERY_FRAMES
|
| 394 |
+
) {
|
| 395 |
+
return null;
|
| 396 |
+
}
|
| 397 |
+
|
| 398 |
+
return "HEAD_NOD_DISSATISFIED";
|
| 399 |
+
}
|
| 400 |
+
|
| 401 |
+
reset(): void {
|
| 402 |
+
this.neutralX = null;
|
| 403 |
+
this.neutralY = null;
|
| 404 |
+
this.history = [];
|
| 405 |
+
this.lastEmitTs = 0;
|
| 406 |
+
}
|
| 407 |
+
|
| 408 |
+
get calibrated(): boolean {
|
| 409 |
+
return this.neutralX !== null && this.neutralY !== null;
|
| 410 |
+
}
|
| 411 |
+
}
|
| 412 |
+
|
| 413 |
// ── Air-writing DTW (ported from backend/sensing/air_writing.py) ─────────────
|
| 414 |
|
| 415 |
const INDEX_TIP = 8;
|
|
@@ -1,12 +1,24 @@
|
|
| 1 |
export type Affect = "HAPPY" | "FRUSTRATED" | "NEUTRAL" | "SURPRISED";
|
| 2 |
export type GestureName = "THUMBS_UP" | "THUMBS_DOWN" | "POINTING" | "WAVING";
|
| 3 |
export type MemoryBucket = "family" | "medical" | "hobbies" | "daily_routine" | "social";
|
| 4 |
|
| 5 |
export interface SensingState {
|
| 6 |
affect: Affect | null;
|
| 7 |
gestureTag: GestureName | null;
|
| 8 |
gazeBucket: MemoryBucket | null;
|
| 9 |
airWrittenText: string;
|
| 10 |
}
|
| 11 |
|
| 12 |
export interface Persona {
|
|
@@ -23,6 +35,13 @@ export interface ChatRequest {
|
|
| 23 |
gesture_tag: GestureName | null;
|
| 24 |
gaze_bucket: MemoryBucket | null;
|
| 25 |
air_written_text: string | null;
|
| 26 |
}
|
| 27 |
|
| 28 |
export interface LatencyLog {
|
|
@@ -57,6 +76,7 @@ export interface ChatResponse {
|
|
| 57 |
latency: LatencyLog;
|
| 58 |
guardrail_passed: boolean;
|
| 59 |
eval_scores: EvalScores | null;
|
|
|
|
| 60 |
}
|
| 61 |
|
| 62 |
export interface ChatMessage {
|
|
@@ -65,4 +85,7 @@ export interface ChatMessage {
|
|
| 65 |
latency?: LatencyLog;
|
| 66 |
affect?: string;
|
| 67 |
evalScores?: EvalScores | null;
|
| 68 |
}
|
|
|
|
| 1 |
export type Affect = "HAPPY" | "FRUSTRATED" | "NEUTRAL" | "SURPRISED";
|
| 2 |
export type GestureName = "THUMBS_UP" | "THUMBS_DOWN" | "POINTING" | "WAVING";
|
| 3 |
export type MemoryBucket = "family" | "medical" | "hobbies" | "daily_routine" | "social";
|
| 4 |
+
export type HeadSignal = "HEAD_SHAKE" | "HEAD_NOD_DISSATISFIED";
|
| 5 |
+
|
| 6 |
+
export interface HeadDebug {
|
| 7 |
+
dx: number;
|
| 8 |
+
dy: number;
|
| 9 |
+
maxAbsDx: number;
|
| 10 |
+
maxAbsDy: number;
|
| 11 |
+
crossings: number;
|
| 12 |
+
}
|
| 13 |
|
| 14 |
export interface SensingState {
|
| 15 |
affect: Affect | null;
|
| 16 |
gestureTag: GestureName | null;
|
| 17 |
gazeBucket: MemoryBucket | null;
|
| 18 |
airWrittenText: string;
|
| 19 |
+
headSignal: HeadSignal | null;
|
| 20 |
+
headCalibrated: boolean;
|
| 21 |
+
headDebug: HeadDebug;
|
| 22 |
}
|
| 23 |
|
| 24 |
export interface Persona {
|
|
|
|
| 35 |
gesture_tag: GestureName | null;
|
| 36 |
gaze_bucket: MemoryBucket | null;
|
| 37 |
air_written_text: string | null;
|
| 38 |
+
head_signal?: HeadSignal | null;
|
| 39 |
+
}
|
| 40 |
+
|
| 41 |
+
export interface TurnaroundRequest {
|
| 42 |
+
user_id: string;
|
| 43 |
+
turn_id?: number;
|
| 44 |
+
head_signal?: HeadSignal | null;
|
| 45 |
}
|
| 46 |
|
| 47 |
export interface LatencyLog {
|
|
|
|
| 76 |
latency: LatencyLog;
|
| 77 |
guardrail_passed: boolean;
|
| 78 |
eval_scores: EvalScores | null;
|
| 79 |
+
turn_id: number;
|
| 80 |
}
|
| 81 |
|
| 82 |
export interface ChatMessage {
|
|
|
|
| 85 |
latency?: LatencyLog;
|
| 86 |
affect?: string;
|
| 87 |
evalScores?: EvalScores | null;
|
| 88 |
+
turnId?: number;
|
| 89 |
+
rephrased?: boolean;
|
| 90 |
+
isTurnaround?: boolean;
|
| 91 |
}
|
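
The deadband-filtered crossing count used by `detectShake` in `frontend/src/lib/sensing.ts` can be tried in isolation. The following is a minimal sketch, not the project's `HeadPoseTracker`: the `Sample` type and the `countCrossings` / `isShake` helpers are hypothetical names introduced for illustration, and the inputs are assumed to be horizontal displacements already measured from the calibrated neutral position. The three thresholds reuse the values shown in the diff.

```typescript
// Hypothetical standalone sketch of deadband-filtered side-crossing counting.
// Thresholds copy SHAKE_DEADBAND, SHAKE_AMPLITUDE and SHAKE_MIN_CROSSINGS
// from the diff; everything else is an illustrative assumption.
interface Sample {
  dx: number; // horizontal displacement from neutral, in normalized landmark units
}

const DEADBAND = 0.005;
const MIN_AMPLITUDE = 0.015;
const MIN_CROSSINGS = 3;

function countCrossings(samples: Sample[], deadband: number = DEADBAND): number {
  let crossings = 0;
  let prevSide = 0; // 0 = no side committed yet, 1 = right, -1 = left
  for (const s of samples) {
    // Samples inside the deadband never commit to a side, so jitter around
    // neutral cannot flip the side and inflate the count.
    if (Math.abs(s.dx) < deadband) continue;
    const side = s.dx > 0 ? 1 : -1;
    if (prevSide !== 0 && side !== prevSide) crossings += 1;
    prevSide = side;
  }
  return crossings;
}

function isShake(samples: Sample[]): boolean {
  // Peak absolute displacement over the whole window, deadband included.
  const peak = samples.reduce((m, s) => Math.max(m, Math.abs(s.dx)), 0);
  return countCrossings(samples) >= MIN_CROSSINGS && peak >= MIN_AMPLITUDE;
}

// A deliberate left-right-left-right sweep with one near-neutral frame.
const shake: Sample[] = [0.02, -0.02, 0.001, 0.02, -0.02].map((dx) => ({ dx }));
// Small fidgets that stay inside the deadband.
const jitter: Sample[] = [0.002, -0.003, 0.004, -0.001].map((dx) => ({ dx }));

console.log(countCrossings(shake), isShake(shake));
console.log(countCrossings(jitter), isShake(jitter));
```

Skipping in-deadband samples rather than treating them as a third "neutral" side means a sweep that passes through neutral still counts as one crossing, which matches how the loop in the diff carries `prevSide` across skipped frames.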