shwetangisingh committed on
Commit
ce51e88
·
1 Parent(s): 84b82bd

Add PRESENT_STATE intent + head-shake turnaround

Right-now questions ("how are you feeling?") were fabricating answers
from autobiographical memory. PRESENT_STATE is a new intent class that
skips retrieval and uses an affect-grounded prompt branch, falling back
to "I'm not sure" when the read is ambiguous.

Head-shake or sharp head-nod (or the manual "Not quite right" button)
hits a /chat/turnaround endpoint that re-plans the last turn:
- PERSONAL turns re-retrieve excluding the rejected bucket and exact
chunk texts, with a score floor so the rephrase doesn't degrade into
off-topic memories
- PRESENT_STATE turns flip the emotional read or admit uncertainty

Head-pose detection is client-side (MediaPipe nose-tip), with a
calibrate button and live Δx/Δy debug readout. Both bubbles stay
visible after turnaround: rejected gets strikethrough + "rephrased"
badge, new one gets "↻ turnaround".
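
The frontend `HeadPoseTracker` is not part of this diff, so the following is only a rough sketch of what a deadband-filtered shake detector could look like. It is written in Python to match the backend code shown here, whereas the real tracker is frontend JavaScript. The function names, the `deadband` value, and the `min_reversals` threshold are all hypothetical.

```python
def count_direction_reversals(dx_samples, deadband=0.02):
    """Count left/right flips in nose-tip x-offsets (relative to the calibrated
    centre), ignoring any sample whose magnitude is inside the deadband."""
    reversals = 0
    last_sign = 0
    for dx in dx_samples:
        if abs(dx) < deadband:
            continue  # jitter around the calibrated centre: ignore
        sign = 1 if dx > 0 else -1
        if last_sign and sign != last_sign:
            reversals += 1
        last_sign = sign
    return reversals


def is_head_shake(dx_samples, deadband=0.02, min_reversals=2):
    """Treat two or more direction reversals outside the deadband as a shake."""
    return count_direction_reversals(dx_samples, deadband) >= min_reversals
```

Under these assumptions a left-right-left sweep such as `[0.05, -0.06, 0.05]` counts as a shake, while small jitter around the centre does not.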

README.md CHANGED
@@ -353,9 +353,12 @@ Heads up: all camera/sensing stuff is in the frontend (MediaPipe JS). Backend ju
353
 
354
  ### Sensing (frontend)
355
 
356
- - [ ] **[Core]** Head-nod / sharp tilt = "I don't like that". Different from frustrated affect.
357
- - [ ] send a `dissatisfaction_signal` flag with the chat request
358
- - [ ] when set, planner returns a "did you mean X or Y?" instead of an answer (the spec's "Turnaround Option")
 
 
 
359
  - [ ] **[Core]** Smile / positive affect should actually change the wording (more positive lexicon), not just be metadata. Right now it's annotated in the prompt but we never checked if the LLM is doing anything with it; probably need a stronger constraint or example in the prompt
360
  - [ ] **[Core]** Air-writing is treated as raw text appended to the query. Spec wants it as a stylistic constraint too: should it bias tone, or stay query-only? Decide and document
361
  - [ ] **[Bonus]** Voice + air-writing conflict resolution. Capture short voice (Web Speech API), compare to air-written intent, send a `resolved_intent`
@@ -363,10 +366,11 @@ Heads up: all camera/sensing stuff is in the frontend (MediaPipe JS). Backend ju
363
 
364
  ### Intent decomposition
365
 
366
- > Current state: regex-splits the partner query on conjunctions/punctuation into fragments, then runs each fragment through a BGE zero-shot classifier (cosine vs. 5 seed exemplars per class). No LLM call, no retries. Runs in ~10–30ms per turn. Bucket hints for `PERSONAL` fragments come from a shared keyword helper in [backend/sensing/bucket_keywords.py](backend/sensing/bucket_keywords.py). Earlier versions used an LLM with Pydantic validation + 3 retries, which cost ~100s per turn on Ollama Cloud when the model emitted bad JSON.
367
 
368
  - [x] **[Core]** Personal / Contextual / Open-domain dispatch to distinct pools (personal → BGE vector store; contextual → persona memory + relevant in-session turns layered on top; open-domain → stub chunk, LLM answers from its own general knowledge; web search is intentionally out of scope).
369
  - [x] intent node latency: split + BGE zero-shot classifier replaces the LLM router. Parallelising sub-query retrieval is still open.
 
370
 
371
  ### Retrieval
372
 
 
353
 
354
  ### Sensing (frontend)
355
 
356
+ - [x] **[Core]** Head-nod / sharp tilt / head-shake = "I don't like that". Different from frustrated affect.
357
+ - [x] frontend `HeadPoseTracker` (deadband-filtered shake + sharp-nod-with-recovery), explicit calibrate button, live Δx/Δy debug readout in sidebar
358
+ - [x] dedicated `POST /chat/turnaround` endpoint reuses cached last-state: one extra LLM call, no full pipeline re-run
359
+ - [x] intent-aware turnaround: PERSONAL re-retrieves excluding the rejected bucket *and* exact rejected chunk texts (with `turnaround_min_score` floor; falls back to original chunks rather than degrading); PRESENT_STATE flips emotional read or admits uncertainty
360
+ - [x] UI: rejected bubble gets strikethrough + "rephrased" badge, new bubble appended with "↻ turnaround" badge; both stay visible (you can't unsay something to a partner). Manual "↻ Not quite right" button as fallback
361
+ - [x] guards: `turnaroundConsumedTurnRef` prevents self-retrigger loops; backend `turn_id` returned in `ChatResponse` so frontend doesn't desync on persona switch; stale-turn 409
362
  - [ ] **[Core]** Smile / positive affect should actually change the wording (more positive lexicon), not just be metadata. Right now it's annotated in the prompt but we never checked if the LLM is doing anything with it; probably need a stronger constraint or example in the prompt
363
  - [ ] **[Core]** Air-writing is treated as raw text appended to the query. Spec wants it as a stylistic constraint too: should it bias tone, or stay query-only? Decide and document
364
  - [ ] **[Bonus]** Voice + air-writing conflict resolution. Capture short voice (Web Speech API), compare to air-written intent, send a `resolved_intent`
 
366
 
367
  ### Intent decomposition
368
 
369
+ > Current state: regex-splits the partner query on conjunctions/punctuation into fragments, then runs each fragment through a BGE zero-shot classifier (cosine vs. seed exemplars per class). No LLM call, no retries. Runs in ~10–30ms per turn. Bucket hints for `PERSONAL` fragments come from a shared keyword helper in [backend/sensing/bucket_keywords.py](backend/sensing/bucket_keywords.py). Earlier versions used an LLM with Pydantic validation + 3 retries, which cost ~100s per turn on Ollama Cloud when the model emitted bad JSON.
370
 
371
  - [x] **[Core]** Personal / Contextual / Open-domain dispatch to distinct pools (personal → BGE vector store; contextual → persona memory + relevant in-session turns layered on top; open-domain → stub chunk, LLM answers from its own general knowledge; web search is intentionally out of scope).
372
  - [x] intent node latency: split + BGE zero-shot classifier replaces the LLM router. Parallelising sub-query retrieval is still open.
373
+ - [x] **[Core]** `PRESENT_STATE` intent class: questions about right-now state ("how are you feeling?", "are you tired?") used to fabricate confident answers from autobiographical memory (wrong by category, not just by wording). Now they skip retrieval entirely and the planner uses an affect-grounded prompt branch with explicit fallback to "I'm not sure" when the read is ambiguous. Margin guard demotes narrow PRESENT_STATE wins to PERSONAL (better to over-retrieve than to silently drop persona memories). Air-written supplements are classified the same way as a normal fragment, so a present-tense supplement on a PRESENT_STATE query no longer flips the route to PERSONAL.
374
 
375
  ### Retrieval
376
 
backend/api/main.py CHANGED
@@ -2,6 +2,7 @@
2
  from __future__ import annotations
3
 
4
  import json
 
5
 
6
  from fastapi import FastAPI, HTTPException
7
  from fastapi.middleware.cors import CORSMiddleware
@@ -15,9 +16,12 @@ from backend.generation.llm_client import ( # active_model used by /debug/confi
15
  )
16
  from backend.guardrails.checks import check_input
17
  from backend.pipeline.graph import run_pipeline
 
 
 
18
  from backend.pipeline.state import PipelineState
19
  from backend.retrieval.bucket_priors import uniform_priors
20
- from backend.retrieval.vector_store import _get_embedder
21
 
22
  app = FastAPI(
23
  title="Multimodal AAC Chatbot API",
@@ -32,6 +36,7 @@ app.add_middleware(
32
  allow_headers=["*"],
33
  )
34
 
 
35
  _models_ready = False
36
 
37
 
@@ -64,6 +69,13 @@ class ChatRequest(BaseModel):
64
  gesture_tag: str | None = None
65
  gaze_bucket: str | None = None
66
  air_written_text: str | None = None
 
 
 
 
 
 
 
67
 
68
 
69
  class EvalScoresResponse(BaseModel):
@@ -91,6 +103,7 @@ class ChatResponse(BaseModel):
91
  latency: dict
92
  guardrail_passed: bool
93
  eval_scores: EvalScoresResponse | None = None
 
94
 
95
 
96
  # ── Helpers ────────────────────────────────────────────────────────────────────
@@ -145,6 +158,8 @@ def _build_initial_state(req: ChatRequest, session: dict) -> PipelineState:
145
  gesture_tag=req.gesture_tag,
146
  gaze_bucket=req.gaze_bucket,
147
  air_written_text=req.air_written_text,
 
 
148
  raw_query=req.query,
149
  intent_route=None,
150
  generation_config=None,
@@ -168,6 +183,58 @@ def _build_initial_state(req: ChatRequest, session: dict) -> PipelineState:
168
  )
169
 
170
 
171
  # ── Routes ─────────────────────────────────────────────────────────────────────
172
 
173
 
@@ -222,6 +289,7 @@ def chat(req: ChatRequest):
222
  retrieval_mode="none",
223
  latency={},
224
  guardrail_passed=False,
 
225
  )
226
 
227
  session = _get_or_init_session(req.user_id)
@@ -232,6 +300,7 @@ def chat(req: ChatRequest):
232
  # Persist updated session state
233
  session["session_history"] = result["session_history"]
234
  session["bucket_priors"] = result["bucket_priors"]
 
235
 
236
  # Compute evaluation metrics
237
  affect_emotion = (result.get("affect") or {}).get("emotion", "NEUTRAL")
@@ -256,4 +325,101 @@ def chat(req: ChatRequest):
256
  latency=result.get("latency_log") or {},
257
  guardrail_passed=result.get("guardrail_passed", True),
258
  eval_scores=eval_scores,
259
  )
 
2
  from __future__ import annotations
3
 
4
  import json
5
+ import logging
6
 
7
  from fastapi import FastAPI, HTTPException
8
  from fastapi.middleware.cors import CORSMiddleware
 
16
  )
17
  from backend.guardrails.checks import check_input
18
  from backend.pipeline.graph import run_pipeline
19
+ from backend.pipeline.intent_kind import classify_intent_kind
20
+ from backend.pipeline.nodes import feedback as feedback_node
21
+ from backend.pipeline.nodes import planner as planner_node
22
  from backend.pipeline.state import PipelineState
23
  from backend.retrieval.bucket_priors import uniform_priors
24
+ from backend.retrieval.vector_store import _get_embedder, retrieve
25
 
26
  app = FastAPI(
27
  title="Multimodal AAC Chatbot API",
 
36
  allow_headers=["*"],
37
  )
38
 
39
+ _log = logging.getLogger(__name__)
40
  _models_ready = False
41
 
42
 
 
69
  gesture_tag: str | None = None
70
  gaze_bucket: str | None = None
71
  air_written_text: str | None = None
72
+ head_signal: str | None = None # "HEAD_SHAKE"|"HEAD_NOD_DISSATISFIED"
73
+
74
+
75
+ class TurnaroundRequest(BaseModel):
76
+ user_id: str
77
+ turn_id: int | None = None # optional guard against stale turnaround calls
78
+ head_signal: str | None = None
79
 
80
 
81
  class EvalScoresResponse(BaseModel):
 
103
  latency: dict
104
  guardrail_passed: bool
105
  eval_scores: EvalScoresResponse | None = None
106
+ turn_id: int
107
 
108
 
109
  # ── Helpers ────────────────────────────────────────────────────────────────────
 
158
  gesture_tag=req.gesture_tag,
159
  gaze_bucket=req.gaze_bucket,
160
  air_written_text=req.air_written_text,
161
+ head_signal=req.head_signal,
162
+ turnaround_triggered=False,
163
  raw_query=req.query,
164
  intent_route=None,
165
  generation_config=None,
 
183
  )
184
 
185
 
+ def _re_retrieve_excluding(
+     query: str,
+     user_id: str,
+     rejected_chunks: list[dict],
+ ) -> list[dict] | None:
+     """Pull fresh chunks for a turnaround, excluding the bucket and exact texts
+     of the rejected chunks.
+
+     Returns:
+         - list of chunks (passing min-score floor) when re-retrieval improved
+           on the rejected set
+         - None when re-retrieval should not be used (no signal, all dropped by
+           dedupe, or all below score floor); caller should keep original chunks
+     """
+     if not rejected_chunks:
+         return None
+     rejected_bucket = rejected_chunks[0].get("bucket")
+     rejected_texts = {c.get("text") for c in rejected_chunks if c.get("text")}
+     if not rejected_bucket:
+         return None
+
+     try:
+         # Pull a wider net (top_k * 2) so dedupe + bucket-exclusion still leaves
+         # enough candidates to fill rerank_k.
+         fresh = retrieve(
+             query=query,
+             user_id=user_id,
+             top_k=settings.retrieval_top_k * 2,
+             rerank_k=settings.retrieval_top_k * 2,
+             bucket_filter=None,
+         )
+     except Exception as exc:
+         _log.warning("turnaround re-retrieval failed: %r", exc)
+         return None
+
+     filtered = [
+         c
+         for c in fresh
+         if c.get("bucket") != rejected_bucket
+         and c.get("text") not in rejected_texts
+         and float(c.get("score", 0.0)) >= settings.turnaround_min_score
+     ]
+     if not filtered:
+         _log.info(
+             "turnaround re-retrieval found no chunks above score floor %.2f; "
+             "keeping original chunks",
+             settings.turnaround_min_score,
+         )
+         return None
+     return filtered[: settings.retrieval_rerank_k]
+
237
+
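
To see the filter's behaviour in isolation, here is a self-contained sketch that mirrors the list comprehension in `_re_retrieve_excluding` on hand-made chunks. The helper name `filter_fresh`, the bucket names, and the chunk texts and scores are made up for illustration. The `0.45` floor and `rerank_k=3` are the defaults shown in `settings.py`, and the real function additionally calls `retrieve`, which is stubbed out here by passing `fresh` directly.

```python
def filter_fresh(fresh, rejected_chunks, min_score=0.45, rerank_k=3):
    """Hypothetical stand-in for the filtering step: drop chunks from the
    rejected bucket, exact rejected texts, and anything under the score floor."""
    if not rejected_chunks:
        return None
    rejected_bucket = rejected_chunks[0].get("bucket")
    if not rejected_bucket:
        return None
    rejected_texts = {c.get("text") for c in rejected_chunks if c.get("text")}
    kept = [
        c
        for c in fresh
        if c.get("bucket") != rejected_bucket
        and c.get("text") not in rejected_texts
        and float(c.get("score", 0.0)) >= min_score
    ]
    # An empty result means "keep the original chunks", signalled by None.
    return kept[:rerank_k] or None


rejected = [{"bucket": "family", "text": "My sister lives in Leeds.", "score": 0.80}]
fresh = [
    {"bucket": "family", "text": "My brother is a chef.", "score": 0.70},  # same bucket
    {"bucket": "hobbies", "text": "I paint on weekends.", "score": 0.52},  # kept
    {"bucket": "work", "text": "I taught maths.", "score": 0.30},          # below floor
]
result = filter_fresh(fresh, rejected)
```

With this input only the `hobbies` chunk survives. If every candidate were filtered out, the function would return `None` and the caller would keep the original chunks.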
238
  # ── Routes ─────────────────────────────────────────────────────────────────────
239
 
240
 
 
289
  retrieval_mode="none",
290
  latency={},
291
  guardrail_passed=False,
292
+ turn_id=0,
293
  )
294
 
295
  session = _get_or_init_session(req.user_id)
 
300
  # Persist updated session state
301
  session["session_history"] = result["session_history"]
302
  session["bucket_priors"] = result["bucket_priors"]
303
+ session["last_state"] = result
304
 
305
  # Compute evaluation metrics
306
  affect_emotion = (result.get("affect") or {}).get("emotion", "NEUTRAL")
 
325
  latency=result.get("latency_log") or {},
326
  guardrail_passed=result.get("guardrail_passed", True),
327
  eval_scores=eval_scores,
328
+ turn_id=result["turn_id"],
329
+ )
330
+
331
+
+ @app.post("/chat/turnaround", response_model=ChatResponse)
+ def chat_turnaround(req: TurnaroundRequest):
+     if req.user_id not in _sessions:
+         raise HTTPException(status_code=404, detail="no active session")
+
+     session = _sessions[req.user_id]
+     last: PipelineState | None = session.get("last_state")
+     if last is None:
+         raise HTTPException(status_code=409, detail="no prior turn to rephrase")
+
+     if req.turn_id is not None and req.turn_id != last["turn_id"]:
+         raise HTTPException(status_code=409, detail="stale turn_id")
+
+     # feedback.run will re-append (partner, aac_user) for this turn, so strip
+     # both of those tail entries to avoid duplicating the partner line. The
+     # rejected aac_user text is also excluded from the re-plan context this way.
+     trimmed_history = list(last.get("session_history") or [])
+     if trimmed_history and trimmed_history[-1].get("role") == "aac_user":
+         trimmed_history.pop()
+     if trimmed_history and trimmed_history[-1].get("role") == "partner":
+         trimmed_history.pop()
+
+     intent_kind = classify_intent_kind(last.get("intent_route"))
+
+     gen_cfg = dict(last.get("generation_config") or {})
+     if intent_kind == "present_state":
+         gen_cfg["persona_mod"] = "present_state_retry"
+         gen_cfg["tone_tag"] = "[TONE:HONEST_UNCERTAIN]"
+     else:
+         gen_cfg["persona_mod"] = "reverse_stance"
+         gen_cfg.setdefault("tone_tag", "[TONE:CLARIFYING_REPHRASE]")
+
+     replan_state: PipelineState = dict(last)  # type: ignore[assignment]
+     replan_state["session_history"] = trimmed_history
+     replan_state["generation_config"] = gen_cfg
+     replan_state["head_signal"] = req.head_signal or last.get("head_signal")
+     replan_state["turnaround_triggered"] = True
+     replan_state["latency_log"] = {
+         "t_sensing": 0.0,
+         "t_intent": 0.0,
+         "t_retrieval": 0.0,
+         "t_generation": 0.0,
+         "t_total": 0.0,
+     }
+
+     # For PERSONAL turnarounds, pull fresh chunks excluding the bucket and
+     # exact texts of the rejected response; the same chunks would just produce
+     # the same wrong answer. _re_retrieve_excluding returns None when the
+     # fresh batch is no better than what we already had, in which case we
+     # keep the original chunks rather than degrade to lower-relevance ones.
+     if intent_kind == "memory":
+         fresh_chunks = _re_retrieve_excluding(
+             query=last["raw_query"],
+             user_id=last["user_id"],
+             rejected_chunks=last.get("retrieved_chunks") or [],
+         )
+         if fresh_chunks is not None:
+             replan_state["retrieved_chunks"] = fresh_chunks
+             replan_state["retrieval_mode_used"] = "turnaround_rebucket"
+
+     planner_update = planner_node.run_primary(replan_state)
+     replan_state.update(planner_update)  # type: ignore[typeddict-item]
+
+     feedback_update = feedback_node.run(replan_state)
+     replan_state.update(feedback_update)  # type: ignore[typeddict-item]
+
+     session["session_history"] = replan_state["session_history"]
+     session["bucket_priors"] = replan_state["bucket_priors"]
+     session["last_state"] = replan_state
+
+     affect_emotion = (replan_state.get("affect") or {}).get("emotion", "NEUTRAL")
+     eval_scores = compute_evals(
+         response=replan_state["selected_response"] or "",
+         chunks=replan_state.get("retrieved_chunks") or [],
+         latency_log=replan_state.get("latency_log") or {},
+         affect=affect_emotion,
+         gesture_tag=replan_state.get("gesture_tag"),
+         gaze_bucket=replan_state.get("gaze_bucket"),
+         slo_target=settings.slo_target_s,
+     )
+
+     return ChatResponse(
+         user_id=req.user_id,
+         query=replan_state["raw_query"],
+         response=replan_state["selected_response"] or "",
+         affect=affect_emotion,
+         llm_tier=replan_state.get("llm_tier_used", "unknown"),
+         llm_model=replan_state.get("llm_model_used", "unknown"),
+         retrieval_mode=replan_state.get("retrieval_mode_used", "unknown"),
+         latency=replan_state.get("latency_log") or {},
+         guardrail_passed=replan_state.get("guardrail_passed", True),
+         eval_scores=eval_scores,
+         turn_id=replan_state["turn_id"],
425
  )
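
The history-trimming step in `chat_turnaround` is easy to get wrong, so here is a small standalone sketch of it. The helper name `trim_rejected_turn` and the `text` key on history entries are assumptions for illustration; only the `role` values `partner` and `aac_user` come from the diff.

```python
def trim_rejected_turn(history):
    """Return a copy of the history without the last (partner, aac_user) pair,
    so the feedback node can re-append the pair for the rephrased turn."""
    trimmed = list(history or [])
    if trimmed and trimmed[-1].get("role") == "aac_user":
        trimmed.pop()
    if trimmed and trimmed[-1].get("role") == "partner":
        trimmed.pop()
    return trimmed


history = [
    {"role": "partner", "text": "Hi there"},
    {"role": "aac_user", "text": "Hello"},
    {"role": "partner", "text": "How are you feeling?"},
    {"role": "aac_user", "text": "Great!"},  # the rejected reply
]
trimmed = trim_rejected_turn(history)
```

Because the function works on a copy, the cached `last_state` history is left untouched until the endpoint writes the re-planned state back to the session.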
backend/config/settings.py CHANGED
@@ -20,6 +20,10 @@ class Settings(BaseSettings):
20
  retrieval_top_k: int = 5
21
  retrieval_rerank_k: int = 3
22
  retrieval_fast_k: int = 2 # used when affect == FRUSTRATED
 
 
 
 
23
 
24
  # LLM tiers: both hit Ollama Cloud via OpenAI-compatible endpoint.
25
  # Same model on both tiers for now; swap one when a larger cloud model
 
20
  retrieval_top_k: int = 5
21
  retrieval_rerank_k: int = 3
22
  retrieval_fast_k: int = 2 # used when affect == FRUSTRATED
23
+ # Minimum cosine score for a chunk to be used in turnaround re-retrieval.
24
+ # Below this, we'd rather fall back to original chunks than serve clearly
25
+ # off-topic memories just to "look different."
26
+ turnaround_min_score: float = 0.45
27
 
28
  # LLM tiers: both hit Ollama Cloud via OpenAI-compatible endpoint.
29
  # Same model on both tiers for now; swap one when a larger cloud model
backend/pipeline/intent_kind.py ADDED
@@ -0,0 +1,28 @@
+ """Shared intent-kind classification, used by retrieval, planner, and the
+ turnaround endpoint to decide whether a turn is memory-grounded or about
+ the user's present state.
+
+ Centralized here so all three callers stay in lock-step. If you add a new
+ intent type that should bypass retrieval, this is the only file to update.
+ """
+
+
+ def classify_intent_kind(route: dict | None) -> str:
+     """Return "present_state" if every sub-intent is PRESENT_STATE; "memory" otherwise.
+
+     Mixed routes fall through to "memory"; the memory path is the safer default
+     because it still allows the model to use chunks if any are present.
+     """
+     if not route:
+         return "memory"
+     sub_intents = route.get("sub_intents") or []
+     if not sub_intents:
+         return "memory"
+     if all(si.get("type") == "PRESENT_STATE" for si in sub_intents):
+         return "present_state"
+     return "memory"
+
+
+ def is_present_state_only(route: dict | None) -> bool:
+     """Convenience wrapper for retrieval node; returns True iff route is purely PRESENT_STATE."""
+     return classify_intent_kind(route) == "present_state"
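
As a quick check of the routing rule, the snippet below repeats `classify_intent_kind` as it appears in the diff and runs it on a few routes. The example routes are invented; only the `sub_intents` and `type` keys are taken from the code above.

```python
from __future__ import annotations


def classify_intent_kind(route: dict | None) -> str:
    """Same logic as backend/pipeline/intent_kind.py in this commit."""
    if not route:
        return "memory"
    sub_intents = route.get("sub_intents") or []
    if not sub_intents:
        return "memory"
    if all(si.get("type") == "PRESENT_STATE" for si in sub_intents):
        return "present_state"
    return "memory"


pure = {"sub_intents": [{"type": "PRESENT_STATE"}, {"type": "PRESENT_STATE"}]}
mixed = {"sub_intents": [{"type": "PRESENT_STATE"}, {"type": "PERSONAL"}]}

kinds = [
    classify_intent_kind(pure),   # every sub-intent is PRESENT_STATE
    classify_intent_kind(mixed),  # one PERSONAL sub-intent keeps retrieval on
    classify_intent_kind(None),   # no route at all
]
```

Only the pure route bypasses retrieval. A single non-`PRESENT_STATE` sub-intent, or a missing route, keeps the turn on the memory path.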
backend/pipeline/nodes/feedback.py CHANGED
@@ -43,6 +43,8 @@ def _log_to_jsonl(state: PipelineState, run_id: str) -> None:
43
  "llm_tier": state.get("llm_tier_used", "unknown"),
44
  "retrieval_mode": state.get("retrieval_mode_used", "unknown"),
45
  "affect": affect,
 
 
46
  "guardrail_passed": state.get("guardrail_passed", True),
47
  "num_chunks": len(chunks),
48
  "num_personal": sum(
 
43
  "llm_tier": state.get("llm_tier_used", "unknown"),
44
  "retrieval_mode": state.get("retrieval_mode_used", "unknown"),
45
  "affect": affect,
46
+ "head_signal": state.get("head_signal"),
47
+ "turnaround_triggered": state.get("turnaround_triggered", False),
48
  "guardrail_passed": state.get("guardrail_passed", True),
49
  "num_chunks": len(chunks),
50
  "num_personal": sum(
backend/pipeline/nodes/intent.py CHANGED
@@ -19,11 +19,20 @@ from backend.sensing.bucket_keywords import infer_bucket
19
 
20
  _CLASS_EXEMPLARS: dict[str, list[str]] = {
21
  "PERSONAL": [
22
- "how are you today",
23
  "what is your favourite food",
24
  "tell me about your family",
25
- "what do you do in the mornings",
26
- "did you enjoy the weekend",
 
 
 
 
 
 
 
 
 
 
27
  ],
28
  "CONTEXTUAL": [
29
  "what did you just say",
@@ -47,6 +56,10 @@ _CLASSIFIER_THRESHOLD = (
47
  _CONTEXTUAL_MARGIN_MIN = (
48
  0.08  # CONTEXTUAL must beat runner-up by at least this; it over-matches without it
49
  )
 
 
 
 
50
  _MIN_FRAGMENT_WORDS = 3
51
  _MAX_FRAGMENTS = 4
52
 
@@ -150,6 +163,15 @@ def _classify(fragment: str) -> str:
150
  if margin < _CONTEXTUAL_MARGIN_MIN or not has_discourse_marker:
151
  return "PERSONAL"
152
 
 
 
 
 
 
 
 
 
 
153
  return best_cls
154
 
155
 
@@ -193,11 +215,17 @@ def run(state: PipelineState) -> dict:
193
 
194
  air_written = state.get("air_written_text")
195
  if air_written:
 
 
 
 
196
  sub_intents.append(
197
  {
198
- "type": "PERSONAL",
199
  "query": air_written,
200
- "bucket_hint": infer_bucket(air_written),
 
 
201
  "priority": priority,
202
  }
203
  )
 
19
 
20
  _CLASS_EXEMPLARS: dict[str, list[str]] = {
21
  "PERSONAL": [
 
22
  "what is your favourite food",
23
  "tell me about your family",
24
+ "what do you do for work",
25
+ "did you grow up around here",
26
+ "what was your childhood like",
27
+ ],
28
+ "PRESENT_STATE": [
29
+ "how are you feeling today",
30
+ "are you tired right now",
31
+ "what are you doing at the moment",
32
+ "did you sleep well last night",
33
+ "are you in pain today",
34
+ "how is your day going",
35
+ "are you having a good day",
36
  ],
37
  "CONTEXTUAL": [
38
  "what did you just say",
 
56
  _CONTEXTUAL_MARGIN_MIN = (
57
  0.08  # CONTEXTUAL must beat runner-up by at least this; it over-matches without it
58
  )
59
+ _PRESENT_STATE_MARGIN_MIN = (
60
+ 0.05 # PRESENT_STATE skips retrieval, so a narrow win against PERSONAL would silently
61
+ # drop persona memories. Require a clear margin before going down that path.
62
+ )
63
  _MIN_FRAGMENT_WORDS = 3
64
  _MAX_FRAGMENTS = 4
65
 
 
163
  if margin < _CONTEXTUAL_MARGIN_MIN or not has_discourse_marker:
164
  return "PERSONAL"
165
 
166
+ # PRESENT_STATE skips retrieval entirely, so a narrow win over PERSONAL
167
+ # would silently drop persona memories with no recovery path. Demote to
168
+ # PERSONAL if the win isn't decisive; better to over-retrieve than to
169
+ # answer a personal question with no chunks.
170
+ if best_cls == "PRESENT_STATE":
171
+ margin = best_score - runner_up_score
172
+ if margin < _PRESENT_STATE_MARGIN_MIN:
173
+ return "PERSONAL"
174
+
175
  return best_cls
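
The margin guard can be illustrated on its own with made-up similarity scores. This sketch covers only the `PRESENT_STATE` demotion. The real `_classify` also applies a classifier threshold and the `CONTEXTUAL` checks, which are omitted here. The helper name `apply_present_state_guard` and the score values are hypothetical, while `0.05` is the `_PRESENT_STATE_MARGIN_MIN` value from the diff.

```python
PRESENT_STATE_MARGIN_MIN = 0.05  # value of _PRESENT_STATE_MARGIN_MIN in the diff


def apply_present_state_guard(scores, margin_min=PRESENT_STATE_MARGIN_MIN):
    """Pick the top-scoring class, but demote a narrow PRESENT_STATE win to
    PERSONAL so retrieval still runs when the classifier is unsure."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_cls, best_score = ranked[0]
    runner_up_score = ranked[1][1]
    if best_cls == "PRESENT_STATE" and best_score - runner_up_score < margin_min:
        return "PERSONAL"
    return best_cls


narrow = {"PRESENT_STATE": 0.71, "PERSONAL": 0.69, "CONTEXTUAL": 0.40}
clear = {"PRESENT_STATE": 0.80, "PERSONAL": 0.60, "CONTEXTUAL": 0.40}
narrow_result = apply_present_state_guard(narrow)
clear_result = apply_present_state_guard(clear)
```

In the narrow case the margin is about 0.02, so the fragment is routed as `PERSONAL`. In the clear case the margin is 0.20 and `PRESENT_STATE` stands.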
176
 
177
 
 
215
 
216
  air_written = state.get("air_written_text")
217
  if air_written:
218
+ # Classify the air-written supplement the same way as a normal fragment
219
+ # so a present-tense supplement ("tired") on a present-state question
220
+ # doesn't silently flip the route to PERSONAL and re-enable retrieval.
221
+ air_cls = _classify(air_written)
222
  sub_intents.append(
223
  {
224
+ "type": air_cls,
225
  "query": air_written,
226
+ "bucket_hint": infer_bucket(air_written)
227
+ if air_cls == "PERSONAL"
228
+ else None,
229
  "priority": priority,
230
  }
231
  )
backend/pipeline/nodes/planner.py CHANGED
@@ -6,6 +6,7 @@ import time
6
  from backend.config.settings import settings
7
  from backend.generation.llm_client import active_model, chat_complete
8
  from backend.guardrails.checks import check_output
 
9
  from backend.pipeline.state import PipelineState
10
  from backend.sensing.labels import GESTURE_TO_TAG
11
 
@@ -53,6 +54,11 @@ def _run(state: PipelineState, tier: str) -> dict:
53
  )
54
  gesture_tag = state.get("gesture_tag")
55
  air_written_text = state.get("air_written_text")
 
 
 
 
 
56
  messages = _build_messages(
57
  profile,
58
  chunks,
@@ -62,6 +68,9 @@ def _run(state: PipelineState, tier: str) -> dict:
62
  gen_cfg,
63
  gesture_tag=gesture_tag,
64
  air_written_text=air_written_text,
 
 
 
65
  )
66
 
67
  selected = chat_complete(
@@ -102,6 +111,14 @@ def _resolve_tone_tag(user_id: str, affect: str, default_tag: str) -> str:
102
  return _PERSONA_TONE_OVERRIDES.get(user_id, {}).get(affect, default_tag)
103
 
104
 
 
 
 
 
 
 
 
 
105
  def _build_messages(
106
  profile: dict,
107
  chunks: list[dict],
@@ -111,6 +128,9 @@ def _build_messages(
111
  gen_cfg: dict,
112
  gesture_tag: str | None = None,
113
  air_written_text: str | None = None,
 
 
 
114
  ) -> list[dict]:
115
  # Split into a stable system message (same per persona; gets cached by the
116
  # provider) and a turn-specific user message. Anything that changes per
@@ -126,6 +146,9 @@ def _build_messages(
126
  gesture_tag,
127
  air_written_text,
128
  profile["name"],
 
 
 
129
  )
130
  return [
131
  {"role": "system", "content": system_content},
@@ -178,6 +201,24 @@ _PERSONA_MOD_INSTRUCTIONS = {
178
  "suppress_humor": "Be direct and supportive. Suppress humor.",
179
  "baseline": "Use your natural communication style.",
180
  "add_confirmation": "Add a clarifying question or confirmation at the end.",
181
  }
182
 
183
 
@@ -190,6 +231,10 @@ def _build_user(
190
  gesture_tag: str | None,
191
  air_written_text: str | None,
192
  persona_name: str,
 
 
 
 
193
  ) -> str:
194
  personal_chunks = [c for c in chunks if c.get("source", "personal") == "personal"]
195
  contextual_chunks = [c for c in chunks if c.get("source") == "contextual"]
@@ -231,8 +276,38 @@ def _build_user(
231
  _PERSONA_MOD_INSTRUCTIONS["baseline"],
232
  )
233
 
234
  return f"""\
235
- {tone_tag}{gesture_line}{air_writing_line}
236
  {persona_instruction}
237
 
238
  Personal memories:
 
6
  from backend.config.settings import settings
7
  from backend.generation.llm_client import active_model, chat_complete
8
  from backend.guardrails.checks import check_output
9
+ from backend.pipeline.intent_kind import classify_intent_kind
10
  from backend.pipeline.state import PipelineState
11
  from backend.sensing.labels import GESTURE_TO_TAG
12
 
 
54
  )
55
  gesture_tag = state.get("gesture_tag")
56
  air_written_text = state.get("air_written_text")
57
+ turnaround_triggered = state.get("turnaround_triggered", False)
58
+ rejected_response: str | None = None
59
+ if turnaround_triggered:
60
+ rejected_response = state.get("selected_response")
61
+ intent_kind = classify_intent_kind(state.get("intent_route"))
62
  messages = _build_messages(
63
  profile,
64
  chunks,
 
68
  gen_cfg,
69
  gesture_tag=gesture_tag,
70
  air_written_text=air_written_text,
71
+ rejected_response=rejected_response,
72
+ intent_kind=intent_kind,
73
+ affect=affect,
74
  )
75
 
76
  selected = chat_complete(
 
111
  return _PERSONA_TONE_OVERRIDES.get(user_id, {}).get(affect, default_tag)
112
 
113
 
114
+ _AFFECT_HINTS = {
115
+ "HAPPY": "You currently feel positive - light, content, energetic.",
116
+ "FRUSTRATED": "You currently feel frustrated - tired, irritable, or off.",
117
+ "SURPRISED": "You currently feel surprised or caught off-guard.",
118
+ "NEUTRAL": "Your current state is unclear from the affect signal.",
119
+ }
120
+
121
+
122
  def _build_messages(
123
  profile: dict,
124
  chunks: list[dict],
 
128
  gen_cfg: dict,
129
  gesture_tag: str | None = None,
130
  air_written_text: str | None = None,
131
+ rejected_response: str | None = None,
132
+ intent_kind: str = "memory",
133
+ affect: str = "NEUTRAL",
134
  ) -> list[dict]:
135
  # Split into a stable system message (same per persona; gets cached by the
136
  # provider) and a turn-specific user message. Anything that changes per
 
146
  gesture_tag,
147
  air_written_text,
148
  profile["name"],
149
+ rejected_response=rejected_response,
150
+ intent_kind=intent_kind,
151
+ affect=affect,
152
  )
153
  return [
154
  {"role": "system", "content": system_content},
 
201
  "suppress_humor": "Be direct and supportive. Suppress humor.",
202
  "baseline": "Use your natural communication style.",
203
  "add_confirmation": "Add a clarifying question or confirmation at the end.",
204
+ "turnaround": (
205
+ "Your previous reply missed what you actually meant. Rephrase "
206
+ "more directly - change the wording meaningfully, not just "
207
+ "surface tweaks - and end with a one-sentence clarifying "
208
+ "question to confirm you're on the right track."
209
+ ),
210
+ "reverse_stance": (
211
+ "Your previous reply was substantively wrong - not poorly worded, "
212
+ "but the wrong content. Take a meaningfully different stance using "
213
+ "the available memories or, if none fit, honestly say you don't "
214
+ "know. Do NOT just reword the previous reply."
215
+ ),
216
+ "present_state_retry": (
217
+ "Your previous reply was wrong about your current state. The "
218
+ "affect signal probably misled you. Either flip the emotional "
219
+ "read (if you said 'good', try 'not great') or honestly admit "
220
+ "you're not sure how you feel right now. Do NOT invent details."
221
+ ),
222
  }
223
 
224
 
 
231
  gesture_tag: str | None,
232
  air_written_text: str | None,
233
  persona_name: str,
234
+ *,
235
+ rejected_response: str | None = None,
236
+ intent_kind: str = "memory",
237
+ affect: str = "NEUTRAL",
238
  ) -> str:
239
  personal_chunks = [c for c in chunks if c.get("source", "personal") == "personal"]
240
  contextual_chunks = [c for c in chunks if c.get("source") == "contextual"]
 
276
  _PERSONA_MOD_INSTRUCTIONS["baseline"],
277
  )
278
 
279
+ turnaround_line = ""
280
+ if rejected_response:
281
+ safe_rejected = rejected_response.replace('"', "'").replace("\n", " ")[:300]
282
+ turnaround_line = (
283
+ f"\nYour previous reply (which you need to replace, not repeat): "
284
+ f'"{safe_rejected}"'
285
+ )
286
+
287
+ if intent_kind == "present_state":
288
+ affect_hint = _AFFECT_HINTS.get(affect, _AFFECT_HINTS["NEUTRAL"])
289
+ return f"""\
290
+ {tone_tag}{gesture_line}{air_writing_line}{turnaround_line}
291
+ {persona_instruction}
292
+
293
+ The partner is asking about your present state (right now, today).
294
+ Your autobiographical memories do NOT contain this - do not fabricate details from them.
295
+
296
+ Current affect read: {affect}
297
+ {affect_hint}
298
+
299
+ Recent conversation:
300
+ {history_block}
301
+
302
+ Partner just said: {query}
303
+
304
+ Reply as {persona_name} in 1–2 sentences, first person.
305
+ - Ground the answer in the affect read above and recent conversation only.
306
+ - If the affect read is NEUTRAL or doesn't match what you'd say, it's better to say "I'm not sure" or "honestly, I don't really know right now" than to invent.
307
+ - Do NOT use autobiographical facts (job, family, hobbies) unless the partner asked."""
308
+
309
  return f"""\
310
+ {tone_tag}{gesture_line}{air_writing_line}{turnaround_line}
311
  {persona_instruction}
312
 
313
  Personal memories:
backend/pipeline/nodes/retrieval.py CHANGED
@@ -4,6 +4,7 @@ from __future__ import annotations
4
  import time
5
 
6
  from backend.config.settings import settings
 
7
  from backend.pipeline.state import PipelineState, RetrievedChunk, SubIntent
8
  from backend.retrieval.contextual import retrieve_from_history
9
  from backend.retrieval.vector_store import retrieve
@@ -16,6 +17,8 @@ _OPEN_DOMAIN_STUB_TEXT = (
16
  def run_fast(state: PipelineState) -> dict:
17
  """Fast retrieval path for FRUSTRATED affect (k=2, no reranker)."""
18
  t0 = time.perf_counter()
 
 
19
  chunks = _dispatch_all(state, per_intent_k=settings.retrieval_fast_k)
20
  return _build_return(state, chunks, "fast", t0)
21
 
@@ -23,6 +26,8 @@ def run_fast(state: PipelineState) -> dict:
23
  def run_full(state: PipelineState) -> dict:
24
  """Full retrieval path: top_k cosine matches narrowed to rerank_k."""
25
  t0 = time.perf_counter()
 
 
26
  chunks = _dispatch_all(state, per_intent_k=settings.retrieval_rerank_k)
27
  return _build_return(state, chunks, "full", t0)
28
 
@@ -50,6 +55,12 @@ def _dispatch_all(state: PipelineState, per_intent_k: int) -> list[RetrievedChun
50
  merged.extend(_retrieve_contextual(sub, state, per_intent_k))
51
  elif kind == "OPEN_DOMAIN":
52
  merged.extend(_retrieve_open_domain(sub))
 
 
 
 
 
 
53
  else:
54
  merged.extend(_retrieve_personal(sub, state, per_intent_k))
55
 
 
4
  import time
5
 
6
  from backend.config.settings import settings
7
+ from backend.pipeline.intent_kind import is_present_state_only
8
  from backend.pipeline.state import PipelineState, RetrievedChunk, SubIntent
9
  from backend.retrieval.contextual import retrieve_from_history
10
  from backend.retrieval.vector_store import retrieve
 
17
  def run_fast(state: PipelineState) -> dict:
18
  """Fast retrieval path for FRUSTRATED affect (k=2, no reranker)."""
19
  t0 = time.perf_counter()
20
+ if is_present_state_only(state.get("intent_route")):
21
+ return _build_return(state, [], "skipped_present_state", t0)
22
  chunks = _dispatch_all(state, per_intent_k=settings.retrieval_fast_k)
23
  return _build_return(state, chunks, "fast", t0)
24
 
 
26
  def run_full(state: PipelineState) -> dict:
27
  """Full retrieval path: top_k cosine matches narrowed to rerank_k."""
28
  t0 = time.perf_counter()
29
+ if is_present_state_only(state.get("intent_route")):
30
+ return _build_return(state, [], "skipped_present_state", t0)
31
  chunks = _dispatch_all(state, per_intent_k=settings.retrieval_rerank_k)
32
  return _build_return(state, chunks, "full", t0)
33
 
 
55
  merged.extend(_retrieve_contextual(sub, state, per_intent_k))
56
  elif kind == "OPEN_DOMAIN":
57
  merged.extend(_retrieve_open_domain(sub))
58
+ elif kind == "PRESENT_STATE":
59
+ # PRESENT_STATE is grounded in the affect signal, not memory.
60
+ # In a pure-present-state route the run_fast/run_full early skip
61
+ # already short-circuits us; in a mixed route we just contribute
62
+ # nothing here so the planner doesn't see misleading chunks.
63
+ continue
64
  else:
65
  merged.extend(_retrieve_personal(sub, state, per_intent_k))
66
 
backend/pipeline/state.py CHANGED
@@ -31,7 +31,7 @@ class RetrievedChunk(TypedDict):
31
 
32
 
33
  class SubIntent(TypedDict):
34
- type: str # "PERSONAL" | "CONTEXTUAL" | "OPEN_DOMAIN"
35
  query: str
36
  bucket_hint: str | None
37
  priority: str # "fast" | "normal"
@@ -47,9 +47,10 @@ class GenerationConfig(TypedDict):
47
  max_tokens: int
48
  tone_tag: str # e.g. "[TONE:WITTY_SARCASTIC]"
49
  retrieval_mode: str # "fast" | "full"
50
- persona_mod: (
51
- str # "amplify_quirks" | "suppress_humor" | "baseline" | "add_confirmation"
52
- )
 
53
 
54
 
55
  class LatencyLog(TypedDict):
@@ -75,6 +76,8 @@ class PipelineState(TypedDict):
75
  gesture_tag: str | None # e.g. "THUMBS_UP"
76
  gaze_bucket: str | None # bucket hinted by gaze fixation
77
  air_written_text: str | None # concatenated air-written chars
 
 
78
 
79
  # ── L2: Intent decomposition outputs ─────────────────────────────────────
80
  raw_query: str # partner's typed/spoken query
 
31
 
32
 
33
  class SubIntent(TypedDict):
34
+ type: str # "PERSONAL" | "CONTEXTUAL" | "PRESENT_STATE" | "OPEN_DOMAIN"
35
  query: str
36
  bucket_hint: str | None
37
  priority: str # "fast" | "normal"
 
47
  max_tokens: int
48
  tone_tag: str # e.g. "[TONE:WITTY_SARCASTIC]"
49
  retrieval_mode: str # "fast" | "full"
50
+ persona_mod: str
51
+ # persona_mod values:
52
+ # "amplify_quirks" | "suppress_humor" | "baseline"
53
+ # | "add_confirmation" | "turnaround"
54
 
55
 
56
  class LatencyLog(TypedDict):
 
76
  gesture_tag: str | None # e.g. "THUMBS_UP"
77
  gaze_bucket: str | None # bucket hinted by gaze fixation
78
  air_written_text: str | None # concatenated air-written chars
79
+ head_signal: str | None # "HEAD_SHAKE" | "HEAD_NOD_DISSATISFIED"
80
+ turnaround_triggered: bool # true when re-planned from dissatisfaction signal
81
 
82
  # ── L2: Intent decomposition outputs ─────────────────────────────────────
83
  raw_query: str # partner's typed/spoken query
frontend/src/App.css CHANGED
@@ -343,6 +343,75 @@ input[type="text"]:hover {
343
  font-size: 13px;
344
  }
345

346
  .eval-panel {
347
  margin-top: 10px;
348
  border-top: 1px solid var(--border);
 
343
  font-size: 13px;
344
  }
345
 
346
+ .sensing-debug {
347
+ font-family: var(--mono, monospace);
348
+ font-size: 11px;
349
+ opacity: 0.7;
350
+ }
351
+
352
+ .calibrate-btn {
353
+ margin-top: 8px;
354
+ width: 100%;
355
+ padding: 8px 12px;
356
+ font-size: 13px;
357
+ font-family: var(--sans);
358
+ background: transparent;
359
+ color: var(--accent);
360
+ border: 1px solid var(--accent);
361
+ border-radius: 8px;
362
+ cursor: pointer;
363
+ }
364
+
365
+ .calibrate-btn:disabled {
366
+ opacity: 0.45;
367
+ cursor: not-allowed;
368
+ }
369
+
370
+ .calibrate-btn:hover:not(:disabled) {
371
+ background: var(--accent);
372
+ color: #ffffff;
373
+ }
374
+
375
+ .chat-bubble.rephrased p {
376
+ opacity: 0.6;
377
+ text-decoration: line-through;
378
+ }
379
+
380
+ .chat-bubble.turnaround {
381
+ border-left: 3px solid var(--accent);
382
+ }
383
+
384
+ .badge {
385
+ display: inline-block;
386
+ margin-left: 6px;
387
+ padding: 1px 6px;
388
+ font-size: 11px;
389
+ font-weight: 500;
390
+ border-radius: 10px;
391
+ vertical-align: middle;
392
+ }
393
+
394
+ .badge-rephrased {
395
+ background: rgba(0, 0, 0, 0.08);
396
+ color: var(--fail);
397
+ }
398
+
399
+ .badge-turnaround {
400
+ background: var(--accent);
401
+ color: #ffffff;
402
+ }
403
+
404
+ .turnaround-btn {
405
+ background: transparent !important;
406
+ color: var(--accent) !important;
407
+ border: 1px solid var(--accent) !important;
408
+ }
409
+
410
+ .turnaround-btn:hover:not(:disabled) {
411
+ background: var(--accent) !important;
412
+ color: #ffffff !important;
413
+ }
414
+
415
  .eval-panel {
416
  margin-top: 10px;
417
  border-top: 1px solid var(--border);
frontend/src/App.tsx CHANGED
@@ -32,8 +32,17 @@ function App() {
32
  return () => clearInterval(healthPoll.current);
33
  }, []);
34
 
35
- const { sensing, ready, initError, init, processFrame, clearAirWrittenText, resetCalibration } =
36
- useSensing();
 
 
 
 
 
 
 
 
 
37
 
38
  const onFrame = useCallback(
39
  (video: HTMLVideoElement, timestamp: number) => {
@@ -89,6 +98,16 @@ function App() {
89
  </label>
90
  <WebcamSensing videoRef={videoRef} active={active} error={error || initError} />
91
  <SensingStatus sensing={sensing} webcamActive={active} />
 
 
 
 
 
 
 
 
 
 
92
  </div>
93
 
94
  <div className="sidebar-section">
@@ -120,6 +139,7 @@ function App() {
120
  sensing={sensing}
121
  affectOverride={affectOverride}
122
  onAirTextConsumed={clearAirWrittenText}
 
123
  messages={messages}
124
  setMessages={setMessages}
125
  onLatency={setLatency}
 
32
  return () => clearInterval(healthPoll.current);
33
  }, []);
34
 
35
+ const {
36
+ sensing,
37
+ ready,
38
+ initError,
39
+ init,
40
+ processFrame,
41
+ clearAirWrittenText,
42
+ clearHeadSignal,
43
+ calibrateHeadPose,
44
+ resetCalibration,
45
+ } = useSensing();
46
 
47
  const onFrame = useCallback(
48
  (video: HTMLVideoElement, timestamp: number) => {
 
98
  </label>
99
  <WebcamSensing videoRef={videoRef} active={active} error={error || initError} />
100
  <SensingStatus sensing={sensing} webcamActive={active} />
101
+ <button
102
+ type="button"
103
+ className="calibrate-btn"
104
+ disabled={!active}
105
+ onClick={() => calibrateHeadPose()}
106
+ >
107
+ {sensing.headCalibrated
108
+ ? "Re-calibrate head pose"
109
+ : "Calibrate head pose"}
110
+ </button>
111
  </div>
112
 
113
  <div className="sidebar-section">
 
139
  sensing={sensing}
140
  affectOverride={affectOverride}
141
  onAirTextConsumed={clearAirWrittenText}
142
+ onHeadSignalConsumed={clearHeadSignal}
143
  messages={messages}
144
  setMessages={setMessages}
145
  onLatency={setLatency}
frontend/src/components/ChatPanel.tsx CHANGED
@@ -1,6 +1,6 @@
1
- import { useState, useRef, useEffect } from "react";
2
  import type { ChatMessage, SensingState, Affect, LatencyLog } from "../types";
3
- import { sendChat } from "../lib/api";
4
  import { EvalPanel } from "./EvalPanel";
5
 
6
  interface Props {
@@ -9,18 +9,22 @@ interface Props {
9
  sensing: SensingState;
10
  affectOverride: Affect | null;
11
  onAirTextConsumed: () => void;
 
12
  messages: ChatMessage[];
13
  setMessages: React.Dispatch<React.SetStateAction<ChatMessage[]>>;
14
  onLatency: (latency: LatencyLog) => void;
15
  backendReady: boolean;
16
  }
17
 
 
 
18
  export function ChatPanel({
19
  userId,
20
  personaName,
21
  sensing,
22
  affectOverride,
23
  onAirTextConsumed,
 
24
  messages,
25
  setMessages,
26
  onLatency,
@@ -28,12 +32,120 @@ export function ChatPanel({
28
  }: Props) {
29
  const [input, setInput] = useState("");
30
  const [loading, setLoading] = useState(false);
 
31
  const bottomRef = useRef<HTMLDivElement>(null);
 
 
 
 
 
 
32
 
33
  useEffect(() => {
34
  bottomRef.current?.scrollIntoView({ behavior: "smooth" });
35
  }, [messages]);
36

37
  async function handleSend() {
38
  if (!input.trim() || !userId || !backendReady || loading) return;
39
 
@@ -51,8 +163,10 @@ export function ChatPanel({
51
  gesture_tag: sensing.gestureTag,
52
  gaze_bucket: sensing.gazeBucket,
53
  air_written_text: airText,
 
54
  });
55
 
 
56
  setMessages((prev) => [
57
  ...prev,
58
  {
@@ -61,9 +175,11 @@ export function ChatPanel({
61
  latency: res.latency,
62
  affect: res.affect,
63
  evalScores: res.eval_scores,
 
64
  },
65
  ]);
66
  onLatency(res.latency);
 
67
  } catch (e) {
68
  setMessages((prev) => [
69
  ...prev,
@@ -78,6 +194,13 @@ export function ChatPanel({
78
  }
79
  }
80
 
 
 
 
 
 
 
 
81
  return (
82
  <div className="chat-panel">
83
  <div className="chat-header">
@@ -85,9 +208,20 @@ export function ChatPanel({
85
  </div>
86
  <div className="chat-messages">
87
  {messages.map((msg, i) => (
88
- <div key={i} className={`chat-bubble ${msg.role}`}>
 
 
 
 
 
89
  <span className="chat-role">
90
  {msg.role === "partner" ? "Partner" : "AAC User"}
 
 
 
 
 
 
91
  </span>
92
  <p>{msg.content}</p>
93
  {msg.role === "aac_user" && msg.evalScores && (
@@ -101,6 +235,12 @@ export function ChatPanel({
101
  <p>Generating...</p>
102
  </div>
103
  )}
 
 
 
 
 
 
104
  <div ref={bottomRef} />
105
  </div>
106
  <div className="chat-input-row">
@@ -115,6 +255,15 @@ export function ChatPanel({
115
  <button onClick={handleSend} disabled={!userId || loading || !backendReady || !input.trim()}>
116
  Send
117
  </button>
 
 
 
 
 
 
 
 
 
118
  </div>
119
  </div>
120
  );
 
1
+ import { useState, useRef, useEffect, useCallback } from "react";
2
  import type { ChatMessage, SensingState, Affect, LatencyLog } from "../types";
3
+ import { sendChat, sendTurnaround } from "../lib/api";
4
  import { EvalPanel } from "./EvalPanel";
5
 
6
  interface Props {
 
9
  sensing: SensingState;
10
  affectOverride: Affect | null;
11
  onAirTextConsumed: () => void;
12
+ onHeadSignalConsumed: () => void;
13
  messages: ChatMessage[];
14
  setMessages: React.Dispatch<React.SetStateAction<ChatMessage[]>>;
15
  onLatency: (latency: LatencyLog) => void;
16
  backendReady: boolean;
17
  }
18
 
19
+ const TURNAROUND_WINDOW_MS = 5000;
20
+
21
  export function ChatPanel({
22
  userId,
23
  personaName,
24
  sensing,
25
  affectOverride,
26
  onAirTextConsumed,
27
+ onHeadSignalConsumed,
28
  messages,
29
  setMessages,
30
  onLatency,
 
32
  }: Props) {
33
  const [input, setInput] = useState("");
34
  const [loading, setLoading] = useState(false);
35
+ const [turnaroundLoading, setTurnaroundLoading] = useState(false);
36
  const bottomRef = useRef<HTMLDivElement>(null);
37
+ const lastResponseTsRef = useRef<number>(0);
38
+ const lastTurnIdRef = useRef<number | null>(null);
39
+ // turn_id of the most recent turn that was already turned around; guards
40
+ // against the new turnaround bubble's own head-signal re-firing turnaround
41
+ // on itself.
42
+ const turnaroundConsumedTurnRef = useRef<number | null>(null);
43
 
44
  useEffect(() => {
45
  bottomRef.current?.scrollIntoView({ behavior: "smooth" });
46
  }, [messages]);
47
 
48
+ // Reset per-turn state when the persona changes (parent clears `messages`
49
+ // and resets the backend session; the frontend turn counter must follow).
50
+ useEffect(() => {
51
+ lastTurnIdRef.current = null;
52
+ turnaroundConsumedTurnRef.current = null;
53
+ lastResponseTsRef.current = 0;
54
+ }, [userId]);
55
+
56
+ const handleTurnaround = useCallback(
57
+ async (reason: "head" | "manual") => {
58
+ if (!userId || !backendReady || turnaroundLoading || loading) return;
59
+ const targetTurnId = lastTurnIdRef.current;
60
+ if (targetTurnId === null) return;
61
+ if (turnaroundConsumedTurnRef.current === targetTurnId) return;
62
+
63
+ turnaroundConsumedTurnRef.current = targetTurnId;
64
+ setTurnaroundLoading(true);
65
+ try {
66
+ const res = await sendTurnaround({
67
+ user_id: userId,
68
+ turn_id: targetTurnId,
69
+ head_signal: reason === "head" ? sensing.headSignal : null,
70
+ });
71
+
72
+ lastTurnIdRef.current = res.turn_id;
73
+ turnaroundConsumedTurnRef.current = res.turn_id;
74
+
75
+ setMessages((prev) => {
76
+ const next = [...prev];
77
+ for (let i = next.length - 1; i >= 0; i--) {
78
+ if (next[i].role === "aac_user" && !next[i].isTurnaround) {
79
+ next[i] = { ...next[i], rephrased: true };
80
+ break;
81
+ }
82
+ }
83
+ next.push({
84
+ role: "aac_user",
85
+ content: res.response,
86
+ latency: res.latency,
87
+ affect: res.affect,
88
+ evalScores: res.eval_scores,
89
+ turnId: res.turn_id,
90
+ isTurnaround: true,
91
+ });
92
+ return next;
93
+ });
94
+ onLatency(res.latency);
95
+ // Do NOT advance lastResponseTsRef: keep the original turn's window so
96
+ // the user can't head-shake the turnaround itself into another loop.
97
+ } catch (e) {
98
+ setMessages((prev) => [
99
+ ...prev,
100
+ {
101
+ role: "aac_user",
102
+ content: `Error rephrasing: ${
103
+ e instanceof Error ? e.message : "request failed"
104
+ }`,
105
+ isTurnaround: true,
106
+ },
107
+ ]);
108
+ } finally {
109
+ if (reason === "head") onHeadSignalConsumed();
110
+ setTurnaroundLoading(false);
111
+ }
112
+ },
113
+ [
114
+ userId,
115
+ backendReady,
116
+ turnaroundLoading,
117
+ loading,
118
+ sensing.headSignal,
119
+ setMessages,
120
+ onLatency,
121
+ onHeadSignalConsumed,
122
+ ]
123
+ );
124
+
125
+ useEffect(() => {
126
+ if (
127
+ sensing.headSignal !== "HEAD_NOD_DISSATISFIED" &&
128
+ sensing.headSignal !== "HEAD_SHAKE"
129
+ ) {
130
+ return;
131
+ }
132
+ const targetTurnId = lastTurnIdRef.current;
133
+ const eligible =
134
+ targetTurnId !== null &&
135
+ turnaroundConsumedTurnRef.current !== targetTurnId &&
136
+ lastResponseTsRef.current > 0 &&
137
+ performance.now() - lastResponseTsRef.current <= TURNAROUND_WINDOW_MS;
138
+
139
+ if (eligible) {
140
+ handleTurnaround("head");
141
+ return;
142
+ }
143
+ // Not eligible: keep the chip visible briefly so the user can see that
144
+ // detection fired, then clear it. (Instant clear made detection invisible.)
145
+ const id = window.setTimeout(() => onHeadSignalConsumed(), 1500);
146
+ return () => window.clearTimeout(id);
147
+ }, [sensing.headSignal, handleTurnaround, onHeadSignalConsumed]);
148
+
149
  async function handleSend() {
150
  if (!input.trim() || !userId || !backendReady || loading) return;
151
 
 
163
  gesture_tag: sensing.gestureTag,
164
  gaze_bucket: sensing.gazeBucket,
165
  air_written_text: airText,
166
+ head_signal: sensing.headSignal,
167
  });
168
 
169
+ lastTurnIdRef.current = res.turn_id;
170
  setMessages((prev) => [
171
  ...prev,
172
  {
 
175
  latency: res.latency,
176
  affect: res.affect,
177
  evalScores: res.eval_scores,
178
+ turnId: res.turn_id,
179
  },
180
  ]);
181
  onLatency(res.latency);
182
+ lastResponseTsRef.current = performance.now();
183
  } catch (e) {
184
  setMessages((prev) => [
185
  ...prev,
 
194
  }
195
  }
196
 
197
+ const canTurnaround =
198
+ !!userId &&
199
+ backendReady &&
200
+ !loading &&
201
+ !turnaroundLoading &&
202
+ lastTurnIdRef.current !== null;
203
+
204
  return (
205
  <div className="chat-panel">
206
  <div className="chat-header">
 
208
  </div>
209
  <div className="chat-messages">
210
  {messages.map((msg, i) => (
211
+ <div
212
+ key={i}
213
+ className={`chat-bubble ${msg.role}${
214
+ msg.rephrased ? " rephrased" : ""
215
+ }${msg.isTurnaround ? " turnaround" : ""}`}
216
+ >
217
  <span className="chat-role">
218
  {msg.role === "partner" ? "Partner" : "AAC User"}
219
+ {msg.rephrased && (
220
+ <span className="badge badge-rephrased"> rephrased</span>
221
+ )}
222
+ {msg.isTurnaround && (
223
+ <span className="badge badge-turnaround"> ↻ turnaround</span>
224
+ )}
225
  </span>
226
  <p>{msg.content}</p>
227
  {msg.role === "aac_user" && msg.evalScores && (
 
235
  <p>Generating...</p>
236
  </div>
237
  )}
238
+ {turnaroundLoading && (
239
+ <div className="chat-bubble aac_user loading">
240
+ <span className="chat-role">AAC User</span>
241
+ <p>↻ Rephrasing...</p>
242
+ </div>
243
+ )}
244
  <div ref={bottomRef} />
245
  </div>
246
  <div className="chat-input-row">
 
255
  <button onClick={handleSend} disabled={!userId || loading || !backendReady || !input.trim()}>
256
  Send
257
  </button>
258
+ <button
259
+ type="button"
260
+ className="turnaround-btn"
261
+ onClick={() => handleTurnaround("manual")}
262
+ disabled={!canTurnaround}
263
+ title="Re-plan the last response (also triggered by a head shake / sharp nod)"
264
+ >
265
+ ↻ Not quite right
266
+ </button>
267
  </div>
268
  </div>
269
  );
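The head-signal effect in `ChatPanel.tsx` above decides inline whether a detected shake or sharp nod may trigger a turnaround. As a reading aid, here is a minimal sketch of that gate as a pure function. `TurnaroundGate` and `isHeadTurnaroundEligible` are hypothetical names that do not appear in the commit; only the four conditions and the 5000 ms window are taken from the diff.

```typescript
// Hypothetical helper, not part of the commit: the same four checks the
// head-signal effect performs inline, extracted so they can be read in isolation.
const TURNAROUND_WINDOW_MS = 5000; // value copied from ChatPanel.tsx

interface TurnaroundGate {
  lastTurnId: number | null;     // turn_id of the latest reply, if any
  consumedTurnId: number | null; // turn_id that has already been turned around
  lastResponseTs: number;        // timestamp (ms) of the latest reply; 0 means none yet
}

function isHeadTurnaroundEligible(gate: TurnaroundGate, nowMs: number): boolean {
  return (
    gate.lastTurnId !== null &&
    gate.consumedTurnId !== gate.lastTurnId &&
    gate.lastResponseTs > 0 &&
    nowMs - gate.lastResponseTs <= TURNAROUND_WINDOW_MS
  );
}

// A reply with turn_id 3 arrived at t = 1000 ms and has not been turned around.
const freshGate: TurnaroundGate = { lastTurnId: 3, consumedTurnId: null, lastResponseTs: 1000 };
// The same turn after a turnaround has already been requested for it.
const consumedGate: TurnaroundGate = { ...freshGate, consumedTurnId: 3 };

console.log(isHeadTurnaroundEligible(freshGate, 4000));    // inside the 5 s window
console.log(isHeadTurnaroundEligible(freshGate, 7000));    // window has expired
console.log(isHeadTurnaroundEligible(consumedGate, 4000)); // turn already consumed
```

Note that in the diff the manual "Not quite right" button goes through `handleTurnaround` directly, so the time-window check applies only to head signals, while the consumed-turn guard applies to both paths.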
frontend/src/components/SensingStatus.tsx CHANGED
@@ -38,6 +38,26 @@ export function SensingStatus({ sensing, webcamActive }: Props) {
38
  {sensing.gazeBucket ?? "none"}
39
  </span>
40
  </div>
41
  {sensing.airWrittenText && (
42
  <div className="sensing-row">
43
  <span className="sensing-label">Air-written</span>
 
38
  {sensing.gazeBucket ?? "none"}
39
  </span>
40
  </div>
41
+ <div className="sensing-row">
42
+ <span className="sensing-label">Head</span>
43
+ <span className="sensing-value">
44
+ {sensing.headCalibrated
45
+ ? sensing.headSignal ?? "steady"
46
+ : "not calibrated"}
47
+ </span>
48
+ </div>
49
+ {sensing.headCalibrated && (
50
+ <div className="sensing-row sensing-debug">
51
+ <span className="sensing-label"> ↳ Δx/Δy</span>
52
+ <span className="sensing-value">
53
+ {sensing.headDebug.dx.toFixed(3)} / {sensing.headDebug.dy.toFixed(3)}
54
+ {" "}
55
+ (peak {sensing.headDebug.maxAbsDx.toFixed(3)}/
56
+ {sensing.headDebug.maxAbsDy.toFixed(3)},{" "}
57
+ x{sensing.headDebug.crossings})
58
+ </span>
59
+ </div>
60
+ )}
61
  {sensing.airWrittenText && (
62
  <div className="sensing-row">
63
  <span className="sensing-label">Air-written</span>
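The debug row added above prints a crossings count (`x{sensing.headDebug.crossings}`) next to the peak displacements. The sketch below reproduces how that count behaves on a plain array of horizontal displacements, which may help when tuning thresholds against the live readout. `countSideCrossings` is a hypothetical standalone function, and the sample traces are invented for illustration. Only the 0.005 deadband and the side-flip rule come from `HeadPoseTracker` in `sensing.ts`.

```typescript
// Hypothetical standalone version of the crossing count. DEADBAND mirrors
// SHAKE_DEADBAND in sensing.ts; the input format (an array of dx values in
// normalized image coordinates) is an assumption made for this sketch.
const DEADBAND = 0.005;

function countSideCrossings(dxSamples: number[], deadband: number = DEADBAND): number {
  let crossings = 0;
  let prevSide = 0; // 0 means no side has been committed to yet
  for (const dx of dxSamples) {
    if (Math.abs(dx) < deadband) continue; // too close to neutral to pick a side
    const side = dx > 0 ? 1 : -1;
    if (prevSide !== 0 && side !== prevSide) crossings += 1;
    prevSide = side;
  }
  return crossings;
}

// Invented traces: a deliberate left-right-left-right motion, and small fidgeting.
const shakeTrace = [0.02, 0.001, -0.02, -0.001, 0.02, 0.0, -0.02];
const jitterTrace = [0.002, -0.003, 0.001, -0.002, 0.004];

console.log(countSideCrossings(shakeTrace));     // 3 side flips survive the deadband
console.log(countSideCrossings(jitterTrace));    // 0, every sample is inside the deadband
console.log(countSideCrossings(jitterTrace, 0)); // 4, the same jitter with no deadband
```

With no deadband, the fidgeting trace alone would exceed the tracker's minimum of three crossings, although the separate amplitude threshold would still have to be met before a shake is reported.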
frontend/src/hooks/useSensing.ts CHANGED
@@ -11,6 +11,7 @@ import {
11
  classifyGesture,
12
  GazeTracker,
13
  AirWriter,
 
14
  } from "../lib/sensing";
15
 
16
  const EMA_ALPHA = 0.3;
@@ -20,6 +21,9 @@ export function useSensing() {
20
  const handLandmarkerRef = useRef<HandLandmarker | null>(null);
21
  const gazeTrackerRef = useRef(new GazeTracker());
22
  const airWriterRef = useRef(new AirWriter());
 
 
 
23
  const neutralLCPRef = useRef<number | null>(null);
24
  const smoothedRef = useRef({ MAR: 0, EAR: 0.3, BRI: -0.3, LCP: 0 });
25
  const initingRef = useRef(false);
@@ -30,6 +34,9 @@ export function useSensing() {
30
  gestureTag: null,
31
  gazeBucket: null,
32
  airWrittenText: "",
 
 
 
33
  });
34
 
35
  // Cleanup MediaPipe resources on unmount
@@ -95,6 +102,7 @@ export function useSensing() {
95
 
96
  let affect: SensingState["affect"] = null;
97
  let gazeBucket: SensingState["gazeBucket"] = null;
 
98
 
99
  const faceResult = faceLandmarker.detectForVideo(video, timestamp);
100
  if (faceResult.faceLandmarks && faceResult.faceLandmarks.length > 0) {
@@ -105,6 +113,11 @@ export function useSensing() {
105
  (landmarks[61].x + landmarks[291].x) / 2;
106
  }
107
 
 
 
 
 
 
108
  const raw = computeAffectVector(landmarks, neutralLCPRef.current);
109
 
110
  const prev = smoothedRef.current;
@@ -118,6 +131,8 @@ export function useSensing() {
118
 
119
  affect = classifyAffect(smoothed);
120
  gazeBucket = gazeTrackerRef.current.process(landmarks);
 
 
121
  }
122
 
123
  let gestureTag: SensingState["gestureTag"] = null;
@@ -144,6 +159,9 @@ export function useSensing() {
144
  airWrittenText: newAirText
145
  ? prev.airWrittenText + newAirText
146
  : prev.airWrittenText,
 
 
 
147
  }));
148
  },
149
  []
@@ -153,12 +171,40 @@ export function useSensing() {
153
  setSensing((prev) => ({ ...prev, airWrittenText: "" }));
154
  }, []);
155
 
 
 
 
 
 
 
 
 
 
156
  const resetCalibration = useCallback(() => {
157
  neutralLCPRef.current = null;
158
  smoothedRef.current = { MAR: 0, EAR: 0.3, BRI: -0.3, LCP: 0 };
159
  gazeTrackerRef.current.reset();
160
- setSensing({ affect: null, gestureTag: null, gazeBucket: null, airWrittenText: "" });
 
 
 
 
 
 
 
 
 
161
  }, []);
162
 
163
- return { sensing, ready, initError, init, processFrame, clearAirWrittenText, resetCalibration };
 
 
 
 
 
 
 
 
 
 
164
  }
 
11
  classifyGesture,
12
  GazeTracker,
13
  AirWriter,
14
+ HeadPoseTracker,
15
  } from "../lib/sensing";
16
 
17
  const EMA_ALPHA = 0.3;
 
21
  const handLandmarkerRef = useRef<HandLandmarker | null>(null);
22
  const gazeTrackerRef = useRef(new GazeTracker());
23
  const airWriterRef = useRef(new AirWriter());
24
+ const headTrackerRef = useRef(new HeadPoseTracker());
25
+ const calibratePendingRef = useRef(false);
26
+ const headDebugRef = useRef({ dx: 0, dy: 0, maxAbsDx: 0, maxAbsDy: 0, crossings: 0 });
27
  const neutralLCPRef = useRef<number | null>(null);
28
  const smoothedRef = useRef({ MAR: 0, EAR: 0.3, BRI: -0.3, LCP: 0 });
29
  const initingRef = useRef(false);
 
34
  gestureTag: null,
35
  gazeBucket: null,
36
  airWrittenText: "",
37
+ headSignal: null,
38
+ headCalibrated: false,
39
+ headDebug: { dx: 0, dy: 0, maxAbsDx: 0, maxAbsDy: 0, crossings: 0 },
40
  });
41
 
42
  // Cleanup MediaPipe resources on unmount
 
102
 
103
  let affect: SensingState["affect"] = null;
104
  let gazeBucket: SensingState["gazeBucket"] = null;
105
+ let headSignal: SensingState["headSignal"] = null;
106
 
107
  const faceResult = faceLandmarker.detectForVideo(video, timestamp);
108
  if (faceResult.faceLandmarks && faceResult.faceLandmarks.length > 0) {
 
113
  (landmarks[61].x + landmarks[291].x) / 2;
114
  }
115
 
116
+ if (calibratePendingRef.current) {
117
+ headTrackerRef.current.calibrate(landmarks);
118
+ calibratePendingRef.current = false;
119
+ }
120
+
121
  const raw = computeAffectVector(landmarks, neutralLCPRef.current);
122
 
123
  const prev = smoothedRef.current;
 
131
 
132
  affect = classifyAffect(smoothed);
133
  gazeBucket = gazeTrackerRef.current.process(landmarks);
134
+ headSignal = headTrackerRef.current.process(landmarks);
135
+ headDebugRef.current = headTrackerRef.current.debug;
136
  }
137
 
138
  let gestureTag: SensingState["gestureTag"] = null;
 
159
  airWrittenText: newAirText
160
  ? prev.airWrittenText + newAirText
161
  : prev.airWrittenText,
162
+ headSignal: headSignal ?? prev.headSignal,
163
+ headCalibrated: headTrackerRef.current.calibrated,
164
+ headDebug: headDebugRef.current,
165
  }));
166
  },
167
  []
 
171
  setSensing((prev) => ({ ...prev, airWrittenText: "" }));
172
  }, []);
173
 
174
+ const clearHeadSignal = useCallback(() => {
175
+ setSensing((prev) => ({ ...prev, headSignal: null }));
176
+ }, []);
177
+
178
+ const calibrateHeadPose = useCallback(() => {
179
+ calibratePendingRef.current = true;
180
+ setSensing((prev) => ({ ...prev, headSignal: null }));
181
+ }, []);
182
+
183
  const resetCalibration = useCallback(() => {
184
  neutralLCPRef.current = null;
185
  smoothedRef.current = { MAR: 0, EAR: 0.3, BRI: -0.3, LCP: 0 };
186
  gazeTrackerRef.current.reset();
187
+ headTrackerRef.current.reset();
188
+ setSensing({
189
+ affect: null,
190
+ gestureTag: null,
191
+ gazeBucket: null,
192
+ airWrittenText: "",
193
+ headSignal: null,
194
+ headCalibrated: false,
195
+ headDebug: { dx: 0, dy: 0, maxAbsDx: 0, maxAbsDy: 0, crossings: 0 },
196
+ });
197
  }, []);
198
 
199
+ return {
200
+ sensing,
201
+ ready,
202
+ initError,
203
+ init,
204
+ processFrame,
205
+ clearAirWrittenText,
206
+ clearHeadSignal,
207
+ calibrateHeadPose,
208
+ resetCalibration,
209
+ };
210
  }
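One detail of the `useSensing.ts` changes above is easy to miss: `headSignal: headSignal ?? prev.headSignal` latches a detection, so the signal stays set on later frames until `clearHeadSignal` (or a calibration or reset) clears it. The following is a minimal sketch of that latch outside React. `mergeHeadSignal` and the frame sequence are hypothetical; the merge rule itself is the one in the diff.

```typescript
// Hypothetical illustration of the latch; only the `??` merge mirrors the commit.
type HeadSignal = "HEAD_SHAKE" | "HEAD_NOD_DISSATISFIED";

function mergeHeadSignal(
  prev: HeadSignal | null,
  detected: HeadSignal | null
): HeadSignal | null {
  // A new detection wins; otherwise the previous value is kept as-is.
  return detected ?? prev;
}

// Invented per-frame tracker outputs: a shake fires once, then nothing.
const frameOutputs: (HeadSignal | null)[] = [null, "HEAD_SHAKE", null, null];

let latched: HeadSignal | null = null;
for (const detected of frameOutputs) {
  latched = mergeHeadSignal(latched, detected);
}
console.log(latched); // still "HEAD_SHAKE" two frames after the detection

// Consuming the signal is an explicit step, as clearHeadSignal does in the hook.
const afterClear: HeadSignal | null = null;
console.log(mergeHeadSignal(afterClear, null)); // stays null until the next detection
```

This is why `ChatPanel` calls `onHeadSignalConsumed` after a head-triggered turnaround and on a short timer when the signal is not eligible: without an explicit clear, the chip would never return to "steady".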
frontend/src/lib/api.ts CHANGED
@@ -1,4 +1,9 @@
1
- import type { ChatRequest, ChatResponse, Persona } from "../types";
 
 
 
 
 
2
 
3
  const API_BASE = "";
4
 
@@ -19,6 +24,18 @@ export async function sendChat(req: ChatRequest): Promise<ChatResponse> {
19
  return res.json();
20
  }
21
 
 
 
 
 
 
 
 
 
 
 
 
 
22
  export async function resetSession(userId: string): Promise<void> {
23
  const res = await fetch(
24
  `${API_BASE}/session/reset?user_id=${encodeURIComponent(userId)}`,
 
1
+ import type {
2
+ ChatRequest,
3
+ ChatResponse,
4
+ Persona,
5
+ TurnaroundRequest,
6
+ } from "../types";
7
 
8
  const API_BASE = "";
9
 
 
24
  return res.json();
25
  }
26
 
27
+ export async function sendTurnaround(
28
+ req: TurnaroundRequest
29
+ ): Promise<ChatResponse> {
30
+ const res = await fetch(`${API_BASE}/chat/turnaround`, {
31
+ method: "POST",
32
+ headers: { "Content-Type": "application/json" },
33
+ body: JSON.stringify(req),
34
+ });
35
+ if (!res.ok) throw new Error(`API error: ${res.status}`);
36
+ return res.json();
37
+ }
38
+
39
  export async function resetSession(userId: string): Promise<void> {
40
  const res = await fetch(
41
  `${API_BASE}/session/reset?user_id=${encodeURIComponent(userId)}`,
frontend/src/lib/sensing.ts CHANGED
@@ -189,6 +189,227 @@ export class GazeTracker {
189
  }
190
  }
191

192
  // ── Air-writing DTW (ported from backend/sensing/air_writing.py) ─────────────
193
 
194
  const INDEX_TIP = 8;
 
189
  }
190
  }
191
 
192
+ // ── Head-pose tracker (shake / sharp-nod-dissatisfied) ──────────────────────
193
+
194
+ export type HeadSignal = "HEAD_SHAKE" | "HEAD_NOD_DISSATISFIED";
195
+
196
+ const NOSE_TIP = 1;
197
+
198
+ interface NosePoint {
199
+ x: number;
200
+ y: number;
201
+ t: number;
202
+ }
203
+
204
+ export interface HeadDebug {
205
+ dx: number; // current x displacement from neutral
206
+ dy: number; // current y displacement from neutral
207
+ maxAbsDx: number; // peak |dx| within the window
208
+ maxAbsDy: number; // peak |dy| within the window
209
+ crossings: number; // side crossings within the window (deadband-filtered)
210
+ }
211
+
212
+ export class HeadPoseTracker {
213
+ private neutralX: number | null = null;
214
+ private neutralY: number | null = null;
215
+ private history: NosePoint[] = [];
216
+ private lastEmitTs = 0;
217
+ private lastDebug: HeadDebug = {
218
+ dx: 0,
219
+ dy: 0,
220
+ maxAbsDx: 0,
221
+ maxAbsDy: 0,
222
+ crossings: 0,
223
+ };
224
+
225
+ private static WINDOW_MS = 1000;
226
+ private static REFRACTORY_MS = 2000;
227
+ private static SHAKE_AMPLITUDE = 0.015;
228
+ private static SHAKE_MIN_CROSSINGS = 3;
229
+ // Per-frame jitter below this magnitude is ignored when counting side
230
+ // crossings, so micro-fidgets near neutral can't rack up false crossings.
231
+ private static SHAKE_DEADBAND = 0.005;
232
+ private static NOD_DROP = 0.06;
233
+ private static NOD_WINDOW_MS = 600;
234
+ // Reject "nod" when horizontal motion exceeds this: it's a shake/sway.
235
+ private static NOD_MAX_HORIZONTAL = 0.015;
236
+ // Recovery: head must come back to within this of neutral.
237
+ private static NOD_RECOVERY = 0.015;
238
+ // The drop must start from near-neutral (not from a tilted resting pose).
239
+ private static NOD_START_THRESHOLD = 0.015;
240
+ // Minimum frames between drop start and peak; guards against single-frame
241
+ // landmark glitches that look like an instantaneous jerk.
242
+ private static NOD_MIN_DROP_FRAMES = 3;
243
+ // Minimum frames between peak and recovery; same reason, going up.
244
+ private static NOD_MIN_RECOVERY_FRAMES = 2;
245
+
246
+ calibrate(landmarks: { x: number; y: number }[]): void {
247
+ if (!landmarks[NOSE_TIP]) return;
248
+ this.neutralX = landmarks[NOSE_TIP].x;
249
+ this.neutralY = landmarks[NOSE_TIP].y;
250
+ this.history = [];
251
+ this.lastEmitTs = 0;
252
+ }
253
+
254
+ process(landmarks: { x: number; y: number }[]): HeadSignal | null {
255
+ if (!landmarks[NOSE_TIP]) return null;
256
+ if (this.neutralX === null || this.neutralY === null) return null;
257
+
258
+ const now = performance.now();
259
+ const nose = landmarks[NOSE_TIP];
260
+ this.history.push({ x: nose.x, y: nose.y, t: now });
261
+ const cutoff = now - HeadPoseTracker.WINDOW_MS;
262
+ this.history = this.history.filter((p) => p.t >= cutoff);
263
+
264
+ this.updateDebug(nose);
265
+
266
+ if (now - this.lastEmitTs < HeadPoseTracker.REFRACTORY_MS) return null;
267
+ if (this.history.length < 6) return null;
268
+
269
+ const shake = this.detectShake();
270
+ if (shake) {
271
+ this.lastEmitTs = now;
272
+ return shake;
273
+ }
274
+
275
+ const nod = this.detectNod(now);
276
+ if (nod) {
277
+ this.lastEmitTs = now;
278
+ return nod;
279
+ }
280
+
281
+ return null;
282
+ }
283
+
+  private updateDebug(nose: { x: number; y: number }): void {
+    if (this.neutralX === null || this.neutralY === null) return;
+    let maxAbsDx = 0;
+    let maxAbsDy = 0;
+    let crossings = 0;
+    let prevSide = 0;
+    for (const p of this.history) {
+      const dx = p.x - this.neutralX;
+      const dy = p.y - this.neutralY;
+      const absDx = Math.abs(dx);
+      maxAbsDx = Math.max(maxAbsDx, absDx);
+      maxAbsDy = Math.max(maxAbsDy, Math.abs(dy));
+      if (absDx < HeadPoseTracker.SHAKE_DEADBAND) continue;
+      const side = dx > 0 ? 1 : -1;
+      if (prevSide !== 0 && side !== prevSide) crossings += 1;
+      prevSide = side;
+    }
+    this.lastDebug = {
+      dx: nose.x - this.neutralX,
+      dy: nose.y - this.neutralY,
+      maxAbsDx,
+      maxAbsDy,
+      crossings,
+    };
+  }
+
+  get debug(): HeadDebug {
+    return this.lastDebug;
+  }
+
+  private detectShake(): HeadSignal | null {
+    if (this.neutralX === null) return null;
+    let crossings = 0;
+    let prevSide = 0;
+    let maxAbs = 0;
+    for (const p of this.history) {
+      const dx = p.x - this.neutralX;
+      const absDx = Math.abs(dx);
+      maxAbs = Math.max(maxAbs, absDx);
+      // Only commit to a side once the displacement clears the deadband;
+      // otherwise sub-millimeter jitter near neutral fakes crossings.
+      if (absDx < HeadPoseTracker.SHAKE_DEADBAND) continue;
+      const side = dx > 0 ? 1 : -1;
+      if (prevSide !== 0 && side !== prevSide) crossings += 1;
+      prevSide = side;
+    }
+    if (
+      crossings >= HeadPoseTracker.SHAKE_MIN_CROSSINGS &&
+      maxAbs >= HeadPoseTracker.SHAKE_AMPLITUDE
+    ) {
+      return "HEAD_SHAKE";
+    }
+    return null;
+  }
+
+  private detectNod(now: number): HeadSignal | null {
+    if (this.neutralX === null || this.neutralY === null) return null;
+    const windowStart = now - HeadPoseTracker.NOD_WINDOW_MS;
+    const recent = this.history.filter((p) => p.t >= windowStart);
+    if (recent.length < 6) return null;
+
+    // Reject if there's significant horizontal motion: that's a shake/sway.
+    let maxAbsDx = 0;
+    for (const p of recent) {
+      maxAbsDx = Math.max(maxAbsDx, Math.abs(p.x - this.neutralX));
+    }
+    if (maxAbsDx > HeadPoseTracker.NOD_MAX_HORIZONTAL) return null;
+
+    // Find the peak (lowest head position) within the window.
+    let maxDrop = 0;
+    let peakIdx = -1;
+    for (let i = 0; i < recent.length; i++) {
+      const drop = recent[i].y - this.neutralY;
+      if (drop > maxDrop) {
+        maxDrop = drop;
+        peakIdx = i;
+      }
+    }
+    if (maxDrop < HeadPoseTracker.NOD_DROP || peakIdx < 0) return null;
+
+    // Find a near-neutral start before the peak: a nod is a deliberate
+    // motion *from* neutral, not a recovery from an already-tilted pose.
+    let startIdx = -1;
+    for (let i = peakIdx - 1; i >= 0; i--) {
+      if (
+        recent[i].y - this.neutralY <=
+        HeadPoseTracker.NOD_START_THRESHOLD
+      ) {
+        startIdx = i;
+        break;
+      }
+    }
+    if (
+      startIdx < 0 ||
+      peakIdx - startIdx < HeadPoseTracker.NOD_MIN_DROP_FRAMES
+    ) {
+      return null;
+    }
+
+    // Recovery: head must return near neutral after the peak.
+    let recoveryIdx = -1;
+    for (let i = peakIdx + 1; i < recent.length; i++) {
+      if (recent[i].y - this.neutralY < HeadPoseTracker.NOD_RECOVERY) {
+        recoveryIdx = i;
+        break;
+      }
+    }
+    if (
+      recoveryIdx < 0 ||
+      recoveryIdx - peakIdx < HeadPoseTracker.NOD_MIN_RECOVERY_FRAMES
+    ) {
+      return null;
+    }
+
+    return "HEAD_NOD_DISSATISFIED";
+  }
+
+  reset(): void {
+    this.neutralX = null;
+    this.neutralY = null;
+    this.history = [];
+    this.lastEmitTs = 0;
+  }
+
+  get calibrated(): boolean {
+    return this.neutralX !== null && this.neutralY !== null;
+  }
+}
+
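A minimal standalone sketch of the deadband crossing count that `detectShake` performs, which may help when tuning thresholds outside the tracker. The helper name `countCrossings`, the `ShakeStats` type, and the sample values are hypothetical and not part of this commit; the real thresholds are the `HeadPoseTracker` static fields.

```typescript
// Hypothetical helper mirroring the loop in detectShake; not part of the commit.
interface ShakeStats {
  crossings: number; // side changes that cleared the deadband
  maxAbs: number; // largest horizontal displacement from neutral
}

function countCrossings(
  xs: number[],
  neutralX: number,
  deadband: number,
): ShakeStats {
  let crossings = 0;
  let prevSide = 0;
  let maxAbs = 0;
  for (const x of xs) {
    const dx = x - neutralX;
    const absDx = Math.abs(dx);
    maxAbs = Math.max(maxAbs, absDx);
    // Samples inside the deadband never commit to a side,
    // so small jitter around neutral cannot count as a crossing.
    if (absDx < deadband) continue;
    const side = dx > 0 ? 1 : -1;
    if (prevSide !== 0 && side !== prevSide) crossings += 1;
    prevSide = side;
  }
  return { crossings, maxAbs };
}

// Assumed sample values (normalized x coordinates), for illustration only.
const jitter = countCrossings([0.501, 0.499, 0.502, 0.498], 0.5, 0.01);
const shake = countCrossings([0.5, 0.54, 0.5, 0.46, 0.5, 0.55], 0.5, 0.01);
console.log(jitter.crossings, shake.crossings);
```

With these assumed inputs, the jitter sequence stays inside the deadband and records no crossings, while the left-right-left sequence records two.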
 // ── Air-writing DTW (ported from backend/sensing/air_writing.py) ─────────────
 
 const INDEX_TIP = 8;
frontend/src/types.ts CHANGED
@@ -1,12 +1,24 @@
 export type Affect = "HAPPY" | "FRUSTRATED" | "NEUTRAL" | "SURPRISED";
 export type GestureName = "THUMBS_UP" | "THUMBS_DOWN" | "POINTING" | "WAVING";
 export type MemoryBucket = "family" | "medical" | "hobbies" | "daily_routine" | "social";
+export type HeadSignal = "HEAD_SHAKE" | "HEAD_NOD_DISSATISFIED";
+
+export interface HeadDebug {
+  dx: number;
+  dy: number;
+  maxAbsDx: number;
+  maxAbsDy: number;
+  crossings: number;
+}
 
 export interface SensingState {
   affect: Affect | null;
   gestureTag: GestureName | null;
   gazeBucket: MemoryBucket | null;
   airWrittenText: string;
+  headSignal: HeadSignal | null;
+  headCalibrated: boolean;
+  headDebug: HeadDebug;
 }
 
 export interface Persona {
@@ -23,6 +35,13 @@ export interface ChatRequest {
   gesture_tag: GestureName | null;
   gaze_bucket: MemoryBucket | null;
   air_written_text: string | null;
+  head_signal?: HeadSignal | null;
+}
+
+export interface TurnaroundRequest {
+  user_id: string;
+  turn_id?: number;
+  head_signal?: HeadSignal | null;
 }
 
 export interface LatencyLog {
@@ -57,6 +76,7 @@ export interface ChatResponse {
   latency: LatencyLog;
   guardrail_passed: boolean;
   eval_scores: EvalScores | null;
+  turn_id: number;
 }
 
 export interface ChatMessage {
@@ -65,4 +85,7 @@ export interface ChatMessage {
   latency?: LatencyLog;
   affect?: string;
   evalScores?: EvalScores | null;
+  turnId?: number;
+  rephrased?: boolean;
+  isTurnaround?: boolean;
 }