multimodal sensing → real stylistic constraints
Affect, gestures, and air-writing now actually steer the LLM's
word choice instead of just being metadata. Each emotion maps to
a StyleDirective (register, prefer/avoid words, opener hint,
exemplar) that's rendered as explicit instructions in the
per-turn user message; gestures override the opener when present;
air-writing recognises 8 single-stroke shapes (yes, ?, hi, help,
done, more, water, stop); a recognised word both biases retrieval
via bucket keywords and is incorporated verbatim by the planner.
Fixes along the way:
- LCP was measuring mouth x-drift, not vertical lip-corner pull.
Rewrote it as (mouth_centre.y - corner_avg.y) / inter_ocular
and retuned thresholds; FRUSTRATED now has a second trigger
path (brows lowered + squinting).
- Calibration now averages the first 30 frames instead of a
single-frame snapshot. Affect stays null during calibration
but gaze/gesture/air-writing still flow.
- Deepcopy the shared _AFFECT_CONFIG in both intent paths so
downstream mutations can't corrupt the module constant.
- compute_multimodal_alignment now returns non-zero scores
(affect via sentiment lexicon, gesture via opener regex,
gaze via retrieved-chunk bucket match).
- LLM temperature 0.4 → 0.8 so the sensing→output link is
actually visible in the response.
- README.md +13 -8
- backend/evals/multimodal_alignment.py +94 -7
- backend/main.py +5 -6
- backend/pipeline/nodes/intent.py +44 -1
- backend/pipeline/nodes/planner.py +66 -74
- backend/pipeline/state.py +12 -1
- backend/sensing/bucket_keywords.py +20 -3
- backend/sensing/labels.py +17 -5
- frontend/src/hooks/useSensing.ts +28 -13
- frontend/src/lib/airTemplates.ts +108 -0
- frontend/src/lib/sensing.ts +32 -7
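The LCP and calibration fixes in the commit message can be sketched roughly as follows. The real code is TypeScript in frontend/src/lib/sensing.ts and is not visible in this diff, so everything here is an assumption made for illustration: the landmark arguments, the `BaselineCalibrator` class, and in particular the idea that the averaged baseline is subtracted from later readings. Only the formula `(mouth_centre.y - corner_avg.y) / inter_ocular` and the 30-frame window come from the commit message.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y) in image coordinates; y grows downward

CALIBRATION_FRAMES = 30  # window size stated in the commit message


def lip_corner_pull(mouth_centre: Point, left_corner: Point, right_corner: Point,
                    left_eye: Point, right_eye: Point) -> float:
    """Vertical pull of the lip corners relative to the mouth centre.

    Positive when the corners sit above the mouth centre (smile-like).
    Dividing by inter-ocular distance keeps the value stable as the face
    moves closer to or further from the camera.
    """
    corner_avg_y = (left_corner[1] + right_corner[1]) / 2.0
    inter_ocular = math.hypot(left_eye[0] - right_eye[0], left_eye[1] - right_eye[1])
    if inter_ocular == 0.0:
        return 0.0  # degenerate landmarks; treat as no signal
    return (mouth_centre[1] - corner_avg_y) / inter_ocular


class BaselineCalibrator:
    """Hypothetical helper: average the first N readings, then report offsets."""

    def __init__(self, frames: int = CALIBRATION_FRAMES) -> None:
        self.frames = frames
        self.samples: List[float] = []

    def update(self, value: float) -> Optional[float]:
        # While calibrating, collect the sample and report "no affect yet".
        if len(self.samples) < self.frames:
            self.samples.append(value)
            return None
        # Assumption: downstream thresholds apply to the baseline-relative value.
        baseline = sum(self.samples) / len(self.samples)
        return value - baseline
```

Because only y-coordinates enter the numerator, shifting every landmark sideways (a head translation) leaves the value unchanged, which is the property the old x-based measure lacked.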
|
@@ -288,7 +288,7 @@ multimodal_aac_chatbot/
|
|
| 288 |
│ │ ├── graph.py run_pipeline() — plain function chain
|
| 289 |
│ │ ├── state.py PipelineState TypedDict
|
| 290 |
│ │ └── nodes/ intent, retrieval, planner, feedback
|
| 291 |
-
│ ├── sensing/labels.py
|
| 292 |
│ ├── retrieval/ BGE embeddings (torch tensor) + bucket priors
|
| 293 |
│ ├── generation/llm_client.py 2-tier Ollama Cloud LLM client (primary/fallback)
|
| 294 |
│ └── guardrails/checks.py Input + output safety checks
|
|
@@ -340,7 +340,7 @@ Adding a new persona: drop a JSON file into `data/memories/` following the schem
|
|
| 340 |
|
| 341 |
From the spec (pages 10–11). Tags: **[Core]** = must do, **[Bonus]** = nice to have, **[Eval]** = for the grade.
|
| 342 |
|
| 343 |
-
Heads up: all camera/sensing stuff is in the frontend (MediaPipe JS). Backend just gets the labels (`affect`, `gesture_tag`, `gaze_bucket`).
|
| 344 |
|
| 345 |
### Dataset
|
| 346 |
|
|
@@ -359,10 +359,13 @@ Heads up: all camera/sensing stuff is in the frontend (MediaPipe JS). Backend ju
|
|
| 359 |
- [x] intent-aware turnaround: PERSONAL re-retrieves excluding the rejected bucket *and* exact rejected chunk texts (with `turnaround_min_score` floor — falls back to original chunks rather than degrading); PRESENT_STATE flips emotional read or admits uncertainty
|
| 360 |
- [x] UI: rejected bubble gets strikethrough + "rephrased" badge, new bubble appended with "↻ turnaround" badge — both visible (you can't unsay something to a partner). Manual "↻ Not quite right" button as fallback
|
| 361 |
- [x] guards: `turnaroundConsumedTurnRef` prevents self-retrigger loops; backend `turn_id` returned in `ChatResponse` so frontend doesn't desync on persona switch; stale-turn 409
|
| 362 |
-
- [
|
| 363 |
-
-
|
| 364 |
- [ ] **[Bonus]** Voice + air-writing conflict resolution. Capture short voice (Web Speech API), compare to air-written intent, send a `resolved_intent`
|
| 365 |
-
- [ ]
|
| 366 |
|
| 367 |
### Intent decomposition
|
| 368 |
|
|
@@ -386,6 +389,7 @@ Heads up: all camera/sensing stuff is in the frontend (MediaPipe JS). Backend ju
|
|
| 386 |
- [ ] **[Core]** API returns one response. Should return multiple candidates so the user can pick (and so the next item works)
|
| 387 |
- [ ] **[Core]** Frontend needs a candidate picker — show all the options, let the user click one, send the selection back
|
| 388 |
- [ ] **[Bonus]** When user picks a candidate, save the `(query, picked)` pair to a side vector index and check it first next turn
|
|
|
|
| 389 |
|
| 390 |
### Evals
|
| 391 |
|
|
@@ -395,20 +399,21 @@ Live per-turn scores show up in the `EvalPanel`. State:
|
|
| 395 |
|--------|--------|
|
| 396 |
| Efficiency | works (SLO check on `t_total`) |
|
| 397 |
| Faithfulness | stub, returns 0 |
|
| 398 |
-
| Multimodal alignment |
|
| 399 |
| Authenticity | star rating in UI but not saved |
|
| 400 |
|
| 401 |
- [ ] **[Eval]** Faithfulness — actually check if the response is grounded in what we retrieved. NLI model, sentence-level. If we didn't retrieve anything, flag `no_evidence` instead of pretending we scored it
|
| 402 |
- [ ] **[Eval]** Efficiency — per-turn SLO check is done, but for the writeup we need aggregate latency: p50/p95 across a fixed query set, broken out by LLM tier. Spec target is < 6s
|
| 403 |
-
- [
|
| 404 |
- [ ] **[Eval]** Authenticity — the Likert stars are wired up in the UI but go nowhere. Save them, log them with the turn so we can actually look at them later
|
| 405 |
- [ ] **[Eval]** For the live in-class eval: figure out the actual session — who rates (partners + experts per spec), how many turns each, what gets shown to them. The Likert form is the easy part; the protocol isn't written down anywhere
|
| 406 |
- [ ] **[Eval]** Need an offline version of all three model-driven evals (faithfulness / alignment / efficiency). Aggregate numbers across a fixed query set per persona for the writeup
|
| 407 |
|
| 408 |
### Cleanup
|
| 409 |
|
| 410 |
-
- [ ] move the affect→
|
| 411 |
- [x] delete `backend/sensing/` (dead code, sensing is in frontend) — done, only `labels.py` remains
|
|
|
|
| 412 |
|
| 413 |
---
|
| 414 |
|
|
|
|
| 288 |
│ │ ├── graph.py run_pipeline() — plain function chain
|
| 289 |
│ │ ├── state.py PipelineState TypedDict
|
| 290 |
│ │ └── nodes/ intent, retrieval, planner, feedback
|
| 291 |
+
│ ├── sensing/labels.py GESTURE_DIRECTIVES (sensing runs in browser)
|
| 292 |
│ ├── retrieval/ BGE embeddings (torch tensor) + bucket priors
|
| 293 |
│ ├── generation/llm_client.py 2-tier Ollama Cloud LLM client (primary/fallback)
|
| 294 |
│ └── guardrails/checks.py Input + output safety checks
|
|
|
|
| 340 |
|
| 341 |
From the spec (pages 10–11). Tags: **[Core]** = must do, **[Bonus]** = nice to have, **[Eval]** = for the grade.
|
| 342 |
|
| 343 |
+
Heads up: all camera/sensing stuff is in the frontend (MediaPipe JS). Backend just gets the labels (`affect`, `gesture_tag`, `gaze_bucket`). Only `backend/sensing/labels.py` (`GESTURE_DIRECTIVES`) lives on the backend.
|
| 344 |
|
| 345 |
### Dataset
|
| 346 |
|
|
|
|
| 359 |
- [x] intent-aware turnaround: PERSONAL re-retrieves excluding the rejected bucket *and* exact rejected chunk texts (with `turnaround_min_score` floor — falls back to original chunks rather than degrading); PRESENT_STATE flips emotional read or admits uncertainty
|
| 360 |
- [x] UI: rejected bubble gets strikethrough + "rephrased" badge, new bubble appended with "↻ turnaround" badge — both visible (you can't unsay something to a partner). Manual "↻ Not quite right" button as fallback
|
| 361 |
- [x] guards: `turnaroundConsumedTurnRef` prevents self-retrigger loops; backend `turn_id` returned in `ChatResponse` so frontend doesn't desync on persona switch; stale-turn 409
|
| 362 |
+
- [x] **[Core]** Smile / positive affect actually changes wording now. Affect compiles into a `StyleDirective` (register + prefer/avoid words + exemplar + opener hint) rendered as explicit instructions in the turn-specific user message — see `_AFFECT_CONFIG` in [backend/pipeline/nodes/intent.py](backend/pipeline/nodes/intent.py) and `_build_user` in [backend/pipeline/nodes/planner.py](backend/pipeline/nodes/planner.py). The persona's own `stylistic_preferences` (from the memory JSONs) carry the stable baseline in the cached system message; the affect directive is how that baseline shifts per turn. Measured by `compute_multimodal_alignment` (positive/negative lexicon).
|
| 363 |
+
- Fixed a long-standing bug where LCP (lip-corner pull) was accidentally the *x-coordinate* of the mouth centre, so it drifted on head turns and almost never fired FRUSTRATED. Now measured as vertical pull of the corners relative to mouth centre, normalised by inter-ocular distance. HAPPY/FRUSTRATED thresholds retuned to the new scale; FRUSTRATED also triggers on brows-lowered + squinting as a second path. See `computeAffectVector` and `classifyAffect` in [frontend/src/lib/sensing.ts](frontend/src/lib/sensing.ts).
|
| 364 |
+
- Calibration is now averaged over the first 30 frames (~1s of neutral face) instead of a single-frame snapshot — a brief smile at startup used to lock in a biased baseline. Affect stays null during calibration; gaze/head/gesture/air-writing still flow.
|
| 365 |
+
- [x] **[Core]** Gestures (`THUMBS_UP` / `THUMBS_DOWN` / `POINTING` / `WAVING`) now carry an `opener_hint` via `GESTURE_DIRECTIVES` in [backend/sensing/labels.py](backend/sensing/labels.py). A detected thumbs-up overrides the affect opener and tells the LLM to lead with an affirmation.
|
| 366 |
+
- [x] **[Core]** Air-writing carries a default template bank ([frontend/src/lib/airTemplates.ts](frontend/src/lib/airTemplates.ts): `yes` / `?` / `hi` / `help` / `done` / `more` / `water` / `stop`) — all single-stroke shapes so DTW can match reliably. On match, the word flows through the pipeline three ways: (1) retrieval picks up the word as an extra `PERSONAL` sub-intent with a bucket hint (see `infer_bucket` in [backend/sensing/bucket_keywords.py](backend/sensing/bucket_keywords.py) — e.g. `help` → medical, `water` → daily_routine), (2) the planner includes an explicit "the user air-wrote X — incorporate verbatim if appropriate" instruction in the user message, and (3) the word appears in `logs/turns.jsonl` for debugging. The recognizer has a `MATCH_THRESHOLD` reject gate and `console.debug`s on empty-bank / no-match so unrecognised strokes never reach the backend. To add more templates, append entries to `DEFAULT_AIR_TEMPLATES` as 32-point normalised single-stroke trajectories.
|
| 367 |
- [ ] **[Bonus]** Voice + air-writing conflict resolution. Capture short voice (Web Speech API), compare to air-written intent, send a `resolved_intent`
|
| 368 |
+
- [ ] Thumbs-up currently biases the opener via the prompt. Once generation emits N candidates, move this to candidate reranking for a stronger signal.
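The air-writing item above mentions DTW matching against 32-point templates with a `MATCH_THRESHOLD` reject gate. A minimal sketch of that idea is below. It is written in Python for illustration only (the actual recognizer is TypeScript and is not shown in this diff), the threshold value and the length normalisation are assumptions, and the `hline` / `diag` templates in the usage note are made-up shapes rather than entries from `DEFAULT_AIR_TEMPLATES`.

```python
import math
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]

MATCH_THRESHOLD = 0.25  # hypothetical value; the real threshold is not in the diff


def dtw_distance(a: List[Point], b: List[Point]) -> float:
    """Classic dynamic time warping with Euclidean point cost.

    The accumulated cost is divided by len(a) + len(b) so that the
    threshold does not depend on trajectory length (an assumption here).
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] / (n + m)


def match_stroke(stroke: List[Point], templates: Dict[str, List[Point]]) -> Optional[str]:
    """Return the best-matching word, or None for an empty bank or a poor match."""
    if not stroke or not templates:
        return None
    best_word: Optional[str] = None
    best_dist = math.inf
    for word, template in templates.items():
        dist = dtw_distance(stroke, template)
        if dist < best_dist:
            best_word, best_dist = word, dist
    # Reject gate: unrecognised strokes never produce a word.
    return best_word if best_dist <= MATCH_THRESHOLD else None
```

With a bank such as `{"hline": [(i / 31, 0.0) for i in range(32)], "diag": [(i / 31, i / 31) for i in range(32)]}`, a slightly wobbly horizontal stroke should resolve to `"hline"`, while a stroke far from both templates should return `None` and so never reach the backend.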
|
| 369 |
|
| 370 |
### Intent decomposition
|
| 371 |
|
|
|
|
| 389 |
- [ ] **[Core]** API returns one response. Should return multiple candidates so the user can pick (and so the next item works)
|
| 390 |
- [ ] **[Core]** Frontend needs a candidate picker — show all the options, let the user click one, send the selection back
|
| 391 |
- [ ] **[Bonus]** When user picks a candidate, save the `(query, picked)` pair to a side vector index and check it first next turn
|
| 392 |
+
- [x] LLM temperature bumped from 0.4 → 0.8 in [backend/pipeline/nodes/planner.py](backend/pipeline/nodes/planner.py). The old setting produced near-identical responses across turns even when affect/gesture changed, which made the sensing→output link hard to see. 0.8 gives meaningful lexical variation while staying in the persona's voice.
|
| 393 |
|
| 394 |
### Evals
|
| 395 |
|
|
|
|
| 399 |
|--------|--------|
|
| 400 |
| Efficiency | works (SLO check on `t_total`) |
|
| 401 |
| Faithfulness | stub, returns 0 |
|
| 402 |
+
| Multimodal alignment | works — affect (sentiment lexicon), gesture (opener regex), gaze (bucket match) |
|
| 403 |
| Authenticity | star rating in UI but not saved |
|
| 404 |
|
| 405 |
- [ ] **[Eval]** Faithfulness — actually check if the response is grounded in what we retrieved. NLI model, sentence-level. If we didn't retrieve anything, flag `no_evidence` instead of pretending we scored it
|
| 406 |
- [ ] **[Eval]** Efficiency — per-turn SLO check is done, but for the writeup we need aggregate latency: p50/p95 across a fixed query set, broken out by LLM tier. Spec target is < 6s
|
| 407 |
+
- [x] **[Eval]** Multimodal alignment — implemented in `backend/evals/multimodal_alignment.py`. Affect scored by positive/negative lexicon overlap vs. target sentiment, gesture by opener-phrase regex (THUMBS_UP/THUMBS_DOWN/WAVING), gaze by fraction of retrieved chunks matching the looked-at bucket. Returned on every turn as `multimodal_alignment` / `affect_alignment` / `gesture_alignment` / `gaze_alignment`
|
| 408 |
- [ ] **[Eval]** Authenticity — the Likert stars are wired up in the UI but go nowhere. Save them, log them with the turn so we can actually look at them later
|
| 409 |
- [ ] **[Eval]** For the live in-class eval: figure out the actual session — who rates (partners + experts per spec), how many turns each, what gets shown to them. The Likert form is the easy part; the protocol isn't written down anywhere
|
| 410 |
- [ ] **[Eval]** Need an offline version of all three model-driven evals (faithfulness / alignment / efficiency). Aggregate numbers across a fixed query set per persona for the writeup
|
| 411 |
|
| 412 |
### Cleanup
|
| 413 |
|
| 414 |
+
- [ ] move the affect → `StyleDirective` config (`_AFFECT_CONFIG` in [intent.py](backend/pipeline/nodes/intent.py)) and the gesture directives ([labels.py](backend/sensing/labels.py)) out of code into a YAML file
|
| 415 |
- [x] delete `backend/sensing/` (dead code, sensing is in frontend) — done, only `labels.py` remains
|
| 416 |
+
- [x] per-persona affect overrides (`_PERSONA_TONE_OVERRIDES`) deleted — redundant with `stylistic_preferences` in the new persona JSONs
|
| 417 |
|
| 418 |
---
|
| 419 |
|
|
@@ -1,5 +1,85 @@
|
|
| 1 |
-
|
| 2 |
-
|
| 3 |
|
| 4 |
|
| 5 |
def compute_multimodal_alignment(
|
|
@@ -9,10 +89,17 @@ def compute_multimodal_alignment(
|
|
| 9 |
gaze_bucket: str | None,
|
| 10 |
chunks: list[dict],
|
| 11 |
) -> dict:
|
| 12 |
-
|
| 13 |
return {
|
| 14 |
-
"overall_score":
|
| 15 |
-
"affect_alignment": 0.0,
|
| 16 |
-
"gesture_alignment": 0.0,
|
| 17 |
-
"gaze_alignment": 0.0,
|
| 18 |
}
|
|
|
|
| 1 |
+
import re
|
| 2 |
+
|
| 3 |
+
_POSITIVE = {
|
| 4 |
+
"glad",
|
| 5 |
+
"love",
|
| 6 |
+
"lucky",
|
| 7 |
+
"happy",
|
| 8 |
+
"great",
|
| 9 |
+
"grateful",
|
| 10 |
+
"fun",
|
| 11 |
+
"wonderful",
|
| 12 |
+
"nice",
|
| 13 |
+
"amazing",
|
| 14 |
+
"delighted",
|
| 15 |
+
"pleased",
|
| 16 |
+
"yes",
|
| 17 |
+
"solid",
|
| 18 |
+
}
|
| 19 |
+
_NEGATIVE = {
|
| 20 |
+
"tired",
|
| 21 |
+
"hard",
|
| 22 |
+
"sorry",
|
| 23 |
+
"unfortunately",
|
| 24 |
+
"bad",
|
| 25 |
+
"awful",
|
| 26 |
+
"regrettably",
|
| 27 |
+
"difficult",
|
| 28 |
+
"frustrating",
|
| 29 |
+
"no",
|
| 30 |
+
"stop",
|
| 31 |
+
}
|
| 32 |
+
|
| 33 |
+
_AFFECT_TARGET = {
|
| 34 |
+
"HAPPY": 1.0,
|
| 35 |
+
"FRUSTRATED": -0.5,
|
| 36 |
+
"NEUTRAL": 0.0,
|
| 37 |
+
"SURPRISED": 0.0,
|
| 38 |
+
}
|
| 39 |
+
|
| 40 |
+
_GESTURE_OPENER_PATTERNS = {
|
| 41 |
+
"THUMBS_UP": re.compile(r"^\s*(yes|yeah|totally|for sure|absolutely|sure)\b", re.I),
|
| 42 |
+
"THUMBS_DOWN": re.compile(r"^\s*(no|nah|not really|i'd rather not)\b", re.I),
|
| 43 |
+
"WAVING": re.compile(r"^\s*(hi|hey|hello)\b", re.I),
|
| 44 |
+
}
|
| 45 |
+
|
| 46 |
+
|
| 47 |
+
def _tokens(text: str) -> set[str]:
|
| 48 |
+
return set(re.findall(r"\b[a-z]+\b", text.lower()))
|
| 49 |
+
|
| 50 |
+
|
| 51 |
+
def _sentiment_score(text: str) -> float:
|
| 52 |
+
toks = _tokens(text)
|
| 53 |
+
pos = len(toks & _POSITIVE)
|
| 54 |
+
neg = len(toks & _NEGATIVE)
|
| 55 |
+
if pos == 0 and neg == 0:
|
| 56 |
+
return 0.0
|
| 57 |
+
return (pos - neg) / (pos + neg)
|
| 58 |
+
|
| 59 |
+
|
| 60 |
+
def _affect_alignment(response: str, affect: str | None) -> float:
|
| 61 |
+
if not affect:
|
| 62 |
+
return 0.0
|
| 63 |
+
target = _AFFECT_TARGET.get(affect, 0.0)
|
| 64 |
+
score = _sentiment_score(response)
|
| 65 |
+
# distance in [0, 2] → similarity in [0, 1]
|
| 66 |
+
return max(0.0, 1.0 - abs(score - target) / 2.0)
|
| 67 |
+
|
| 68 |
+
|
| 69 |
+
def _gesture_alignment(response: str, gesture_tag: str | None) -> float:
|
| 70 |
+
if not gesture_tag:
|
| 71 |
+
return 0.0
|
| 72 |
+
pattern = _GESTURE_OPENER_PATTERNS.get(gesture_tag)
|
| 73 |
+
if pattern is None:
|
| 74 |
+
return 0.5 # gesture has no testable opener; give partial credit
|
| 75 |
+
return 1.0 if pattern.search(response) else 0.0
|
| 76 |
+
|
| 77 |
+
|
| 78 |
+
def _gaze_alignment(chunks: list[dict], gaze_bucket: str | None) -> float:
|
| 79 |
+
if not gaze_bucket or not chunks:
|
| 80 |
+
return 0.0
|
| 81 |
+
matches = sum(1 for c in chunks if c.get("bucket") == gaze_bucket)
|
| 82 |
+
return matches / len(chunks)
|
| 83 |
|
| 84 |
|
| 85 |
def compute_multimodal_alignment(
|
|
|
|
| 89 |
gaze_bucket: str | None,
|
| 90 |
chunks: list[dict],
|
| 91 |
) -> dict:
|
| 92 |
+
scores: dict[str, float] = {}
|
| 93 |
+
if affect:
|
| 94 |
+
scores["affect_alignment"] = _affect_alignment(response, affect)
|
| 95 |
+
if gesture_tag:
|
| 96 |
+
scores["gesture_alignment"] = _gesture_alignment(response, gesture_tag)
|
| 97 |
+
if gaze_bucket:
|
| 98 |
+
scores["gaze_alignment"] = _gaze_alignment(chunks, gaze_bucket)
|
| 99 |
+
overall = sum(scores.values()) / len(scores) if scores else 0.0
|
| 100 |
return {
|
| 101 |
+
"overall_score": round(overall, 4),
|
| 102 |
+
"affect_alignment": round(scores.get("affect_alignment", 0.0), 4),
|
| 103 |
+
"gesture_alignment": round(scores.get("gesture_alignment", 0.0), 4),
|
| 104 |
+
"gaze_alignment": round(scores.get("gaze_alignment", 0.0), 4),
|
| 105 |
}
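A worked example may help when reading the scores above. The sketch below restates the affect and gaze scoring from this file in self-contained form, with deliberately reduced lexicons (an assumption for brevity; the real `_POSITIVE` / `_NEGATIVE` sets are larger) and public names in place of the underscore-prefixed helpers.

```python
import re
from typing import Dict, List, Optional

POSITIVE = {"glad", "love", "great"}   # reduced subset of the real lexicon
NEGATIVE = {"tired", "hard", "sorry"}  # reduced subset of the real lexicon
AFFECT_TARGET = {"HAPPY": 1.0, "FRUSTRATED": -0.5, "NEUTRAL": 0.0, "SURPRISED": 0.0}


def sentiment_score(text: str) -> float:
    """(pos - neg) / (pos + neg) over distinct lexicon hits, in [-1, 1]."""
    toks = set(re.findall(r"\b[a-z]+\b", text.lower()))
    pos = len(toks & POSITIVE)
    neg = len(toks & NEGATIVE)
    if pos == 0 and neg == 0:
        return 0.0
    return (pos - neg) / (pos + neg)


def affect_alignment(response: str, affect: Optional[str]) -> float:
    """Map the distance between observed and target sentiment to [0, 1]."""
    if not affect:
        return 0.0
    target = AFFECT_TARGET.get(affect, 0.0)
    return max(0.0, 1.0 - abs(sentiment_score(response) - target) / 2.0)


def gaze_alignment(chunks: List[Dict], gaze_bucket: Optional[str]) -> float:
    """Fraction of retrieved chunks whose bucket matches the looked-at bucket."""
    if not gaze_bucket or not chunks:
        return 0.0
    return sum(1 for c in chunks if c.get("bucket") == gaze_bucket) / len(chunks)


reply = "I'm so glad, that was great"
print(affect_alignment(reply, "HAPPY"))       # 1.0  (score 1.0 vs target 1.0)
print(affect_alignment(reply, "FRUSTRATED"))  # 0.25 (score 1.0 vs target -0.5)
print(affect_alignment("The meeting is at noon", "HAPPY"))  # 0.5 (score 0.0)
```

Two properties follow from the formula: a reply with no lexicon hits scores 0.5 against HAPPY and a perfect 1.0 against NEUTRAL, and `overall_score` averages only the signals that were actually present on the turn.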
|
|
@@ -2,6 +2,7 @@
|
|
| 2 |
from __future__ import annotations
|
| 3 |
|
| 4 |
import argparse
|
|
|
|
| 5 |
import json
|
| 6 |
import os
|
| 7 |
import sys
|
|
@@ -10,6 +11,7 @@ import time
|
|
| 10 |
from backend.config.settings import settings
|
| 11 |
from backend.guardrails.checks import check_input
|
| 12 |
from backend.pipeline.graph import run_pipeline
|
|
|
|
| 13 |
from backend.pipeline.state import GenerationConfig, PipelineState
|
| 14 |
from backend.retrieval.bucket_priors import uniform_priors
|
| 15 |
from backend.retrieval.vector_store import _get_embedder
|
|
@@ -49,6 +51,7 @@ def _keyword_intent(query: str) -> tuple[dict, GenerationConfig]:
|
|
| 49 |
else "PERSONAL"
|
| 50 |
)
|
| 51 |
|
|
|
|
| 52 |
route = {
|
| 53 |
"sub_intents": [
|
| 54 |
{
|
|
@@ -66,12 +69,8 @@ def _keyword_intent(query: str) -> tuple[dict, GenerationConfig]:
|
|
| 66 |
},
|
| 67 |
"affect": "NEUTRAL",
|
| 68 |
}
|
| 69 |
-
|
| 70 |
-
|
| 71 |
-
"tone_tag": "[TONE:DEFAULT]",
|
| 72 |
-
"retrieval_mode": "full",
|
| 73 |
-
"persona_mod": "baseline",
|
| 74 |
-
}
|
| 75 |
return route, gen_config
|
| 76 |
|
| 77 |
|
|
|
|
| 2 |
from __future__ import annotations
|
| 3 |
|
| 4 |
import argparse
|
| 5 |
+
import copy
|
| 6 |
import json
|
| 7 |
import os
|
| 8 |
import sys
|
|
|
|
| 11 |
from backend.config.settings import settings
|
| 12 |
from backend.guardrails.checks import check_input
|
| 13 |
from backend.pipeline.graph import run_pipeline
|
| 14 |
+
from backend.pipeline.nodes.intent import _AFFECT_CONFIG
|
| 15 |
from backend.pipeline.state import GenerationConfig, PipelineState
|
| 16 |
from backend.retrieval.bucket_priors import uniform_priors
|
| 17 |
from backend.retrieval.vector_store import _get_embedder
|
|
|
|
| 51 |
else "PERSONAL"
|
| 52 |
)
|
| 53 |
|
| 54 |
+
# `style_constraints` is vestigial — planner reads `generation_config` (below) as the source of truth.
|
| 55 |
route = {
|
| 56 |
"sub_intents": [
|
| 57 |
{
|
|
|
|
| 69 |
},
|
| 70 |
"affect": "NEUTRAL",
|
| 71 |
}
|
| 72 |
+
# Deep-copy: callers may mutate gen_config downstream; never hand them the shared constant.
|
| 73 |
+
gen_config: GenerationConfig = copy.deepcopy(_AFFECT_CONFIG["NEUTRAL"])
|
| 74 |
return route, gen_config
|
| 75 |
|
| 76 |
|
|
@@ -1,6 +1,7 @@
|
|
| 1 |
# Intent decomposition node — regex-split fragments + BGE zero-shot classifier.
|
| 2 |
from __future__ import annotations
|
| 3 |
|
|
|
|
| 4 |
import re
|
| 5 |
import time
|
| 6 |
from functools import lru_cache
|
|
@@ -88,24 +89,65 @@ _AFFECT_CONFIG: dict[str, GenerationConfig] = {
|
|
| 88 |
"tone_tag": "[TONE:WARM]",
|
| 89 |
"retrieval_mode": "full",
|
| 90 |
"persona_mod": "amplify_quirks",
|
| 91 |
},
|
| 92 |
"FRUSTRATED": {
|
| 93 |
"max_tokens": settings.max_tokens_frustrated,
|
| 94 |
"tone_tag": "[TONE:DIRECT_EMPATHETIC]",
|
| 95 |
"retrieval_mode": "fast",
|
| 96 |
"persona_mod": "suppress_humor",
|
| 97 |
},
|
| 98 |
"NEUTRAL": {
|
| 99 |
"max_tokens": settings.max_tokens_neutral,
|
| 100 |
"tone_tag": "[TONE:DEFAULT]",
|
| 101 |
"retrieval_mode": "full",
|
| 102 |
"persona_mod": "baseline",
|
| 103 |
},
|
| 104 |
"SURPRISED": {
|
| 105 |
"max_tokens": settings.max_tokens_surprised,
|
| 106 |
"tone_tag": "[TONE:CLARIFYING]",
|
| 107 |
"retrieval_mode": "full",
|
| 108 |
"persona_mod": "add_confirmation",
|
| 109 |
},
|
| 110 |
}
|
| 111 |
|
|
@@ -185,7 +227,8 @@ def run(state: PipelineState) -> dict:
|
|
| 185 |
affect_state = state.get("affect") or {}
|
| 186 |
emotion: str = affect_state.get("emotion", "NEUTRAL")
|
| 187 |
query: str = state["raw_query"]
|
| 188 |
-
gen_config
|
|
|
|
| 189 |
|
| 190 |
fragments = _split_query(query)
|
| 191 |
priority = "fast" if emotion == "FRUSTRATED" else "normal"
|
|
|
|
| 1 |
# Intent decomposition node — regex-split fragments + BGE zero-shot classifier.
|
| 2 |
from __future__ import annotations
|
| 3 |
|
| 4 |
+
import copy
|
| 5 |
import re
|
| 6 |
import time
|
| 7 |
from functools import lru_cache
|
|
|
|
| 89 |
"tone_tag": "[TONE:WARM]",
|
| 90 |
"retrieval_mode": "full",
|
| 91 |
"persona_mod": "amplify_quirks",
|
| 92 |
+
"style": {
|
| 93 |
+
"tone_tag": "[TONE:WARM]",
|
| 94 |
+
"register": "warm, upbeat, affectionate",
|
| 95 |
+
"prefer_words": [
|
| 96 |
+
"glad",
|
| 97 |
+
"love",
|
| 98 |
+
"lucky",
|
| 99 |
+
"happy",
|
| 100 |
+
"great",
|
| 101 |
+
"grateful",
|
| 102 |
+
"fun",
|
| 103 |
+
],
|
| 104 |
+
"avoid_words": ["unfortunately", "frankly", "tired", "hard", "sorry"],
|
| 105 |
+
"opener_hint": None,
|
| 106 |
+
"exemplar": "Yeah — honestly, that made my week.",
|
| 107 |
+
},
|
| 108 |
},
|
| 109 |
"FRUSTRATED": {
|
| 110 |
"max_tokens": settings.max_tokens_frustrated,
|
| 111 |
"tone_tag": "[TONE:DIRECT_EMPATHETIC]",
|
| 112 |
"retrieval_mode": "fast",
|
| 113 |
"persona_mod": "suppress_humor",
|
| 114 |
+
"style": {
|
| 115 |
+
"tone_tag": "[TONE:DIRECT_EMPATHETIC]",
|
| 116 |
+
"register": "direct, short, validating — no jokes",
|
| 117 |
+
"prefer_words": ["okay", "yes", "right", "i hear you", "fair"],
|
| 118 |
+
"avoid_words": ["hilarious", "ha", "lol", "cheerful", "delightful"],
|
| 119 |
+
"opener_hint": "Acknowledge the feeling in 3-5 words before the answer.",
|
| 120 |
+
"exemplar": "Yeah. That's a lot. Short answer: yes.",
|
| 121 |
+
},
|
| 122 |
},
|
| 123 |
"NEUTRAL": {
|
| 124 |
"max_tokens": settings.max_tokens_neutral,
|
| 125 |
"tone_tag": "[TONE:DEFAULT]",
|
| 126 |
"retrieval_mode": "full",
|
| 127 |
"persona_mod": "baseline",
|
| 128 |
+
"style": {
|
| 129 |
+
"tone_tag": "[TONE:DEFAULT]",
|
| 130 |
+
"register": "natural, conversational",
|
| 131 |
+
"prefer_words": [],
|
| 132 |
+
"avoid_words": [],
|
| 133 |
+
"opener_hint": None,
|
| 134 |
+
# Empty on purpose — let the persona's own example_phrases carry the register.
|
| 135 |
+
"exemplar": "",
|
| 136 |
+
},
|
| 137 |
},
|
| 138 |
"SURPRISED": {
|
| 139 |
"max_tokens": settings.max_tokens_surprised,
|
| 140 |
"tone_tag": "[TONE:CLARIFYING]",
|
| 141 |
"retrieval_mode": "full",
|
| 142 |
"persona_mod": "add_confirmation",
|
| 143 |
+
"style": {
|
| 144 |
+
"tone_tag": "[TONE:CLARIFYING]",
|
| 145 |
+
"register": "curious, clarifying",
|
| 146 |
+
"prefer_words": ["really", "wait", "huh", "oh"],
|
| 147 |
+
"avoid_words": [],
|
| 148 |
+
"opener_hint": "Mirror surprise briefly, then ask a clarifying question.",
|
| 149 |
+
"exemplar": "Oh — wait, really? Did you mean the Friday one?",
|
| 150 |
+
},
|
| 151 |
},
|
| 152 |
}
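The planner side that turns one of these `style` entries into prompt text is not visible in this part of the diff, so the following is only a sketch of how such rendering could look, including the "gesture overrides the opener" rule from the commit message. The `render_style` function, the exact wording of the emitted lines, and the contents of the stand-in `GESTURE_DIRECTIVES` are all assumptions. The field names match the `style` dicts above.

```python
from typing import Dict, List, Optional, TypedDict


class StyleDirective(TypedDict):
    tone_tag: str
    register: str
    prefer_words: List[str]
    avoid_words: List[str]
    opener_hint: Optional[str]
    exemplar: str


# Hypothetical stand-in for backend/sensing/labels.py.
GESTURE_DIRECTIVES: Dict[str, Dict[str, str]] = {
    "THUMBS_UP": {"opener_hint": "Lead with an affirmation."},
}


def render_style(style: StyleDirective, gesture_tag: Optional[str] = None) -> str:
    """Render a directive as explicit instruction lines for the user message."""
    lines = [f"Register: {style['register']}."]
    if style["prefer_words"]:
        lines.append("Prefer words like: " + ", ".join(style["prefer_words"]) + ".")
    if style["avoid_words"]:
        lines.append("Avoid: " + ", ".join(style["avoid_words"]) + ".")

    # A detected gesture's opener takes precedence over the affect opener.
    opener = style["opener_hint"]
    gesture = GESTURE_DIRECTIVES.get(gesture_tag) if gesture_tag else None
    if gesture and gesture.get("opener_hint"):
        opener = gesture["opener_hint"]
    if opener:
        lines.append(f"Opener: {opener}")

    exemplar = style["exemplar"]
    if exemplar:  # NEUTRAL leaves this empty on purpose
        lines.append(f'Example of the target voice: "{exemplar}"')
    return "\n".join(lines)


frustrated: StyleDirective = {
    "tone_tag": "[TONE:DIRECT_EMPATHETIC]",
    "register": "direct, short, validating",
    "prefer_words": ["okay", "yes", "right"],
    "avoid_words": ["hilarious", "lol"],
    "opener_hint": "Acknowledge the feeling in 3-5 words before the answer.",
    "exemplar": "Yeah. That's a lot. Short answer: yes.",
}
print(render_style(frustrated, gesture_tag="THUMBS_UP"))
```

In this sketch the FRUSTRATED opener is replaced by the thumbs-up opener when both are present, and empty lists or an empty exemplar simply produce no line.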
|
| 153 |
|
|
|
|
| 227 |
affect_state = state.get("affect") or {}
|
| 228 |
emotion: str = affect_state.get("emotion", "NEUTRAL")
|
| 229 |
query: str = state["raw_query"]
|
| 230 |
+
# Deep-copy: callers may mutate gen_config downstream; never hand them the shared constant.
|
| 231 |
+
gen_config = copy.deepcopy(_AFFECT_CONFIG.get(emotion, _AFFECT_CONFIG["NEUTRAL"]))
|
| 232 |
|
| 233 |
fragments = _split_query(query)
|
| 234 |
priority = "fast" if emotion == "FRUSTRATED" else "normal"
|
|
@@ -1,30 +1,35 @@
|
|
| 1 |
-
# Planner node — prompt building, candidate generation, composite ranking.
|
| 2 |
-
from __future__ import annotations
|
| 3 |
-
|
| 4 |
import time
|
| 5 |
|
| 6 |
from backend.config.settings import settings
|
| 7 |
from backend.generation.llm_client import active_model, chat_complete
|
| 8 |
from backend.guardrails.checks import check_output
|
| 9 |
from backend.pipeline.intent_kind import classify_intent_kind
|
| 10 |
-
from backend.pipeline.state import PipelineState
|
| 11 |
-
from backend.sensing.labels import
|
| 12 |
-
|
| 13 |
-
|
| 14 |
-
|
| 15 |
-
|
| 16 |
-
"
|
| 17 |
-
|
| 18 |
-
|
| 19 |
-
|
| 20 |
-
|
| 21 |
-
"
|
| 22 |
-
"
|
| 23 |
-
|
| 24 |
-
"
|
| 25 |
-
"
|
| 26 |
-
"
|
| 27 |
-
|
| 28 |
}
|
| 29 |
|
| 30 |
|
|
@@ -36,22 +41,16 @@ def run_fallback(state: PipelineState) -> dict:
|
|
| 36 |
return _run(state, tier="fallback")
|
| 37 |
|
| 38 |
|
| 39 |
-
# ── Core implementation ────────────────────────────────────────────────────────
|
| 40 |
-
|
| 41 |
-
|
| 42 |
def _run(state: PipelineState, tier: str) -> dict:
|
| 43 |
t0 = time.perf_counter()
|
| 44 |
|
| 45 |
profile = state["persona_profile"]
|
| 46 |
-
user_id = state["user_id"]
|
| 47 |
affect = (state.get("affect") or {}).get("emotion", "NEUTRAL")
|
| 48 |
gen_cfg = state.get("generation_config") or {}
|
| 49 |
chunks = state.get("retrieved_chunks") or []
|
| 50 |
history = (state.get("session_history") or [])[-20:]
|
| 51 |
|
| 52 |
-
|
| 53 |
-
user_id, affect, gen_cfg.get("tone_tag", "[TONE:DEFAULT]")
|
| 54 |
-
)
|
| 55 |
gesture_tag = state.get("gesture_tag")
|
| 56 |
air_written_text = state.get("air_written_text")
|
| 57 |
turnaround_triggered = state.get("turnaround_triggered", False)
|
|
@@ -64,7 +63,7 @@ def _run(state: PipelineState, tier: str) -> dict:
|
|
| 64 |
chunks,
|
| 65 |
history,
|
| 66 |
state["raw_query"],
|
| 67 |
-
|
| 68 |
gen_cfg,
|
| 69 |
gesture_tag=gesture_tag,
|
| 70 |
air_written_text=air_written_text,
|
|
@@ -76,7 +75,7 @@ def _run(state: PipelineState, tier: str) -> dict:
|
|
| 76 |
selected = chat_complete(
|
| 77 |
messages=messages,
|
| 78 |
max_tokens=gen_cfg.get("max_tokens", settings.max_tokens_neutral),
|
| 79 |
-
temperature=0.
|
| 80 |
tier=tier,
|
| 81 |
)
|
| 82 |
|
|
@@ -95,7 +94,7 @@ def _run(state: PipelineState, tier: str) -> dict:
|
|
| 95 |
4,
|
| 96 |
)
|
| 97 |
|
| 98 |
-
augmented_prompt = "\n\n".join(m[
|
| 99 |
return {
|
| 100 |
"augmented_prompt": augmented_prompt,
|
| 101 |
"candidates": [selected],
|
|
@@ -107,8 +106,8 @@ def _run(state: PipelineState, tier: str) -> dict:
|
|
| 107 |
}
|
| 108 |
|
| 109 |
|
| 110 |
-
def
|
| 111 |
-
return
|
| 112 |
|
| 113 |
|
| 114 |
_AFFECT_HINTS = {
|
|
@@ -124,7 +123,7 @@ def _build_messages(
|
|
| 124 |
chunks: list[dict],
|
| 125 |
history: list[dict],
|
| 126 |
query: str,
|
| 127 |
-
|
| 128 |
gen_cfg: dict,
|
| 129 |
gesture_tag: str | None = None,
|
| 130 |
air_written_text: str | None = None,
|
|
@@ -141,7 +140,7 @@ def _build_messages(
|
|
| 141 |
chunks,
|
| 142 |
history,
|
| 143 |
query,
|
| 144 |
-
|
| 145 |
gen_cfg,
|
| 146 |
gesture_tag,
|
| 147 |
air_written_text,
|
|
@@ -196,37 +195,11 @@ Answering rules:
|
|
| 196 |
--- end character sheet ---"""
|
| 197 |
|
| 198 |
|
| 199 |
-
_PERSONA_MOD_INSTRUCTIONS = {
|
| 200 |
-
"amplify_quirks": "Amplify your characteristic style and personality.",
|
| 201 |
-
"suppress_humor": "Be direct and supportive. Suppress humor.",
|
| 202 |
-
"baseline": "Use your natural communication style.",
|
| 203 |
-
"add_confirmation": "Add a clarifying question or confirmation at the end.",
|
| 204 |
-
"turnaround": (
|
| 205 |
-
"Your previous reply missed what you actually meant. Rephrase "
|
| 206 |
-
"more directly — change the wording meaningfully, not just "
|
| 207 |
-
"surface tweaks — and end with a one-sentence clarifying "
|
| 208 |
-
"question to confirm you're on the right track."
|
| 209 |
-
),
|
| 210 |
-
"reverse_stance": (
|
| 211 |
-
"Your previous reply was substantively wrong — not poorly worded, "
|
| 212 |
-
"but the wrong content. Take a meaningfully different stance using "
|
| 213 |
-
"the available memories or, if none fit, honestly say you don't "
|
| 214 |
-
"know. Do NOT just reword the previous reply."
|
| 215 |
-
),
|
| 216 |
-
"present_state_retry": (
|
| 217 |
-
"Your previous reply was wrong about your current state. The "
|
| 218 |
-
"affect signal probably misled you. Either flip the emotional "
|
| 219 |
-
"read (if you said 'good', try 'not great') or honestly admit "
|
| 220 |
-
"you're not sure how you feel right now. Do NOT invent details."
|
| 221 |
-
),
|
| 222 |
-
}
|
| 223 |
-
|
| 224 |
-
|
| 225 |
def _build_user(
|
| 226 |
chunks: list[dict],
|
| 227 |
history: list[dict],
|
| 228 |
query: str,
|
| 229 |
-
|
| 230 |
gen_cfg: dict,
|
| 231 |
gesture_tag: str | None,
|
| 232 |
air_written_text: str | None,
|
|
@@ -262,20 +235,41 @@ def _build_user(
|
|
| 262 |
or " (start of session)"
|
| 263 |
)
|
| 264 |
|
| 265 |
-
|
| 266 |
if gesture_tag:
|
| 267 |
-
|
| 268 |
-
|
|
|
|
|
|
|
| 269 |
|
| 270 |
-
|
| 271 |
if air_written_text:
|
| 272 |
-
|
| 273 |
|
| 274 |
-
|
| 275 |
-
|
| 276 |
-
_PERSONA_MOD_INSTRUCTIONS[
|
|
|
|
|
|
|
| 277 |
)
|
| 278 |
|
| 279 |
turnaround_line = ""
|
| 280 |
if rejected_response:
|
| 281 |
safe_rejected = rejected_response.replace('"', "'").replace("\n", " ")[:300]
|
|
@@ -287,8 +281,7 @@ def _build_user(
|
|
| 287 |
if intent_kind == "present_state":
|
| 288 |
affect_hint = _AFFECT_HINTS.get(affect, _AFFECT_HINTS["NEUTRAL"])
|
| 289 |
return f"""\
|
| 290 |
-
{
|
| 291 |
-
{persona_instruction}
|
| 292 |
|
| 293 |
The partner is asking about your present state (right now, today).
|
| 294 |
Your autobiographical memories do NOT contain this — do not fabricate details from them.
|
|
@@ -307,8 +300,7 @@ Reply as {persona_name} in 1–2 sentences, first person.
|
|
| 307 |
- Do NOT use autobiographical facts (job, family, hobbies) unless the partner asked."""
|
| 308 |
|
| 309 |
return f"""\
|
| 310 |
-
{
|
| 311 |
-
{persona_instruction}
|
| 312 |
|
| 313 |
Personal memories:
|
| 314 |
{memory_block}
|
| 1 |
import time
|
| 2 |
|
| 3 |
from backend.config.settings import settings
|
| 4 |
from backend.generation.llm_client import active_model, chat_complete
|
| 5 |
from backend.guardrails.checks import check_output
|
| 6 |
from backend.pipeline.intent_kind import classify_intent_kind
|
| 7 |
+
from backend.pipeline.state import PipelineState, StyleDirective
|
| 8 |
+
from backend.sensing.labels import GESTURE_DIRECTIVES
|
| 9 |
+
|
| 10 |
+
_PERSONA_MOD_INSTRUCTIONS = {
|
| 11 |
+
"amplify_quirks": "Amplify your characteristic style and personality.",
|
| 12 |
+
"suppress_humor": "Be direct and supportive. Suppress humor.",
|
| 13 |
+
"baseline": "Use your natural communication style.",
|
| 14 |
+
"add_confirmation": "Add a clarifying question or confirmation at the end.",
|
| 15 |
+
"turnaround": (
|
| 16 |
+
"Your previous reply missed what you actually meant. Rephrase "
|
| 17 |
+
"more directly — change the wording meaningfully, not just "
|
| 18 |
+
"surface tweaks — and end with a one-sentence clarifying "
|
| 19 |
+
"question to confirm you're on the right track."
|
| 20 |
+
),
|
| 21 |
+
"reverse_stance": (
|
| 22 |
+
"Your previous reply was substantively wrong — not poorly worded, "
|
| 23 |
+
"but the wrong content. Take a meaningfully different stance using "
|
| 24 |
+
"the available memories or, if none fit, honestly say you don't "
|
| 25 |
+
"know. Do NOT just reword the previous reply."
|
| 26 |
+
),
|
| 27 |
+
"present_state_retry": (
|
| 28 |
+
"Your previous reply was wrong about your current state. The "
|
| 29 |
+
"affect signal probably misled you. Either flip the emotional "
|
| 30 |
+
"read (if you said 'good', try 'not great') or honestly admit "
|
| 31 |
+
"you're not sure how you feel right now. Do NOT invent details."
|
| 32 |
+
),
|
| 33 |
}
|
| 34 |
|
| 35 |
|
|
|
|
| 41 |
return _run(state, tier="fallback")
|
| 42 |
|
| 43 |
|
|
|
|
|
|
|
|
|
|
| 44 |
def _run(state: PipelineState, tier: str) -> dict:
|
| 45 |
t0 = time.perf_counter()
|
| 46 |
|
| 47 |
profile = state["persona_profile"]
|
|
|
|
| 48 |
affect = (state.get("affect") or {}).get("emotion", "NEUTRAL")
|
| 49 |
gen_cfg = state.get("generation_config") or {}
|
| 50 |
chunks = state.get("retrieved_chunks") or []
|
| 51 |
history = (state.get("session_history") or [])[-20:]
|
| 52 |
|
| 53 |
+
style: StyleDirective = gen_cfg["style"]
|
|
|
|
|
|
|
| 54 |
gesture_tag = state.get("gesture_tag")
|
| 55 |
air_written_text = state.get("air_written_text")
|
| 56 |
turnaround_triggered = state.get("turnaround_triggered", False)
|
|
|
|
| 63 |
chunks,
|
| 64 |
history,
|
| 65 |
state["raw_query"],
|
| 66 |
+
style,
|
| 67 |
gen_cfg,
|
| 68 |
gesture_tag=gesture_tag,
|
| 69 |
air_written_text=air_written_text,
|
|
|
|
| 75 |
selected = chat_complete(
|
| 76 |
messages=messages,
|
| 77 |
max_tokens=gen_cfg.get("max_tokens", settings.max_tokens_neutral),
|
| 78 |
+
temperature=0.8,
|
| 79 |
tier=tier,
|
| 80 |
)
|
| 81 |
|
|
|
|
| 94 |
4,
|
| 95 |
)
|
| 96 |
|
| 97 |
+
augmented_prompt = "\n\n".join(f"[{m['role']}] {m['content']}" for m in messages)
|
| 98 |
return {
|
| 99 |
"augmented_prompt": augmented_prompt,
|
| 100 |
"candidates": [selected],
|
|
|
|
| 106 |
}
|
| 107 |
|
| 108 |
|
| 109 |
+
def _format_word_list(words: list[str]) -> str:
|
| 110 |
+
return ", ".join(words) if words else "(no constraint)"
|
| 111 |
|
| 112 |
|
| 113 |
_AFFECT_HINTS = {
|
|
|
|
| 123 |
chunks: list[dict],
|
| 124 |
history: list[dict],
|
| 125 |
query: str,
|
| 126 |
+
style: StyleDirective,
|
| 127 |
gen_cfg: dict,
|
| 128 |
gesture_tag: str | None = None,
|
| 129 |
air_written_text: str | None = None,
|
|
|
|
| 140 |
chunks,
|
| 141 |
history,
|
| 142 |
query,
|
| 143 |
+
style,
|
| 144 |
gen_cfg,
|
| 145 |
gesture_tag,
|
| 146 |
air_written_text,
|
|
|
|
| 195 |
--- end character sheet ---"""
|
| 196 |
|
| 197 |
|
| 198 |
def _build_user(
|
| 199 |
chunks: list[dict],
|
| 200 |
history: list[dict],
|
| 201 |
query: str,
|
| 202 |
+
style: StyleDirective,
|
| 203 |
gen_cfg: dict,
|
| 204 |
gesture_tag: str | None,
|
| 205 |
air_written_text: str | None,
|
|
|
|
| 235 |
or " (start of session)"
|
| 236 |
)
|
| 237 |
|
| 238 |
+
merged_opener = style.get("opener_hint")
|
| 239 |
if gesture_tag:
|
| 240 |
+
directive = GESTURE_DIRECTIVES.get(gesture_tag)
|
| 241 |
+
if directive:
|
| 242 |
+
# Gesture opener wins over affect opener — a deliberate thumbs-up is a stronger signal than inferred affect.
|
| 243 |
+
merged_opener = directive["opener_hint"]
|
| 244 |
|
| 245 |
+
air_writing_block = ""
|
| 246 |
if air_written_text:
|
| 247 |
+
air_writing_block = (
|
| 248 |
+
f'\nThe user air-wrote: "{air_written_text}". '
|
| 249 |
+
"If this looks like a name, noun, or short phrase, "
|
| 250 |
+
"incorporate it verbatim into your response; "
|
| 251 |
+
"otherwise use it as a hint about what they're trying to say."
|
| 252 |
+
)
|
| 253 |
|
| 254 |
+
persona_mod = gen_cfg.get("persona_mod", "baseline")
|
| 255 |
+
persona_instruction_line = (
|
| 256 |
+
f"\n{_PERSONA_MOD_INSTRUCTIONS[persona_mod]}"
|
| 257 |
+
if persona_mod in _PERSONA_MOD_INSTRUCTIONS and persona_mod != "baseline"
|
| 258 |
+
else ""
|
| 259 |
)
|
| 260 |
|
| 261 |
+
directive_lines = [
|
| 262 |
+
f"- Register: {style['register']}",
|
| 263 |
+
f"- Prefer words like: {_format_word_list(style['prefer_words'])}",
|
| 264 |
+
f"- Avoid words like: {_format_word_list(style['avoid_words'])}",
|
| 265 |
+
f"- Opener: {merged_opener or 'no constraint'}",
|
| 266 |
+
]
|
| 267 |
+
if style.get("exemplar"):
|
| 268 |
+
directive_lines.append(
|
| 269 |
+
f'- In this register, a sentence sounds like: "{style["exemplar"]}"'
|
| 270 |
+
)
|
| 271 |
+
directive_block = "Style directive:\n" + "\n".join(directive_lines)
|
| 272 |
+
|
| 273 |
turnaround_line = ""
|
| 274 |
if rejected_response:
|
| 275 |
safe_rejected = rejected_response.replace('"', "'").replace("\n", " ")[:300]
|
|
|
|
| 281 |
if intent_kind == "present_state":
|
| 282 |
affect_hint = _AFFECT_HINTS.get(affect, _AFFECT_HINTS["NEUTRAL"])
|
| 283 |
return f"""\
|
| 284 |
+
{directive_block}{air_writing_block}{turnaround_line}{persona_instruction_line}
|
|
|
|
| 285 |
|
| 286 |
The partner is asking about your present state (right now, today).
|
| 287 |
Your autobiographical memories do NOT contain this — do not fabricate details from them.
|
|
|
|
| 300 |
- Do NOT use autobiographical facts (job, family, hobbies) unless the partner asked."""
|
| 301 |
|
| 302 |
return f"""\
|
| 303 |
+
{directive_block}{air_writing_block}{turnaround_line}{persona_instruction_line}
|
|
|
|
| 304 |
|
| 305 |
Personal memories:
|
| 306 |
{memory_block}
|
|
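The opener-merging and directive rendering added to `_build_user` above can be tried in isolation. The following is a minimal sketch, not the planner's actual code: `render_directive` and the `HAPPY_STYLE` values are hypothetical, and only the `THUMBS_UP` entry of `GESTURE_DIRECTIVES` is reproduced.

```python
from typing import Optional, TypedDict


class StyleDirective(TypedDict):
    tone_tag: str
    register: str
    prefer_words: list[str]
    avoid_words: list[str]
    opener_hint: Optional[str]
    exemplar: str


# Only one gesture is reproduced here; the diff defines four.
GESTURE_DIRECTIVES = {
    "THUMBS_UP": {
        "tone": "[GESTURE:THUMBS_UP][TONE:AFFIRMATIVE]",
        "opener_hint": "Open with an affirmation (Yes / Totally / For sure).",
    },
}

# Hypothetical directive values, for illustration only.
HAPPY_STYLE: StyleDirective = {
    "tone_tag": "[TONE:WARM]",
    "register": "warm, upbeat, affectionate",
    "prefer_words": ["love", "great"],
    "avoid_words": ["unfortunately"],
    "opener_hint": "Start on a bright note.",
    "exemplar": "I love that, it made my day.",
}


def format_word_list(words: list[str]) -> str:
    return ", ".join(words) if words else "(no constraint)"


def render_directive(style: StyleDirective, gesture_tag: Optional[str] = None) -> str:
    """Render the style block; a known gesture replaces the affect opener."""
    opener = style.get("opener_hint")
    directive = GESTURE_DIRECTIVES.get(gesture_tag) if gesture_tag else None
    if directive:
        opener = directive["opener_hint"]
    lines = [
        f"- Register: {style['register']}",
        f"- Prefer words like: {format_word_list(style['prefer_words'])}",
        f"- Avoid words like: {format_word_list(style['avoid_words'])}",
        f"- Opener: {opener or 'no constraint'}",
    ]
    if style.get("exemplar"):
        lines.append(f'- In this register, a sentence sounds like: "{style["exemplar"]}"')
    return "Style directive:\n" + "\n".join(lines)


if __name__ == "__main__":
    print(render_directive(HAPPY_STYLE, gesture_tag="THUMBS_UP"))
```

An unrecognised gesture tag falls through to the affect opener, which matches the `if directive:` guard in the diff.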
@@ -43,14 +43,25 @@ class IntentRoute(TypedDict):
|
|
| 43 |
affect: str
|
| 44 |
|
| 45 |
|
| 46 |
class GenerationConfig(TypedDict):
|
| 47 |
max_tokens: int
|
| 48 |
-
tone_tag: str #
|
| 49 |
retrieval_mode: str # "fast" | "full"
|
| 50 |
persona_mod: str
|
| 51 |
# persona_mod values:
|
| 52 |
# "amplify_quirks" | "suppress_humor" | "baseline"
|
| 53 |
# | "add_confirmation" | "turnaround"
|
|
|
|
|
|
|
| 54 |
|
| 55 |
|
| 56 |
class LatencyLog(TypedDict):
|
|
|
|
| 43 |
affect: str
|
| 44 |
|
| 45 |
|
| 46 |
+
class StyleDirective(TypedDict):
|
| 47 |
+
tone_tag: str # e.g. "[TONE:WARM]" — kept for logging + eval
|
| 48 |
+
register: str # short register phrase, e.g. "warm, upbeat, affectionate"
|
| 49 |
+
prefer_words: list[str] # lexical bias — words to steer toward
|
| 50 |
+
avoid_words: list[str] # anti-patterns — words to steer away from
|
| 51 |
+
opener_hint: str | None # structural hint for the opening clause
|
| 52 |
+
exemplar: str # one short sentence in the target register
|
| 53 |
+
|
| 54 |
+
|
| 55 |
class GenerationConfig(TypedDict):
|
| 56 |
max_tokens: int
|
| 57 |
+
tone_tag: str # legacy tag (kept in sync with style["tone_tag"] for existing log consumers)
|
| 58 |
retrieval_mode: str # "fast" | "full"
|
| 59 |
persona_mod: str
|
| 60 |
# persona_mod values:
|
| 61 |
# "amplify_quirks" | "suppress_humor" | "baseline"
|
| 62 |
# | "add_confirmation" | "turnaround"
|
| 63 |
+
# | "reverse_stance" | "present_state_retry"
|
| 64 |
+
style: StyleDirective
|
| 65 |
|
| 66 |
|
| 67 |
class LatencyLog(TypedDict):
|
|
@@ -1,9 +1,26 @@
|
|
| 1 |
_BUCKET_KEYWORDS: list[tuple[str, tuple[str, ...]]] = [
|
| 2 |
-
|
| 3 |
("family", ("family", "mom", "dad", "brother", "sister", "parents")),
|
| 4 |
("hobbies", ("hobby", "like to do", "enjoy", "weekend", "fun")),
|
| 5 |
-
(
|
| 6 |
-
|
|
|
|
|
|
|
|
|
|
| 7 |
]
|
| 8 |
|
| 9 |
|
|
|
|
| 1 |
_BUCKET_KEYWORDS: list[tuple[str, tuple[str, ...]]] = [
|
| 2 |
+
# AAC air-writing templates (help/water/stop/done/more) are mapped here too —
|
| 3 |
+
# when a partner/user signals one of these, retrieval pulls from the matching bucket.
|
| 4 |
+
(
|
| 5 |
+
"medical",
|
| 6 |
+
(
|
| 7 |
+
"medication",
|
| 8 |
+
"medicine",
|
| 9 |
+
"doctor",
|
| 10 |
+
"health",
|
| 11 |
+
"allergic",
|
| 12 |
+
"therapy",
|
| 13 |
+
"help",
|
| 14 |
+
"stop",
|
| 15 |
+
),
|
| 16 |
+
),
|
| 17 |
("family", ("family", "mom", "dad", "brother", "sister", "parents")),
|
| 18 |
("hobbies", ("hobby", "like to do", "enjoy", "weekend", "fun")),
|
| 19 |
+
(
|
| 20 |
+
"daily_routine",
|
| 21 |
+
("routine", "morning", "wake", "sleep", "daily", "water", "done", "more"),
|
| 22 |
+
),
|
| 23 |
+
("social", ("friend", "social", "people", "party", "community", "hi")),
|
| 24 |
]
|
| 25 |
|
| 26 |
|
|
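How the keyword table is consumed is not shown in this hunk. The sketch below assumes a first-match lookup over the table in order; `match_bucket` is a hypothetical name, and whole-word matching is a choice made for this sketch rather than something the diff confirms.

```python
import re
from typing import Optional

# Copied from the new version of the table in this diff.
BUCKET_KEYWORDS: list[tuple[str, tuple[str, ...]]] = [
    ("medical", ("medication", "medicine", "doctor", "health",
                 "allergic", "therapy", "help", "stop")),
    ("family", ("family", "mom", "dad", "brother", "sister", "parents")),
    ("hobbies", ("hobby", "like to do", "enjoy", "weekend", "fun")),
    ("daily_routine", ("routine", "morning", "wake", "sleep", "daily",
                       "water", "done", "more")),
    ("social", ("friend", "social", "people", "party", "community", "hi")),
]


def match_bucket(text: str) -> Optional[str]:
    """Return the first bucket with a keyword that appears as a whole word or phrase."""
    lowered = text.lower()
    for bucket, keywords in BUCKET_KEYWORDS:
        for keyword in keywords:
            if re.search(rf"\b{re.escape(keyword)}\b", lowered):
                return bucket
    return None


if __name__ == "__main__":
    for sample in ("water", "help", "hi", "this is nothing"):
        print(sample, "->", match_bucket(sample))
```

Word boundaries matter for the short air-writing labels: with plain substring matching, "hi" would also fire inside "this" and "more" inside "anymore".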
@@ -1,6 +1,18 @@
|
|
| 1 |
-
|
| 2 |
-
"THUMBS_UP":
|
| 3 |
-
|
| 4 |
-
|
| 5 |
-
|
| 6 |
}
|
|
|
|
| 1 |
+
GESTURE_DIRECTIVES: dict[str, dict[str, str]] = {
|
| 2 |
+
"THUMBS_UP": {
|
| 3 |
+
"tone": "[GESTURE:THUMBS_UP][TONE:AFFIRMATIVE]",
|
| 4 |
+
"opener_hint": "Open with an affirmation (Yes / Totally / For sure).",
|
| 5 |
+
},
|
| 6 |
+
"THUMBS_DOWN": {
|
| 7 |
+
"tone": "[GESTURE:THUMBS_DOWN][TONE:NEGATIVE]",
|
| 8 |
+
"opener_hint": "Open by declining or disagreeing briefly.",
|
| 9 |
+
},
|
| 10 |
+
"POINTING": {
|
| 11 |
+
"tone": "[GESTURE:POINTING][INTENT:REFERENTIAL]",
|
| 12 |
+
"opener_hint": "Treat the query as referring to a specific named thing.",
|
| 13 |
+
},
|
| 14 |
+
"WAVING": {
|
| 15 |
+
"tone": "[GESTURE:WAVING][INTENT:GREETING]",
|
| 16 |
+
"opener_hint": "Open with a greeting.",
|
| 17 |
+
},
|
| 18 |
}
|
|
@@ -13,6 +13,7 @@ import {
|
|
| 13 |
AirWriter,
|
| 14 |
HeadPoseTracker,
|
| 15 |
} from "../lib/sensing";
|
|
|
|
| 16 |
|
| 17 |
const EMA_ALPHA = 0.3;
|
| 18 |
|
|
@@ -20,11 +21,12 @@ export function useSensing() {
|
|
| 20 |
const faceLandmarkerRef = useRef<FaceLandmarker | null>(null);
|
| 21 |
const handLandmarkerRef = useRef<HandLandmarker | null>(null);
|
| 22 |
const gazeTrackerRef = useRef(new GazeTracker());
|
| 23 |
-
const airWriterRef = useRef(new AirWriter());
|
| 24 |
const headTrackerRef = useRef(new HeadPoseTracker());
|
| 25 |
const calibratePendingRef = useRef(false);
|
| 26 |
const headDebugRef = useRef({ dx: 0, dy: 0, maxAbsDx: 0, maxAbsDy: 0, crossings: 0 });
|
| 27 |
const neutralLCPRef = useRef<number | null>(null);
|
|
|
|
| 28 |
const smoothedRef = useRef({ MAR: 0, EAR: 0.3, BRI: -0.3, LCP: 0 });
|
| 29 |
const initingRef = useRef(false);
|
| 30 |
const [ready, setReady] = useState(false);
|
|
@@ -108,9 +110,18 @@ export function useSensing() {
|
|
| 108 |
if (faceResult.faceLandmarks && faceResult.faceLandmarks.length > 0) {
|
| 109 |
const landmarks = faceResult.faceLandmarks[0];
|
| 110 |
|
| 111 |
if (neutralLCPRef.current === null) {
|
| 112 |
-
|
| 113 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 114 |
}
|
| 115 |
|
| 116 |
if (calibratePendingRef.current) {
|
|
@@ -118,18 +129,21 @@ export function useSensing() {
|
|
| 118 |
calibratePendingRef.current = false;
|
| 119 |
}
|
| 120 |
|
| 121 |
-
|
|
|
|
| 122 |
|
| 123 |
-
|
| 124 |
-
|
| 125 |
-
|
| 126 |
-
|
| 127 |
-
|
| 128 |
-
|
| 129 |
-
|
| 130 |
-
|
|
|
|
|
|
|
|
|
|
| 131 |
|
| 132 |
-
affect = classifyAffect(smoothed);
|
| 133 |
gazeBucket = gazeTrackerRef.current.process(landmarks);
|
| 134 |
headSignal = headTrackerRef.current.process(landmarks);
|
| 135 |
headDebugRef.current = headTrackerRef.current.debug;
|
|
@@ -182,6 +196,7 @@ export function useSensing() {
|
|
| 182 |
|
| 183 |
const resetCalibration = useCallback(() => {
|
| 184 |
neutralLCPRef.current = null;
|
|
|
|
| 185 |
smoothedRef.current = { MAR: 0, EAR: 0.3, BRI: -0.3, LCP: 0 };
|
| 186 |
gazeTrackerRef.current.reset();
|
| 187 |
headTrackerRef.current.reset();
|
|
|
|
| 13 |
AirWriter,
|
| 14 |
HeadPoseTracker,
|
| 15 |
} from "../lib/sensing";
|
| 16 |
+
import { DEFAULT_AIR_TEMPLATES } from "../lib/airTemplates";
|
| 17 |
|
| 18 |
const EMA_ALPHA = 0.3;
|
| 19 |
|
|
|
|
| 21 |
const faceLandmarkerRef = useRef<FaceLandmarker | null>(null);
|
| 22 |
const handLandmarkerRef = useRef<HandLandmarker | null>(null);
|
| 23 |
const gazeTrackerRef = useRef(new GazeTracker());
|
| 24 |
+
const airWriterRef = useRef(new AirWriter(DEFAULT_AIR_TEMPLATES));
|
| 25 |
const headTrackerRef = useRef(new HeadPoseTracker());
|
| 26 |
const calibratePendingRef = useRef(false);
|
| 27 |
const headDebugRef = useRef({ dx: 0, dy: 0, maxAbsDx: 0, maxAbsDy: 0, crossings: 0 });
|
| 28 |
const neutralLCPRef = useRef<number | null>(null);
|
| 29 |
+
const calibBufferRef = useRef<number[]>([]);
|
| 30 |
const smoothedRef = useRef({ MAR: 0, EAR: 0.3, BRI: -0.3, LCP: 0 });
|
| 31 |
const initingRef = useRef(false);
|
| 32 |
const [ready, setReady] = useState(false);
|
|
|
|
| 110 |
if (faceResult.faceLandmarks && faceResult.faceLandmarks.length > 0) {
|
| 111 |
const landmarks = faceResult.faceLandmarks[0];
|
| 112 |
|
| 113 |
+
// Average the raw LCP (vertical corner pull, pre-offset) over ~30 frames
|
| 114 |
+
// of the user's face before locking neutral. Single-frame calibration is
|
| 115 |
+
// too noisy and tended to bake in a momentary smile as "neutral".
|
| 116 |
+
// During calibration, affect stays null but gaze/head/gesture still flow.
|
| 117 |
if (neutralLCPRef.current === null) {
|
| 118 |
+
const raw0 = computeAffectVector(landmarks, 0);
|
| 119 |
+
calibBufferRef.current.push(raw0.LCP);
|
| 120 |
+
if (calibBufferRef.current.length >= 30) {
|
| 121 |
+
const sum = calibBufferRef.current.reduce((a, b) => a + b, 0);
|
| 122 |
+
neutralLCPRef.current = sum / calibBufferRef.current.length;
|
| 123 |
+
calibBufferRef.current = [];
|
| 124 |
+
}
|
| 125 |
}
|
| 126 |
|
| 127 |
if (calibratePendingRef.current) {
|
|
|
|
| 129 |
calibratePendingRef.current = false;
|
| 130 |
}
|
| 131 |
|
| 132 |
+
if (neutralLCPRef.current !== null) {
|
| 133 |
+
const raw = computeAffectVector(landmarks, neutralLCPRef.current);
|
| 134 |
|
| 135 |
+
const prev = smoothedRef.current;
|
| 136 |
+
const smoothed = {
|
| 137 |
+
MAR: EMA_ALPHA * raw.MAR + (1 - EMA_ALPHA) * prev.MAR,
|
| 138 |
+
EAR: EMA_ALPHA * raw.EAR + (1 - EMA_ALPHA) * prev.EAR,
|
| 139 |
+
BRI: EMA_ALPHA * raw.BRI + (1 - EMA_ALPHA) * prev.BRI,
|
| 140 |
+
LCP: EMA_ALPHA * raw.LCP + (1 - EMA_ALPHA) * prev.LCP,
|
| 141 |
+
};
|
| 142 |
+
smoothedRef.current = smoothed;
|
| 143 |
+
|
| 144 |
+
affect = classifyAffect(smoothed);
|
| 145 |
+
}
|
| 146 |
|
|
|
|
| 147 |
gazeBucket = gazeTrackerRef.current.process(landmarks);
|
| 148 |
headSignal = headTrackerRef.current.process(landmarks);
|
| 149 |
headDebugRef.current = headTrackerRef.current.debug;
|
|
|
|
| 196 |
|
| 197 |
const resetCalibration = useCallback(() => {
|
| 198 |
neutralLCPRef.current = null;
|
| 199 |
+
calibBufferRef.current = [];
|
| 200 |
smoothedRef.current = { MAR: 0, EAR: 0.3, BRI: -0.3, LCP: 0 };
|
| 201 |
gazeTrackerRef.current.reset();
|
| 202 |
headTrackerRef.current.reset();
|
|
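The calibration and smoothing logic in the hook above is small enough to model outside React. This is a Python sketch of the same two steps, averaging the first 30 raw LCP samples and then applying an exponential moving average with alpha 0.3. `LcpCalibrator` and `ema` are hypothetical names; the real code keeps this state in refs.

```python
from typing import Optional


class LcpCalibrator:
    """Collects raw LCP samples and locks the neutral value once enough have arrived."""

    def __init__(self, frames_needed: int = 30) -> None:
        self.frames_needed = frames_needed
        self.buffer: list[float] = []
        self.neutral: Optional[float] = None

    def feed(self, raw_lcp: float) -> Optional[float]:
        """Return the neutral LCP, or None while calibration is still running."""
        if self.neutral is None:
            self.buffer.append(raw_lcp)
            if len(self.buffer) >= self.frames_needed:
                self.neutral = sum(self.buffer) / len(self.buffer)
                self.buffer = []
        return self.neutral

    def reset(self) -> None:
        self.buffer = []
        self.neutral = None


def ema(prev: float, raw: float, alpha: float = 0.3) -> float:
    """One exponential-moving-average step, as applied per affect feature."""
    return alpha * raw + (1 - alpha) * prev


if __name__ == "__main__":
    cal = LcpCalibrator()
    for frame in range(30):
        neutral = cal.feed(0.3 if frame == 0 else 0.0)  # one smiling frame
    print(neutral)  # the outlier is diluted by the other 29 frames
```

With a single-frame snapshot the smiling frame would have become the baseline outright; averaging reduces its weight to one thirtieth.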
@@ -0,0 +1,108 @@
|
|
| 1 |
+
// Default air-writing template bank.
|
| 2 |
+
// Each template is a normalised 32-point [x, y] trajectory (coords in [0, 1]).
|
| 3 |
+
// Matched against live trajectories via DTW in AirWriter.recognise.
|
| 4 |
+
// To add a new template: pick a distinctive *single-stroke* shape,
|
| 5 |
+
// sample ~32 evenly-spaced points from stroke start → end, normalise
|
| 6 |
+
// x/y into [0, 1], and add an entry to DEFAULT_AIR_TEMPLATES.
|
| 7 |
+
//
|
| 8 |
+
// DTW quality tips:
|
| 9 |
+
// - Stick to single-stroke shapes. Multi-stroke shapes (like an X) look
|
| 10 |
+
// like a teleport to DTW and will mis-match.
|
| 11 |
+
// - Shapes should be distinctive in direction and extent — a small
|
| 12 |
+
// check-mark and a big slash look similar after normalisation.
|
| 13 |
+
|
| 14 |
+
function linear(from: [number, number], to: [number, number], n: number): [number, number][] {
|
| 15 |
+
const out: [number, number][] = [];
|
| 16 |
+
for (let i = 0; i < n; i++) {
|
| 17 |
+
const t = i / (n - 1);
|
| 18 |
+
out.push([from[0] + t * (to[0] - from[0]), from[1] + t * (to[1] - from[1])]);
|
| 19 |
+
}
|
| 20 |
+
return out;
|
| 21 |
+
}
|
| 22 |
+
|
| 23 |
+
function concat(...segs: [number, number][][]): [number, number][] {
|
| 24 |
+
const out: [number, number][] = [];
|
| 25 |
+
for (const s of segs) out.push(...s);
|
| 26 |
+
return resample(out, 32);
|
| 27 |
+
}
|
| 28 |
+
|
| 29 |
+
function resample(pts: [number, number][], n: number): [number, number][] {
|
| 30 |
+
if (pts.length < 2) return pts;
|
| 31 |
+
const out: [number, number][] = [];
|
| 32 |
+
for (let i = 0; i < n; i++) {
|
| 33 |
+
const t = (i / (n - 1)) * (pts.length - 1);
|
| 34 |
+
const lo = Math.floor(t);
|
| 35 |
+
const hi = Math.min(lo + 1, pts.length - 1);
|
| 36 |
+
const frac = t - lo;
|
| 37 |
+
out.push([
|
| 38 |
+
pts[lo][0] + frac * (pts[hi][0] - pts[lo][0]),
|
| 39 |
+
pts[lo][1] + frac * (pts[hi][1] - pts[lo][1]),
|
| 40 |
+
]);
|
| 41 |
+
}
|
| 42 |
+
return out;
|
| 43 |
+
}
|
| 44 |
+
|
| 45 |
+
// check-mark: short down-right, then long up-right → affirmation
|
| 46 |
+
const YES: [number, number][] = concat(
|
| 47 |
+
linear([0.0, 0.5], [0.35, 1.0], 12),
|
| 48 |
+
linear([0.35, 1.0], [1.0, 0.0], 20)
|
| 49 |
+
);
|
| 50 |
+
|
| 51 |
+
// question-mark: curve over the top, then down to the dot → clarifying
|
| 52 |
+
const QUESTION: [number, number][] = concat(
|
| 53 |
+
linear([0.1, 0.25], [0.5, 0.0], 8),
|
| 54 |
+
linear([0.5, 0.0], [0.9, 0.25], 8),
|
| 55 |
+
linear([0.9, 0.25], [0.5, 0.55], 8),
|
| 56 |
+
linear([0.5, 0.55], [0.5, 1.0], 8)
|
| 57 |
+
);
|
| 58 |
+
|
| 59 |
+
// zig-zag wave across the top → greeting
|
| 60 |
+
const HI: [number, number][] = concat(
|
| 61 |
+
linear([0.0, 0.0], [0.25, 1.0], 8),
|
| 62 |
+
linear([0.25, 1.0], [0.5, 0.0], 8),
|
| 63 |
+
linear([0.5, 0.0], [0.75, 1.0], 8),
|
| 64 |
+
linear([0.75, 1.0], [1.0, 0.0], 8)
|
| 65 |
+
);
|
| 66 |
+
|
| 67 |
+
// straight vertical line bottom→top → "help" (raise hand / SOS mental model)
|
| 68 |
+
const HELP: [number, number][] = linear([0.5, 1.0], [0.5, 0.0], 32);
|
| 69 |
+
|
| 70 |
+
// horizontal line left→right → "done" (close / finish)
|
| 71 |
+
const DONE: [number, number][] = linear([0.0, 0.5], [1.0, 0.5], 32);
|
| 72 |
+
|
| 73 |
+
// plus-sign-ish as a single stroke: long down, backtrack up, then across → "more"
|
| 74 |
+
// mimics drawing "+" as one continuous stroke (down, back, right)
|
| 75 |
+
const MORE: [number, number][] = concat(
|
| 76 |
+
linear([0.5, 0.0], [0.5, 1.0], 12),
|
| 77 |
+
linear([0.5, 1.0], [0.5, 0.5], 6),
|
| 78 |
+
linear([0.5, 0.5], [1.0, 0.5], 14)
|
| 79 |
+
);
|
| 80 |
+
|
| 81 |
+
// double wave (down-up-down-up, drawn as straight segments) → "water" (fluid/ocean mental model)
|
| 82 |
+
const WATER: [number, number][] = concat(
|
| 83 |
+
linear([0.0, 0.5], [0.2, 0.9], 6),
|
| 84 |
+
linear([0.2, 0.9], [0.4, 0.1], 8),
|
| 85 |
+
linear([0.4, 0.1], [0.6, 0.9], 8),
|
| 86 |
+
linear([0.6, 0.9], [0.8, 0.1], 8),
|
| 87 |
+
linear([0.8, 0.1], [1.0, 0.5], 2)
|
| 88 |
+
);
|
| 89 |
+
|
| 90 |
+
// square/box (traced as one stroke) → "stop"
|
| 91 |
+
// start top-left, go right, down, left, up — closing the box
|
| 92 |
+
const STOP: [number, number][] = concat(
|
| 93 |
+
linear([0.0, 0.0], [1.0, 0.0], 8),
|
| 94 |
+
linear([1.0, 0.0], [1.0, 1.0], 8),
|
| 95 |
+
linear([1.0, 1.0], [0.0, 1.0], 8),
|
| 96 |
+
linear([0.0, 1.0], [0.0, 0.0], 8)
|
| 97 |
+
);
|
| 98 |
+
|
| 99 |
+
export const DEFAULT_AIR_TEMPLATES: Map<string, [number, number][]> = new Map([
|
| 100 |
+
["yes", YES],
|
| 101 |
+
["?", QUESTION],
|
| 102 |
+
["hi", HI],
|
| 103 |
+
["help", HELP],
|
| 104 |
+
["done", DONE],
|
| 105 |
+
["more", MORE],
|
| 106 |
+
["water", WATER],
|
| 107 |
+
["stop", STOP],
|
| 108 |
+
]);
|
|
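The header comment says templates are matched against live trajectories via DTW, but the distance function itself is outside this diff. The sketch below uses the textbook dynamic-time-warping recurrence with Euclidean point cost, which may differ from what `AirWriter` does. It assumes trajectories are already normalised into [0, 1], and it reproduces only the `help` and `done` templates. The 8.0 threshold is taken from the `recognise` change elsewhere in this commit.

```python
import math

Point = tuple[float, float]


def linear(start: Point, end: Point, n: int) -> list[Point]:
    """n evenly spaced points from start to end, inclusive."""
    return [
        (start[0] + (i / (n - 1)) * (end[0] - start[0]),
         start[1] + (i / (n - 1)) * (end[1] - start[1]))
        for i in range(n)
    ]


def dtw_distance(a: list[Point], b: list[Point]) -> float:
    """Classic O(len(a) * len(b)) DTW: summed point distances along the best warp path."""
    cost = [[math.inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


TEMPLATES = {
    "help": linear((0.5, 1.0), (0.5, 0.0), 32),  # vertical, bottom to top
    "done": linear((0.0, 0.5), (1.0, 0.5), 32),  # horizontal, left to right
}


def recognise(trajectory: list[Point], threshold: float = 8.0):
    """Return the closest template label, or None if too short or too far from all."""
    if len(trajectory) < 5:
        return None
    best_label, best_dist = None, math.inf
    for label, template in TEMPLATES.items():
        dist = dtw_distance(trajectory, template)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= threshold else None


if __name__ == "__main__":
    wobbly_upstroke = linear((0.52, 0.98), (0.48, 0.02), 32)
    print(recognise(wobbly_upstroke))
```

Because DTW sums per-point costs, the raw distance grows with trajectory length, so a fixed threshold is only meaningful when trajectories are resampled to a fixed number of points, as the 32-point templates suggest.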
@@ -11,12 +11,16 @@ interface AffectVector {
|
|
| 11 |
|
| 12 |
export function classifyAffect(v: AffectVector): Affect {
|
| 13 |
// BRI is relative (browMid.y - eyeCenter.y) / interOcular — more negative = brows raised higher
|
| 14 |
-
// LCP is
|
|
|
|
| 15 |
// MAR is absolute ratio — higher = mouth more open
|
| 16 |
-
// EAR is absolute ratio — lower = eyes more closed
|
| 17 |
if (v.BRI < -0.35 && v.MAR > 0.4) return "SURPRISED";
|
| 18 |
-
|
| 19 |
-
if (v.LCP
|
|
|
|
|
|
|
|
|
|
| 20 |
return "NEUTRAL";
|
| 21 |
}
|
| 22 |
|
|
@@ -55,8 +59,14 @@ export function computeAffectVector(
|
|
| 55 |
// Raising brows moves them toward y=0, making this value more negative.
|
| 56 |
const BRI = (browMid.y - eyeCenter.y) / (interOcular + 1e-6);
|
| 57 |
|
| 58 |
-
|
| 59 |
-
|
| 60 |
|
| 61 |
return { MAR, EAR, BRI, LCP };
|
| 62 |
}
|
|
@@ -524,7 +534,13 @@ export class AirWriter {
|
|
| 524 |
}
|
| 525 |
|
| 526 |
private recognise(trajectory: [number, number][]): string | null {
|
| 527 |
-
if (trajectory.length < 5
|
| 528 |
const query = normaliseTrajectory(trajectory);
|
| 529 |
let bestChar: string | null = null;
|
| 530 |
let bestDist = Infinity;
|
|
@@ -535,6 +551,15 @@ export class AirWriter {
|
|
| 535 |
bestChar = char;
|
| 536 |
}
|
| 537 |
}
|
| 538 |
return bestChar;
|
| 539 |
}
|
| 540 |
|
|
|
|
| 11 |
|
| 12 |
export function classifyAffect(v: AffectVector): Affect {
|
| 13 |
// BRI is relative (browMid.y - eyeCenter.y) / interOcular — more negative = brows raised higher
|
| 14 |
+
// LCP is vertical offset of lip corners from mouth center, normalised by inter-ocular,
|
| 15 |
+
// relative to calibrated neutral — positive = corners pulled UP (smile), negative = DOWN (frown)
|
| 16 |
// MAR is absolute ratio — higher = mouth more open
|
| 17 |
+
// EAR is absolute ratio — lower = eyes more closed / squinting
|
| 18 |
if (v.BRI < -0.35 && v.MAR > 0.4) return "SURPRISED";
|
| 19 |
+
// FRUSTRATED: a clear frown, OR brows lowered + squinting — either signals displeasure
|
| 20 |
+
if (v.LCP < -0.015) return "FRUSTRATED";
|
| 21 |
+
if (v.BRI > -0.2 && v.EAR < 0.18) return "FRUSTRATED";
|
| 22 |
+
// HAPPY: meaningful upward pull of lip corners (tighter than the old 0.005)
|
| 23 |
+
if (v.LCP > 0.015) return "HAPPY";
|
| 24 |
return "NEUTRAL";
|
| 25 |
}
|
| 26 |
|
|
|
|
| 59 |
// Raising brows moves them toward y=0, making this value more negative.
|
| 60 |
const BRI = (browMid.y - eyeCenter.y) / (interOcular + 1e-6);
|
| 61 |
|
| 62 |
+
// Lip-corner pull: average y of the two corners vs. mouth vertical centre,
|
| 63 |
+
// normalised by inter-ocular distance, relative to calibrated neutral.
|
| 64 |
+
// MediaPipe y increases downward, so when the corners rise above the mouth centre, cornerAvgY < mouthCentreY;
|
| 65 |
+
// taking mouthCentreY - cornerAvgY therefore makes a smile positive. Subtracting the calibrated neutral removes per-face bias.
|
| 66 |
+
const mouthCentreY = (landmarks[MOUTH_TOP].y + landmarks[MOUTH_BOTTOM].y) / 2;
|
| 67 |
+
const cornerAvgY = (landmarks[CORNER_LEFT].y + landmarks[CORNER_RIGHT].y) / 2;
|
| 68 |
+
const rawLCP = (mouthCentreY - cornerAvgY) / (interOcular + 1e-6);
|
| 69 |
+
const LCP = rawLCP - neutralLCP;
|
| 70 |
|
| 71 |
return { MAR, EAR, BRI, LCP };
|
| 72 |
}
|
|
|
|
| 534 |
}
|
| 535 |
|
| 536 |
private recognise(trajectory: [number, number][]): string | null {
|
| 537 |
+
if (trajectory.length < 5) {
|
| 538 |
+
return null;
|
| 539 |
+
}
|
| 540 |
+
if (this.templates.size === 0) {
|
| 541 |
+
console.debug("[AirWriter] stroke completed but template bank is empty");
|
| 542 |
+
return null;
|
| 543 |
+
}
|
| 544 |
const query = normaliseTrajectory(trajectory);
|
| 545 |
let bestChar: string | null = null;
|
| 546 |
let bestDist = Infinity;
|
|
|
|
| 551 |
bestChar = char;
|
| 552 |
}
|
| 553 |
}
|
| 554 |
+
// Reject poor matches so we don't pass garbage to the LLM.
|
| 555 |
+
// Threshold is empirical — tune once real users test this.
|
| 556 |
+
const MATCH_THRESHOLD = 8.0;
|
| 557 |
+
if (bestDist > MATCH_THRESHOLD) {
|
| 558 |
+
console.debug(
|
| 559 |
+
`[AirWriter] no template matched (best='${bestChar}', dist=${bestDist.toFixed(2)})`
|
| 560 |
+
);
|
| 561 |
+
return null;
|
| 562 |
+
}
|
| 563 |
return bestChar;
|
| 564 |
}
|
| 565 |
|
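The rewritten LCP formula and the retuned classifier can be checked with plain numbers. This is a Python port of the two TypeScript functions, offered as a sketch: the function names and the scalar argument lists are hypothetical (the original reads landmark objects), while the formula and thresholds are copied from the diff.

```python
def lip_corner_pull(
    mouth_top_y: float,
    mouth_bottom_y: float,
    corner_left_y: float,
    corner_right_y: float,
    inter_ocular: float,
    neutral_lcp: float,
) -> float:
    """Vertical lip-corner pull relative to neutral; positive means corners pulled up."""
    mouth_centre_y = (mouth_top_y + mouth_bottom_y) / 2
    corner_avg_y = (corner_left_y + corner_right_y) / 2
    # Image y grows downward, so raised corners have the smaller y value.
    raw = (mouth_centre_y - corner_avg_y) / (inter_ocular + 1e-6)
    return raw - neutral_lcp


def classify_affect(mar: float, ear: float, bri: float, lcp: float) -> str:
    """Rule order matters: the first matching rule wins."""
    if bri < -0.35 and mar > 0.4:
        return "SURPRISED"
    if lcp < -0.015:  # clear frown
        return "FRUSTRATED"
    if bri > -0.2 and ear < 0.18:  # brows lowered and squinting
        return "FRUSTRATED"
    if lcp > 0.015:
        return "HAPPY"
    return "NEUTRAL"


if __name__ == "__main__":
    lcp = lip_corner_pull(0.60, 0.64, 0.61, 0.61, inter_ocular=0.1, neutral_lcp=0.0)
    print(round(lcp, 3), classify_affect(mar=0.1, ear=0.3, bri=-0.3, lcp=lcp))
```

One consequence of the rule order is that the squint trigger is evaluated before HAPPY, so a face with lowered brows, narrowed eyes and raised lip corners is labelled FRUSTRATED. Whether that is intended is not stated in the commit.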