shwetangisingh committed on
Commit
978ca55
·
1 Parent(s): 5187368

Modified UI + Readme Update

README.md CHANGED
@@ -187,106 +187,7 @@ To add a new persona, edit `data/generate_users.py` and re-run `python -m backen
187
 
188
  ## TODO
189
 
190
- <<<<<<< Updated upstream
191
  From the spec (pages 10–11). Tags: **[Core]** = must do, **[Bonus]** = nice to have, **[Eval]** = for the grade.
192
- =======
193
- Roadmap derived from the project spec (pages 10–11). Items are grouped by spec
194
- area and marked with priority. Bracketed tags map back to the spec:
195
- **[Core]** = required deliverable, **[Bonus]** = stretch goal, **[Eval]** = validation.
196
-
197
- > **Note on sensing:** all camera capture and signal classification happens in
198
- > the **frontend** (MediaPipe JS). The backend only consumes pre-classified
199
- > labels (`affect`, `gesture_tag`, `gaze_bucket`).
200
-
201
- ### Dataset
202
-
203
- - [ ] **[Core]** Add **heterogeneous** memory types per persona — currently only
204
- autobiographical narratives exist.
205
- - [ ] Add a set of synthetic social-media posts per persona (voice-matched)
206
- - [ ] Add a set of synthetic past communication logs per persona
207
- - [ ] Regenerate the synthesis script to produce both, then rebuild embeddings
208
- - [ ] Make ingestion type-aware so the retriever knows which chunk-type a hit came from
209
- - [ ] **[Core]** Document the dataset schema so it is reusable by the evaluation harness.
210
-
211
- ### Multimodal Sensing (frontend)
212
-
213
- - [ ] **[Core]** Detect **head-nod / sharp tilt as dissatisfaction**, distinct
214
- from a generic frustrated affect read.
215
- - [ ] Send a `dissatisfaction_signal` to the backend alongside the existing labels
216
- - [ ] When the signal fires, branch the planner to a **"Turnaround Option"** —
217
- a clarification candidate ("Did you mean X or Y?") instead of a plain answer
218
- - [ ] **[Bonus]** Add **vocalisation capture** (Web Speech API) and a
219
- **conflict-resolution** step that compares the spoken intent against the
220
- air-written intent, sending a single `resolved_intent` to the backend.
221
- - [ ] **[Polish]** Tighten the **thumbs-up boost** — today it only annotates the
222
- prompt. The retriever should also bias affirmative-leaning candidates when
223
- a thumbs-up is present.
224
-
225
- ### Agentic Intent Decomposition
226
-
227
- > **Current state:** intent routing is **keyword-based**, not LLM-based.
228
- > The original LLM-driven router (Pydantic-validated JSON output) was
229
- > dropped because `gemma4:31b-cloud` consistently emitted the wrong JSON
230
- > shape and got truncated by `max_tokens`, triggering 3 retries + a
231
- > hard-fallback on every turn — adding ~30s of dead latency before the
232
- > generation call. The keyword router (~5 buckets matched against
233
- > hardcoded word lists in `intent.py`) handles the demo personas
234
- > reliably and adds ~0ms per turn.
235
- >
236
- > **Trade-off:** the router is limited to the 5 hardcoded buckets
237
- > (`family`, `medical`, `hobbies`, `daily_routine`, `social`) and can't
238
- > distinguish `OPEN_DOMAIN` from `PERSONAL` queries. Acceptable today
239
- > because all current personas only have personal memories.
240
-
241
- - [ ] **[Core]** Make Personal / Contextual / Open-domain routing actually hit
242
- **different retrieval pools** — today all sub-queries fall back to the same
243
- vector index. Requires re-introducing some form of intent classification
244
- (likely a constrained-output LLM call once `response_format=json_schema`
245
- is supported on Ollama Cloud, or a tiny local classifier).
246
- - [ ] **[Perf]** When/if we re-add LLM intent: cache the schema prompt,
247
- use a smaller routing model, and parallelise sub-query retrieval.
248
-
249
- ### Retrieval
250
-
251
- - [ ] **[Bonus]** Persist **bucket priors** per user across conversations
252
- (currently per-session only).
253
- - [ ] **[Bonus]** Extend the **latency-optimised fallback** beyond a single
254
- LLM-tier switch:
255
- - [ ] Return a cached canned response when end-to-end latency blows the budget
256
- - [ ] Use the spec's **< 6s end-to-end** target instead of the current 3.5s threshold
257
- - [ ] **[Scale]** When per-user memory grows past ~100k chunks, swap the
258
- torch-tensor matmul search for `hnswlib` (a ~2 MB approximate-NN library);
259
- reintroduce a cross-encoder reranker once `top_k > ~30`.
260
-
261
- ### Training-Free Response Generation
262
-
263
- - [ ] **[Core]** Return **multiple candidate responses** from the API so the
264
- user can pick one (today the endpoint returns a single string).
265
- - [ ] **[Bonus]** On user selection, upsert the `(query, selected_response)` pair
266
- into a small "accepted-pairs" index and consult it as a high-prior shortcut
267
- on the next turn — the spec's lightweight retrieval-index update.
268
-
269
- ### Evaluation & Validation
270
-
271
- - [ ] **[Eval]** **Factual Faithfulness** — NLI-based groundedness metric over
272
- (retrieved evidence, generated response) pairs, reported as a hallucination
273
- rate on a held-out set of partner-style queries per persona.
274
- - [ ] **[Eval]** **Communication Efficiency** — p50 / p95 end-to-end latency
275
- across all three LLM tiers, with a pass/fail gate at the spec target of
276
- **< 6s p95**.
277
- - [ ] **[Eval]** **Perceived Authenticity** — generate paired (persona, query,
278
- response) samples and a 5-point Likert rating sheet for the live in-class eval.
279
- - [ ] **[Eval]** **Multimodal Alignment** — synthetic (gesture, query) scenarios
280
- checked against expected response traits (e.g. thumbs-up ⇒ affirmative
281
- lexicon present), reported as alignment accuracy.
282
-
283
- ### Polish
284
-
285
- - [ ] **[Polish]** Move the hard-coded affect→tone and persona-override dicts
286
- into a single YAML so tone-shaping can be tuned without touching code.
287
- - [x] **[Polish]** Delete the unused `backend/sensing/` Python modules now that
288
- sensing lives entirely in the frontend. *(Done — only `labels.py` remains.)*
289
- >>>>>>> Stashed changes
290
 
291
  Heads up: all camera/sensing stuff is in the frontend (MediaPipe JS). Backend just gets the labels (`affect`, `gesture_tag`, `gaze_bucket`). The `backend/sensing/` python modules are dead code.
292
 
@@ -311,6 +212,8 @@ Heads up: all camera/sensing stuff is in the frontend (MediaPipe JS). Backend ju
311
 
312
  ### Intent decomposition
313
 
 
 
314
  - [ ] **[Core]** Personal / Contextual / Open-domain all hit the same FAISS index right now. Make them actually go different places — open-domain → web search (or stub), contextual → session memory
315
  - [ ] intent node is slow. Cache the prompt, use a tiny model for routing, parallelise the sub-queries
316
 
@@ -321,6 +224,7 @@ Heads up: all camera/sensing stuff is in the frontend (MediaPipe JS). Backend ju
321
  - drop reranker if retrieval is slow
322
  - return a canned response if we blow the budget entirely
323
  - threshold is 3.5s, spec says 6s — pick one
 
324
 
325
  ### Generation
326
 
@@ -349,7 +253,7 @@ Live per-turn scores show up in the `EvalPanel`. State:
349
  ### Cleanup
350
 
351
  - [ ] move the affect→tone / persona override dicts out of code into a yaml
352
- - [ ] delete `backend/sensing/` (dead code, sensing is in frontend)
353
 
354
  ---
355
 
 
187
 
188
  ## TODO
189
 
 
190
  From the spec (pages 10–11). Tags: **[Core]** = must do, **[Bonus]** = nice to have, **[Eval]** = for the grade.
191
 
192
  Heads up: all camera/sensing stuff is in the frontend (MediaPipe JS). Backend just gets the labels (`affect`, `gesture_tag`, `gaze_bucket`). The `backend/sensing/` python modules are dead code.
193
 
 
212
 
213
  ### Intent decomposition
214
 
215
+ > Current state: routing is keyword-based, not LLM-based. The original LLM router (Pydantic-validated JSON) kept emitting the wrong shape with `gemma4:31b-cloud` and hitting the `max_tokens` truncation — 3 retries + hard fallback on every turn, ~30s of dead latency before generation. The keyword router (5 buckets matched against word lists in `intent.py`) handles the demo personas and adds ~0ms. Trade-off: stuck with the 5 hardcoded buckets (`family`, `medical`, `hobbies`, `daily_routine`, `social`) and can't tell `OPEN_DOMAIN` from `PERSONAL`. Fine for now since all personas only have personal memories. Revisit when Ollama Cloud ships `response_format=json_schema` or we add a tiny local classifier.
216
+
217
  - [ ] **[Core]** Personal / Contextual / Open-domain all hit the same FAISS index right now. Make them actually go different places — open-domain → web search (or stub), contextual → session memory
218
  - [ ] intent node is slow. Cache the prompt, use a tiny model for routing, parallelise the sub-queries
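
For reference, the keyword routing described in the note above can be sketched roughly as follows. This is a minimal illustration, not the contents of `intent.py`: the word lists, the `BUCKET_KEYWORDS` name, and the `route` function are assumptions made up for this example; only the five bucket names come from the README.

```python
import re
from typing import Dict, Optional, Set

# Hypothetical word lists. The real lists live in intent.py and are not
# shown in this diff; only the five bucket names are taken from the README.
BUCKET_KEYWORDS: Dict[str, Set[str]] = {
    "family": {"mom", "dad", "sister", "brother", "daughter", "son", "family"},
    "medical": {"doctor", "medication", "pain", "hospital", "therapy"},
    "hobbies": {"garden", "painting", "music", "fishing", "chess"},
    "daily_routine": {"breakfast", "morning", "bedtime", "schedule", "routine"},
    "social": {"friend", "party", "neighbor", "visit", "club"},
}


def route(query: str) -> Optional[str]:
    """Return the bucket with the most keyword hits, or None if nothing matches."""
    tokens = set(re.findall(r"[a-z']+", query.lower()))
    best_bucket: Optional[str] = None
    best_hits = 0
    for bucket, words in BUCKET_KEYWORDS.items():
        hits = len(tokens & words)  # count distinct keywords present in the query
        if hits > best_hits:
            best_bucket, best_hits = bucket, hits
    return best_bucket
```

A `None` result shows the trade-off mentioned in the note: a router like this cannot tell an open-domain question from a personal one that simply misses every word list, so both would fall through to the same default retrieval path.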
219
 
 
224
  - drop reranker if retrieval is slow
225
  - return a canned response if we blow the budget entirely
226
  - threshold is 3.5s, spec says 6s — pick one
227
+ - [ ] **[Scale]** past ~100k chunks per user, swap torch matmul for `hnswlib`; add a reranker if top_k grows past ~30
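
The fallback bullets above (drop the reranker when retrieval is slow, return a canned response when the budget is blown) could look something like the sketch below. Everything here is hypothetical: `answer`, `CANNED_RESPONSE`, and the `retrieve` / `rerank` / `generate` callables are placeholders for whatever the pipeline actually exposes, and the 3.5s and 6s values are the two thresholds the TODO says still need reconciling.

```python
import time
from typing import Callable, List

CANNED_RESPONSE = "Give me a moment, please."  # hypothetical fallback text


def answer(
    query: str,
    retrieve: Callable[[str], List[str]],
    rerank: Callable[[str, List[str]], List[str]],
    generate: Callable[[str, List[str]], str],
    soft_s: float = 3.5,   # current threshold mentioned in the TODO
    hard_s: float = 6.0,   # end-to-end target from the spec
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Degrade step by step as the elapsed time eats into the budget."""
    start = clock()
    hits = retrieve(query)

    # Only spend time on reranking while there is still headroom.
    if clock() - start < soft_s:
        hits = rerank(query, hits)

    # If the budget is already gone before generation, give up early.
    if clock() - start >= hard_s:
        return CANNED_RESPONSE

    return generate(query, hits)
```

Note that this only checks the budget before the generation call. Since generation is usually the slowest step, a real version would likely also need a timeout around `generate` itself; that part is left out here.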
228
 
229
  ### Generation
230
 
 
253
  ### Cleanup
254
 
255
  - [ ] move the affect→tone / persona override dicts out of code into a yaml
256
+ - [x] delete `backend/sensing/` (dead code, sensing is in frontend) — done, only `labels.py` remains
257
 
258
  ---
259
 
frontend/index.html CHANGED
@@ -4,7 +4,7 @@
4
  <meta charset="UTF-8" />
5
  <link rel="icon" type="image/svg+xml" href="/favicon.svg" />
6
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
7
- <title>frontend</title>
8
  </head>
9
  <body>
10
  <div id="root"></div>
 
4
  <meta charset="UTF-8" />
5
  <link rel="icon" type="image/svg+xml" href="/favicon.svg" />
6
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
7
+ <title>AAC Chatbot</title>
8
  </head>
9
  <body>
10
  <div id="root"></div>
frontend/src/App.css CHANGED
@@ -5,9 +5,9 @@
5
  }
6
 
7
  body {
8
- font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif;
9
- background: #0f1117;
10
- color: #e0e0e0;
11
  }
12
 
13
  .app-layout {
@@ -15,29 +15,29 @@ body {
15
  height: 100vh;
16
  }
17
 
18
- /* ── Sidebar ──────────────────────────────────────────────────────────── */
19
-
20
  .sidebar {
21
  width: 320px;
22
- background: #1a1d27;
23
- padding: 20px;
24
  display: flex;
25
  flex-direction: column;
26
- gap: 16px;
27
  overflow-y: auto;
28
- border-right: 1px solid #2a2d37;
29
  }
30
 
31
  .app-title {
32
- font-size: 20px;
33
- font-weight: 600;
34
- color: #fff;
 
 
35
  }
36
 
37
  .sidebar-section {
38
  display: flex;
39
  flex-direction: column;
40
- gap: 8px;
41
  }
42
 
43
  .toggle-label {
@@ -46,67 +46,88 @@ body {
46
  gap: 8px;
47
  cursor: pointer;
48
  font-size: 14px;
 
 
 
49
  }
50
 
51
- /* ── Forms ─────────────────────────────────────────────────────────── */
 
 
 
 
 
52
 
53
  .persona-selector {
54
  display: flex;
55
  flex-direction: column;
56
- gap: 4px;
57
  }
58
 
59
  label {
60
  font-size: 12px;
61
- color: #888;
62
- text-transform: uppercase;
63
- letter-spacing: 0.5px;
 
64
  }
65
 
66
- select, input[type="text"] {
67
- background: #252830;
68
- color: #e0e0e0;
69
- border: 1px solid #3a3d47;
70
- border-radius: 6px;
71
- padding: 8px 10px;
 
72
  font-size: 14px;
 
73
  outline: none;
 
74
  }
75
 
76
- select:focus, input[type="text"]:focus {
77
- border-color: #5b8def;
 
 
78
  }
79
 
80
- /* ── Webcam ────────────────────────────────────────────────────────── */
 
 
 
81
 
82
  .webcam-container {
83
- border-radius: 8px;
84
  overflow: hidden;
85
- background: #252830;
 
86
  }
87
 
88
- .webcam-placeholder, .webcam-error {
 
89
  padding: 24px;
90
  text-align: center;
91
- color: #666;
92
  font-size: 13px;
93
  }
94
 
95
  .webcam-error {
96
- color: #e55;
97
  }
98
 
99
- /* ── Sensing status ───────────────────────────────────────────────── */
100
-
101
  .sensing-off {
102
- color: #666;
103
  font-size: 13px;
104
  }
105
 
106
  .sensing-status {
107
  display: flex;
108
  flex-direction: column;
109
- gap: 4px;
 
 
 
 
110
  }
111
 
112
  .sensing-row {
@@ -116,21 +137,19 @@ select:focus, input[type="text"]:focus {
116
  }
117
 
118
  .sensing-label {
119
- color: #888;
120
  }
121
 
122
  .sensing-value {
123
- color: #ccc;
124
  font-weight: 500;
125
  }
126
 
127
- /* ── Shared metric primitives (latency, eval) ─────────────────────── */
128
-
129
  .section-title {
130
- font-size: 12px;
131
- color: #888;
132
  text-transform: uppercase;
133
- letter-spacing: 0.5px;
134
  margin-bottom: 6px;
135
  font-weight: 600;
136
  }
@@ -140,47 +159,46 @@ select:focus, input[type="text"]:focus {
140
  justify-content: space-between;
141
  font-size: 13px;
142
  padding: 2px 0;
143
- color: #aaa;
144
  }
145
 
146
  .metric-row.sub {
147
  padding-left: 12px;
148
  font-size: 11px;
149
- color: #777;
150
  }
151
 
152
  .metric-label {
153
- color: #888;
154
  }
155
 
156
  .metric-value {
157
- color: #ccc;
158
- font-family: monospace;
 
159
  }
160
 
161
  .metric-value.pass {
162
- color: #4caf50;
163
  }
164
 
165
  .metric-value.fail {
166
- color: #f44336;
167
  }
168
 
169
  .no-metrics {
170
- color: #555;
171
  font-size: 13px;
 
172
  }
173
 
174
- /* ── Main content ─────────────────────────────────────────────────── */
175
-
176
  .main-content {
177
  flex: 1;
178
  display: flex;
179
  flex-direction: column;
 
180
  }
181
 
182
- /* ── Chat panel ───────────────────────────────────────────────────── */
183
-
184
  .chat-panel {
185
  display: flex;
186
  flex-direction: column;
@@ -188,53 +206,96 @@ select:focus, input[type="text"]:focus {
188
  }
189
 
190
  .chat-header {
191
- padding: 16px 24px;
192
- font-size: 16px;
193
- font-weight: 600;
194
- border-bottom: 1px solid #2a2d37;
195
- background: #1a1d27;
 
 
 
196
  }
197
 
198
  .chat-messages {
199
  flex: 1;
200
  overflow-y: auto;
201
- padding: 20px 24px;
202
  display: flex;
203
  flex-direction: column;
204
- gap: 12px;
205
  }
206
 
207
  .chat-bubble {
208
- max-width: 75%;
209
- padding: 10px 14px;
210
- border-radius: 12px;
211
- font-size: 14px;
212
- line-height: 1.5;
213
  }
214
 
215
  .chat-bubble.partner {
216
  align-self: flex-end;
217
- background: #2a4a8a;
218
- border-bottom-right-radius: 4px;
 
 
 
 
 
219
  }
220
 
221
  .chat-bubble.aac_user {
222
  align-self: flex-start;
223
- background: #252830;
224
- border-bottom-left-radius: 4px;
 
 
225
  }
226
 
227
  .chat-bubble.loading {
228
- opacity: 0.6;
229
  }
230
 
231
  .chat-role {
232
  display: block;
233
  font-size: 11px;
234
- color: #888;
235
  margin-bottom: 4px;
236
  text-transform: uppercase;
237
  letter-spacing: 0.5px;
 
238
  }
239
 
240
  .chat-bubble p {
@@ -243,53 +304,57 @@ select:focus, input[type="text"]:focus {
243
 
244
  .chat-input-row {
245
  display: flex;
246
- gap: 8px;
247
- padding: 16px 24px;
248
- border-top: 1px solid #2a2d37;
249
- background: #1a1d27;
250
  }
251
 
252
  .chat-input-row input {
253
  flex: 1;
 
 
254
  }
255
 
256
  .chat-input-row button {
257
- background: #5b8def;
258
- color: #fff;
259
  border: none;
260
- border-radius: 6px;
261
- padding: 8px 20px;
262
  font-size: 14px;
 
 
263
  cursor: pointer;
 
264
  }
265
 
266
  .chat-input-row button:disabled {
267
- opacity: 0.4;
268
  cursor: not-allowed;
269
  }
270
 
271
  .chat-input-row button:hover:not(:disabled) {
272
- background: #4a7cde;
273
  }
274
 
275
  .error {
276
- color: #e55;
277
  font-size: 13px;
278
  }
279
 
280
- /* ── Eval panel ──────────────────────────────────────────────────── */
281
-
282
  .eval-panel {
283
- margin-top: 8px;
284
- border-top: 1px solid #3a3d47;
285
- padding-top: 6px;
286
  }
287
 
288
  .eval-toggle {
289
  background: none;
290
  border: none;
291
- color: #888;
292
  font-size: 12px;
 
293
  cursor: pointer;
294
  padding: 2px 0;
295
  display: flex;
@@ -298,31 +363,32 @@ select:focus, input[type="text"]:focus {
298
  }
299
 
300
  .eval-toggle:hover {
301
- color: #bbb;
302
  }
303
 
304
  .slo-badge {
305
  font-size: 10px;
306
- padding: 1px 6px;
307
- border-radius: 3px;
308
  font-weight: 600;
 
309
  }
310
 
311
  .slo-badge.pass {
312
- background: #1b3a1b;
313
- color: #4caf50;
314
  }
315
 
316
  .slo-badge.fail {
317
- background: #3a1b1b;
318
- color: #f44336;
319
  }
320
 
321
  .eval-details {
322
  display: flex;
323
  flex-direction: column;
324
- gap: 10px;
325
- margin-top: 8px;
326
  }
327
 
328
  .eval-section {
@@ -333,25 +399,23 @@ select:focus, input[type="text"]:focus {
333
 
334
  .eval-na {
335
  font-size: 11px;
336
- color: #666;
337
  font-style: italic;
338
  }
339
 
340
  .score-bar {
341
  height: 4px;
342
- background: #2a2d37;
343
- border-radius: 2px;
344
  overflow: hidden;
345
  }
346
 
347
  .score-bar-fill {
348
  height: 100%;
349
- border-radius: 2px;
350
  transition: width 0.3s ease;
351
  }
352
 
353
- /* ── Star rating ─────────────────────────────────────────────────── */
354
-
355
  .star-rating {
356
  display: flex;
357
  align-items: center;
@@ -363,22 +427,40 @@ select:focus, input[type="text"]:focus {
363
  border: none;
364
  font-size: 18px;
365
  cursor: pointer;
366
- color: #3a3d47;
367
  padding: 0;
368
  line-height: 1;
369
  transition: color 0.15s;
370
  }
371
 
372
  .star-rating .star.active {
373
- color: #ff9800;
374
  }
375
 
376
  .star-rating .star:hover {
377
- color: #ffb74d;
378
  }
379
 
380
  .star-label {
381
  font-size: 11px;
382
- color: #888;
383
  margin-left: 6px;
384
  }

5
  }
6
 
7
  body {
8
+ font-family: var(--sans);
9
+ background: var(--bg);
10
+ color: var(--text);
11
  }
12
 
13
  .app-layout {
 
15
  height: 100vh;
16
  }
17
 
 
 
18
  .sidebar {
19
  width: 320px;
20
+ background: var(--sidebar-bg);
21
+ padding: 24px 20px;
22
  display: flex;
23
  flex-direction: column;
24
+ gap: 20px;
25
  overflow-y: auto;
26
+ border-right: 1px solid var(--border);
27
  }
28
 
29
  .app-title {
30
+ font-family: var(--heading);
31
+ font-size: 24px;
32
+ font-weight: 500;
33
+ letter-spacing: -0.3px;
34
+ color: var(--text);
35
  }
36
 
37
  .sidebar-section {
38
  display: flex;
39
  flex-direction: column;
40
+ gap: 10px;
41
  }
42
 
43
  .toggle-label {
 
46
  gap: 8px;
47
  cursor: pointer;
48
  font-size: 14px;
49
+ color: var(--text);
50
+ text-transform: none;
51
+ letter-spacing: normal;
52
  }
53
 
54
+ .toggle-label input[type="checkbox"] {
55
+ accent-color: var(--accent);
56
+ width: 15px;
57
+ height: 15px;
58
+ cursor: pointer;
59
+ }
60
 
61
  .persona-selector {
62
  display: flex;
63
  flex-direction: column;
64
+ gap: 6px;
65
  }
66
 
67
  label {
68
  font-size: 12px;
69
+ color: var(--text-muted);
70
+ font-weight: 500;
71
+ letter-spacing: normal;
72
+ text-transform: none;
73
  }
74
 
75
+ select,
76
+ input[type="text"] {
77
+ background: var(--surface);
78
+ color: var(--text);
79
+ border: 1px solid var(--border-strong);
80
+ border-radius: 10px;
81
+ padding: 9px 12px;
82
  font-size: 14px;
83
+ font-family: var(--sans);
84
  outline: none;
85
+ transition: border-color 0.15s, box-shadow 0.15s;
86
  }
87
 
88
+ select:focus,
89
+ input[type="text"]:focus {
90
+ border-color: var(--accent);
91
+ box-shadow: 0 0 0 3px var(--accent-soft);
92
  }
93
 
94
+ select:hover,
95
+ input[type="text"]:hover {
96
+ border-color: var(--text-faint);
97
+ }
98
 
99
  .webcam-container {
100
+ border-radius: 12px;
101
  overflow: hidden;
102
+ background: var(--surface);
103
+ border: 1px solid var(--border);
104
  }
105
 
106
+ .webcam-placeholder,
107
+ .webcam-error {
108
  padding: 24px;
109
  text-align: center;
110
+ color: var(--text-faint);
111
  font-size: 13px;
112
  }
113
 
114
  .webcam-error {
115
+ color: var(--fail);
116
  }
117
 
 
 
118
  .sensing-off {
119
+ color: var(--text-faint);
120
  font-size: 13px;
121
  }
122
 
123
  .sensing-status {
124
  display: flex;
125
  flex-direction: column;
126
+ gap: 6px;
127
+ padding: 10px 12px;
128
+ background: var(--surface);
129
+ border: 1px solid var(--border);
130
+ border-radius: 10px;
131
  }
132
 
133
  .sensing-row {
 
137
  }
138
 
139
  .sensing-label {
140
+ color: var(--text-muted);
141
  }
142
 
143
  .sensing-value {
144
+ color: var(--text);
145
  font-weight: 500;
146
  }
147
 
 
 
148
  .section-title {
149
+ font-size: 11px;
150
+ color: var(--text-muted);
151
  text-transform: uppercase;
152
+ letter-spacing: 0.6px;
153
  margin-bottom: 6px;
154
  font-weight: 600;
155
  }
 
159
  justify-content: space-between;
160
  font-size: 13px;
161
  padding: 2px 0;
162
+ color: var(--text);
163
  }
164
 
165
  .metric-row.sub {
166
  padding-left: 12px;
167
  font-size: 11px;
168
+ color: var(--text-muted);
169
  }
170
 
171
  .metric-label {
172
+ color: var(--text-muted);
173
  }
174
 
175
  .metric-value {
176
+ color: var(--text);
177
+ font-family: var(--mono);
178
+ font-size: 12px;
179
  }
180
 
181
  .metric-value.pass {
182
+ color: var(--pass);
183
  }
184
 
185
  .metric-value.fail {
186
+ color: var(--fail);
187
  }
188
 
189
  .no-metrics {
190
+ color: var(--text-faint);
191
  font-size: 13px;
192
+ font-style: italic;
193
  }
194
 
 
 
195
  .main-content {
196
  flex: 1;
197
  display: flex;
198
  flex-direction: column;
199
+ background: var(--bg);
200
  }
201
 
 
 
202
  .chat-panel {
203
  display: flex;
204
  flex-direction: column;
 
206
  }
207
 
208
  .chat-header {
209
+ padding: 18px 28px;
210
+ font-family: var(--heading);
211
+ font-size: 18px;
212
+ font-weight: 500;
213
+ letter-spacing: -0.2px;
214
+ color: var(--text);
215
+ border-bottom: 1px solid var(--border);
216
+ background: var(--bg);
217
  }
218
 
219
  .chat-messages {
220
  flex: 1;
221
  overflow-y: auto;
222
+ padding: 28px 32px;
223
  display: flex;
224
  flex-direction: column;
225
+ gap: 16px;
226
  }
227
 
228
  .chat-bubble {
229
+ max-width: 72%;
230
+ padding: 12px 16px;
231
+ border-radius: 16px;
232
+ font-size: 15px;
233
+ line-height: 1.55;
234
  }
235
 
236
  .chat-bubble.partner {
237
  align-self: flex-end;
238
+ background: var(--accent);
239
+ color: #ffffff;
240
+ border-bottom-right-radius: 6px;
241
+ }
242
+
243
+ .chat-bubble.partner .chat-role {
244
+ color: rgba(255, 255, 255, 0.85);
245
  }
246
 
247
  .chat-bubble.aac_user {
248
  align-self: flex-start;
249
+ background: var(--surface);
250
+ color: var(--text);
251
+ border: 1px solid var(--border);
252
+ border-bottom-left-radius: 6px;
253
  }
254
 
255
  .chat-bubble.loading {
256
+ background: var(--surface);
257
+ border: 1px solid var(--border);
258
+ }
259
+
260
+ .chat-bubble.loading p {
261
+ display: inline-block;
262
+ background: linear-gradient(
263
+ 90deg,
264
+ var(--text-muted) 0%,
265
+ var(--accent) 45%,
266
+ var(--accent) 55%,
267
+ var(--text-muted) 100%
268
+ );
269
+ background-size: 200% 100%;
270
+ -webkit-background-clip: text;
271
+ background-clip: text;
272
+ color: transparent;
273
+ animation: shimmer 1.8s ease-in-out infinite;
274
+ font-weight: 500;
275
+ }
276
+
277
+ @keyframes shimmer {
278
+ 0% { background-position: 200% 0; }
279
+ 100% { background-position: -200% 0; }
280
+ }
281
+
282
+ @media (prefers-reduced-motion: reduce) {
283
+ .chat-bubble.loading p {
284
+ animation: none;
285
+ background: none;
286
+ -webkit-text-fill-color: var(--text-muted);
287
+ color: var(--text-muted);
288
+ }
289
  }
290
 
291
  .chat-role {
292
  display: block;
293
  font-size: 11px;
294
+ color: var(--text-muted);
295
  margin-bottom: 4px;
296
  text-transform: uppercase;
297
  letter-spacing: 0.5px;
298
+ font-weight: 600;
299
  }
300
 
301
  .chat-bubble p {
 
304
 
305
  .chat-input-row {
306
  display: flex;
307
+ gap: 10px;
308
+ padding: 18px 28px;
309
+ border-top: 1px solid var(--border);
310
+ background: var(--bg);
311
  }
312
 
313
  .chat-input-row input {
314
  flex: 1;
315
+ padding: 11px 14px;
316
+ font-size: 15px;
317
  }
318
 
319
  .chat-input-row button {
320
+ background: var(--accent);
321
+ color: #ffffff;
322
  border: none;
323
+ border-radius: 10px;
324
+ padding: 10px 22px;
325
  font-size: 14px;
326
+ font-weight: 500;
327
+ font-family: var(--sans);
328
  cursor: pointer;
329
+ transition: background 0.15s;
330
  }
331
 
332
  .chat-input-row button:disabled {
333
+ opacity: 0.45;
334
  cursor: not-allowed;
335
  }
336
 
337
  .chat-input-row button:hover:not(:disabled) {
338
+ background: var(--accent-hover);
339
  }
340
 
341
  .error {
342
+ color: var(--fail);
343
  font-size: 13px;
344
  }
345
 
 
 
346
  .eval-panel {
347
+ margin-top: 10px;
348
+ border-top: 1px solid var(--border);
349
+ padding-top: 8px;
350
  }
351
 
352
  .eval-toggle {
353
  background: none;
354
  border: none;
355
+ color: var(--text-muted);
356
  font-size: 12px;
357
+ font-family: var(--sans);
358
  cursor: pointer;
359
  padding: 2px 0;
360
  display: flex;
 
363
  }
364
 
365
  .eval-toggle:hover {
366
+ color: var(--text);
367
  }
368
 
369
  .slo-badge {
370
  font-size: 10px;
371
+ padding: 2px 7px;
372
+ border-radius: 999px;
373
  font-weight: 600;
374
+ letter-spacing: 0.3px;
375
  }
376
 
377
  .slo-badge.pass {
378
+ background: var(--pass-bg);
379
+ color: var(--pass);
380
  }
381
 
382
  .slo-badge.fail {
383
+ background: var(--fail-bg);
384
+ color: var(--fail);
385
  }
386
 
387
  .eval-details {
388
  display: flex;
389
  flex-direction: column;
390
+ gap: 12px;
391
+ margin-top: 10px;
392
  }
393
 
394
  .eval-section {
 
399
 
400
  .eval-na {
401
  font-size: 11px;
402
+ color: var(--text-faint);
403
  font-style: italic;
404
  }
405
 
406
  .score-bar {
407
  height: 4px;
408
+ background: var(--border);
409
+ border-radius: 999px;
410
  overflow: hidden;
411
  }
412
 
413
  .score-bar-fill {
414
  height: 100%;
415
+ border-radius: 999px;
416
  transition: width 0.3s ease;
417
  }
418
 
 
 
419
  .star-rating {
420
  display: flex;
421
  align-items: center;
 
427
  border: none;
428
  font-size: 18px;
429
  cursor: pointer;
430
+ color: var(--border-strong);
431
  padding: 0;
432
  line-height: 1;
433
  transition: color 0.15s;
434
  }
435
 
436
  .star-rating .star.active {
437
+ color: var(--accent);
438
  }
439
 
440
  .star-rating .star:hover {
441
+ color: var(--accent-hover);
442
  }
443
 
444
  .star-label {
445
  font-size: 11px;
446
+ color: var(--text-muted);
447
  margin-left: 6px;
448
  }
449
+
450
+ ::-webkit-scrollbar {
451
+ width: 8px;
452
+ height: 8px;
453
+ }
454
+
455
+ ::-webkit-scrollbar-track {
456
+ background: transparent;
457
+ }
458
+
459
+ ::-webkit-scrollbar-thumb {
460
+ background: var(--border-strong);
461
+ border-radius: 999px;
462
+ }
463
+
464
+ ::-webkit-scrollbar-thumb:hover {
465
+ background: var(--text-faint);
466
+ }
frontend/src/components/EvalPanel.tsx CHANGED
@@ -77,7 +77,7 @@ export function EvalPanel({ evalScores }: Props) {
77
  <ScoreBar value={evalScores.groundedness} />
78
  <div className="metric-row">
79
  <span>Hallucination Rate</span>
80
- <span className={`metric-value ${evalScores.hallucination_rate > 0.2 ? "fail" : "pass"}`}>
81
  {(evalScores.hallucination_rate * 100).toFixed(0)}%
82
  </span>
83
  </div>
 
77
  <ScoreBar value={evalScores.groundedness} />
78
  <div className="metric-row">
79
  <span>Hallucination Rate</span>
80
+ <span className="metric-value">
81
  {(evalScores.hallucination_rate * 100).toFixed(0)}%
82
  </span>
83
  </div>
frontend/src/index.css CHANGED
@@ -1,111 +1,49 @@
1
  :root {
2
- --text: #6b6375;
3
- --text-h: #08060d;
4
- --bg: #fff;
5
- --border: #e5e4e7;
6
  --code-bg: #f4f3ec;
7
- --accent: #aa3bff;
8
- --accent-bg: rgba(170, 59, 255, 0.1);
9
- --accent-border: rgba(170, 59, 255, 0.5);
10
- --social-bg: rgba(244, 243, 236, 0.5);
11
- --shadow:
12
- rgba(0, 0, 0, 0.1) 0 10px 15px -3px, rgba(0, 0, 0, 0.05) 0 4px 6px -2px;
13
 
14
- --sans: system-ui, 'Segoe UI', Roboto, sans-serif;
15
- --heading: system-ui, 'Segoe UI', Roboto, sans-serif;
16
- --mono: ui-monospace, Consolas, monospace;
17
 
18
- font: 18px/145% var(--sans);
19
- letter-spacing: 0.18px;
20
- color-scheme: light dark;
21
  color: var(--text);
22
  background: var(--bg);
 
23
  font-synthesis: none;
24
  text-rendering: optimizeLegibility;
25
  -webkit-font-smoothing: antialiased;
26
  -moz-osx-font-smoothing: grayscale;
27
-
28
- @media (max-width: 1024px) {
29
- font-size: 16px;
30
- }
31
- }
32
-
33
- @media (prefers-color-scheme: dark) {
34
- :root {
35
- --text: #9ca3af;
36
- --text-h: #f3f4f6;
37
- --bg: #16171d;
38
- --border: #2e303a;
39
- --code-bg: #1f2028;
40
- --accent: #c084fc;
41
- --accent-bg: rgba(192, 132, 252, 0.15);
42
- --accent-border: rgba(192, 132, 252, 0.5);
43
- --social-bg: rgba(47, 48, 58, 0.5);
44
- --shadow:
45
- rgba(0, 0, 0, 0.4) 0 10px 15px -3px, rgba(0, 0, 0, 0.25) 0 4px 6px -2px;
46
- }
47
-
48
- #social .button-icon {
49
- filter: invert(1) brightness(2);
50
- }
51
- }
52
-
53
- #root {
54
- width: 1126px;
55
- max-width: 100%;
56
- margin: 0 auto;
57
- text-align: center;
58
- border-inline: 1px solid var(--border);
59
- min-height: 100svh;
60
- display: flex;
61
- flex-direction: column;
62
- box-sizing: border-box;
63
  }
64
 
65
  body {
66
  margin: 0;
67
  }
68
 
69
- h1,
70
- h2 {
71
- font-family: var(--heading);
72
- font-weight: 500;
73
- color: var(--text-h);
74
- }
75
-
76
- h1 {
77
- font-size: 56px;
78
- letter-spacing: -1.68px;
79
- margin: 32px 0;
80
- @media (max-width: 1024px) {
81
- font-size: 36px;
82
- margin: 20px 0;
83
- }
84
- }
85
- h2 {
86
- font-size: 24px;
87
- line-height: 118%;
88
- letter-spacing: -0.24px;
89
- margin: 0 0 8px;
90
- @media (max-width: 1024px) {
91
- font-size: 20px;
92
- }
93
- }
94
  p {
95
  margin: 0;
96
  }
97
 
98
- code,
99
- .counter {
100
  font-family: var(--mono);
101
- display: inline-flex;
 
102
  border-radius: 4px;
103
- color: var(--text-h);
104
- }
105
-
106
- code {
107
- font-size: 15px;
108
- line-height: 135%;
109
- padding: 4px 8px;
110
  background: var(--code-bg);
 
111
  }
 
1
  :root {
2
+ --bg: #faf9f5;
3
+ --surface: #ffffff;
4
+ --sidebar-bg: #f0eee6;
5
+ --text: #3d3929;
6
+ --text-muted: #6b6658;
7
+ --text-faint: #97918a;
8
+ --border: #e8e6dc;
9
+ --border-strong: #d4d0c4;
10
+ --accent: #c96442;
11
+ --accent-hover: #b05638;
12
+ --accent-soft: rgba(201, 100, 66, 0.15);
13
+ --pass: #2d7a4f;
14
+ --pass-bg: #e4f0e8;
15
+ --fail: #b4332a;
16
+ --fail-bg: #f5e3e0;
17
  --code-bg: #f4f3ec;
18
+ --shadow: 0 1px 2px rgba(0, 0, 0, 0.04), 0 4px 12px rgba(0, 0, 0, 0.04);
 
 
 
 
 
19
 
20
+ --sans: "Styrene B", ui-sans-serif, system-ui, -apple-system, "Segoe UI", Roboto, sans-serif;
21
+ --heading: "Copernicus", "Tiempos Text", "Iowan Old Style", Georgia, serif;
22
+ --mono: ui-monospace, "SF Mono", Menlo, Consolas, monospace;
23
 
24
+ color-scheme: light;
 
 
25
  color: var(--text);
26
  background: var(--bg);
27
+ font-family: var(--sans);
28
  font-synthesis: none;
29
  text-rendering: optimizeLegibility;
30
  -webkit-font-smoothing: antialiased;
31
  -moz-osx-font-smoothing: grayscale;
32
  }
33
 
34
  body {
35
  margin: 0;
36
  }
37
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
38
  p {
39
  margin: 0;
40
  }
41
 
42
+ code {
 
43
  font-family: var(--mono);
44
+ font-size: 13px;
45
+ padding: 2px 6px;
46
  border-radius: 4px;
 
 
 
 
 
 
 
47
  background: var(--code-bg);
48
+ color: var(--text);
49
  }