umanggarg Claude Sonnet 4.6 committed on
Commit 6a9e16c · 1 Parent(s): 43acfac

improve: tour prompt forces pipeline-first thinking + flowing arrow animation

Prompt:
- Concept 0 is now mandatory pipeline overview (all others depend on it)
- LLM must trace the user journey before naming any concept
- Removed overly aggressive NEVER-include rule for agents/MCP
- All concepts 1-7 must have depends_on non-empty (no orphaned parallel nodes)
- Result: connected tree instead of flat parallel list

Arrows:
- Animated stroke-dashoffset (marching-ant flow direction cue)
- Default opacity 0.22 → 0.45 (readable without hover)
- Highlighted arrows get drop-shadow glow
- ec-flow keyframe: 1.8s, subtle enough not to be distracting

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
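
The prompt rules in the message above (concept 0 as the pipeline overview, a non-empty depends_on for every other concept, a connected tree as the result) can also be checked after the model replies. The following is a minimal sketch, not code from this commit: `validate_tour` is a hypothetical helper, and it assumes the parsed reply exposes its concepts as a list of dicts with the `id`, `depends_on`, `reading_order` and `type` keys used in the prompt's JSON template.

```python
from typing import Any


def validate_tour(concepts: list[dict[str, Any]]) -> list[str]:
    """Return a list of rule violations; an empty list means the tour is well-formed."""
    errors: list[str] = []
    by_id = {c["id"]: c for c in concepts}

    if not 6 <= len(concepts) <= 8:
        errors.append(f"expected 6-8 concepts, got {len(concepts)}")

    # Concept 0 is the pipeline overview and the only allowed root.
    root = by_id.get(0)
    if root is None:
        errors.append("concept id=0 (the pipeline overview) is missing")
    elif (
        root.get("depends_on")
        or root.get("reading_order") != 1
        or root.get("type") != "module"
    ):
        errors.append("concept 0 must have depends_on=[], reading_order=1, type=module")

    # Every other concept must point at one or more existing concepts.
    for cid, concept in by_id.items():
        if cid == 0:
            continue
        deps = concept.get("depends_on", [])
        if not deps:
            errors.append(f"concept {cid} is orphaned (empty depends_on)")
        for dep in deps:
            if dep not in by_id:
                errors.append(f"concept {cid} depends on unknown id {dep}")

    # Follow dependency edges outward from concept 0; whatever is never
    # reached sits beside the pipeline instead of building on it.
    reached = {0} if root is not None else set()
    frontier = list(reached)
    while frontier:
        current = frontier.pop()
        for cid, concept in by_id.items():
            if cid not in reached and current in concept.get("depends_on", []):
                reached.add(cid)
                frontier.append(cid)
    for cid in sorted(set(by_id) - reached):
        errors.append(f"concept {cid} is not connected to the pipeline (concept 0)")

    return errors
```

A caller could log the returned messages or retry generation when the list is non-empty; which of those fits is a design choice this sketch leaves open.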

backend/services/diagram_service.py CHANGED
@@ -109,92 +109,104 @@ _ENRICH_SYSTEM = (
 )

 _TOUR_SYSTEM = (
-    "You are a senior engineer who just finished reading an unfamiliar codebase for the first time. "
-    "Your job is to write the explanation you wish existed before you started reading — "
-    "the insights that took hours to find, distilled into minutes for the next developer. "
-    "You explain mechanisms (what would break if removed, what naive alternative was rejected, why), "
-    "not inventory (what classes exist, what files are present). "
-    "Every sentence must be grounded in the actual source provided. "
-    "NEVER start with 'This repo', 'This project', or 'The codebase' — lead with the mechanism. "
+    "You are a senior engineer who just spent an hour reading an unfamiliar codebase for the first time. "
+    "You think in systems — you trace data flows before you read individual classes. "
+    "Your job is to write the guided tour you wished existed before you started: "
+    "not a list of components, but a learning path that builds understanding layer by layer. "
+    "Every sentence is grounded in the actual source provided. "
     "Return ONLY valid JSON — no markdown fences, no explanation, just the JSON object."
 )

 _TOUR_PROMPT = """\
 Repository: {repo}

-Source code — read each file carefully before writing about it:
+Source code — study the actual code before writing anything:

 {chunk_summary}

-Imagine you just spent an hour reading this codebase cold. Write the 6-8 things you wish
-someone had told you before you started — the decisions that surprised you, the techniques
-that explain why the code is shaped the way it is, the insights that made it click.
+── STEP 1: TRACE THE SYSTEM FLOW (do this mentally first, don't output it) ──────

-── BEFORE WRITING EACH CONCEPT YOU MUST ─────────────────────────────────────────
-1. Read the actual source code for that file in the context above
-2. Ask: "What would degrade or break if this were replaced with a simpler naive alternative?"
-3. Ask: "What design was NOT chosen here, and what forced the actual decision?"
-4. Name the TECHNIQUE or INSIGHT — not the class or file that implements it
+Before naming any concept, answer this question in your head:
+"If a user runs this repo right now, what is the sequence of files and functions
+that execute from their first action to their final output?"

-── WEAK vs. STRONG ──────────────────────────────────────────────────────────────
+Write that chain mentally: UserAction → file/function → file/function → ... → Output

-WEAK (never produce this):
-name: "Vector Store"
-description: "The class responsible for storing and retrieving vectors."
-Why weak: names an artifact. Any vector-DB project has this. Zero insight.
+This chain IS the codebase. Every concept you write must explain one non-obvious
+part of this chain — not a component that sits beside it.
+
+── STEP 2: CONCEPT 0 IS THE PIPELINE (always, no exceptions) ────────────────────

-STRONG (produce this):
-name: "Hybrid Retrieval Scoring"
-description: "Pure semantic search fails on rare tokens like function names and error codes.
-Dense ANN vectors are fused with sparse BM25 term frequencies via Reciprocal Rank Fusion —
-a parameter-free combiner that outperforms tuned weighted sums and eliminates the top
-failure mode of single-modality retrieval."
-Why strong: names the technique, states the failure mode it prevents, explains the choice.
+concept id=0 (reading_order=1, depends_on=[]) must describe the end-to-end flow
+you traced above — what happens step by step when the system does its main job.
+This is the map. Without it, every other concept is a room with no house around it.
+description should name the files/stages in the sequence and explain the key split
+(e.g. "index upfront, query later" or "embed, store, retrieve, generate").

-── NEVER ────────────────────────────────────────────────────────────────────────
+── STEP 3: EVERY OTHER CONCEPT MUST BUILD ON SOMETHING ─────────────────────────

-NEVER write a concept name that matches a class name, file name, or service name in the
-source. That is an inventory entry, not a concept.
+Concepts 1-7 each answer "why is this step in the pipeline done THIS way and not
+the naive way?" Each must have depends_on pointing to at least one earlier concept.
+If you cannot say "a developer can't understand B without first understanding A",
+the concepts should be merged.

-NEVER begin a description with "The class responsible for" or "The module that handles" —
-these describe existence, not design decisions.
+── WEAK vs STRONG ───────────────────────────────────────────────────────────────

-NEVER include secondary features — agents, admin tools, protocol adapters, streaming
-bridges — that could be removed without breaking the repo's primary purpose.
+WEAK — names an artifact, zero insight:
+name: "Vector Store"
+description: "Stores and retrieves vectors."

-NEVER place more than 2 concepts at depends_on=[]. If you do, you have listed parallel
-features. A real learning progression has at least 3 depth levels.
+STRONG — names a decision, explains the tradeoff:
+name: "Hybrid Dense+Sparse Retrieval"
+description: "Semantic search alone misses rare tokens like function names.
+Dense ANN vectors (meaning) are fused with BM25 sparse vectors (exact terms)
+via Reciprocal Rank Fusion — a parameter-free combiner that beats any tuned
+weighted sum and eliminates the single biggest RAG failure mode."

-── DEPENDENCY ORDER ─────────────────────────────────────────────────────────────
+── RULES ────────────────────────────────────────────────────────────────────────

-depends_on = conceptual prerequisite: a developer cannot understand B without first
-understanding A. This is NOT the code import graph — base edges on learning order only.
+- concept name must be a TECHNIQUE or DECISION, never a class/file/service name
+- NEVER start a description with "The class responsible for" or "The module that"
+- key_items: real method/function names visible in the source above, never invented
+- At most 1 concept may have depends_on=[] (only concept 0)

 Return ONLY this JSON (no markdown, no extra text):
 {{
-"summary": "2 sentences: (1) what problem this repo solves and who benefits — be specific, name the mechanism not the category. (2) the key architectural decision that shapes everything else.",
-"entry_point": "filename of the most foundational concept (reading_order=1)",
+"summary": "2 sentences: (1) what the user can DO with this repo and what mechanism makes it work — name the technique, not the category. (2) the single architectural decision that shapes everything else.",
+"entry_point": "file path of the pipeline entry (concept id=0)",
 "concepts": [
 {{
 "id": 0,
-"name": "Technique or design decision (2-4 words)",
-"subtitle": "One sentence: the problem this concept solves",
-"file": "filename where this concept lives",
-"type": "class|function|module|algorithm",
-"description": "2-3 sentences written as a senior engineer explaining to a teammate: what the code actually does, what would go wrong without this approach, and the non-obvious insight that makes it work.",
-"key_items": ["actual_method_name_1", "actual_method_name_2"],
+"name": "End-to-end pipeline name (3-5 words)",
+"subtitle": "One sentence: what the full pipeline does for the user",
+"file": "entry point file",
+"type": "module",
+"description": "2-3 sentences tracing the data flow: what enters, what stage transforms it and how, what the user gets out. Name the key files and the split that makes the architecture work.",
+"key_items": ["function_name_1", "function_name_2"],
 "depends_on": [],
 "reading_order": 1,
-"ask": "The question a curious developer would ask after reading this (How, Why, What, or Explain)"
+"ask": "How does [key mechanism] actually work end to end?"
+}},
+{{
+"id": 1,
+"name": "Non-obvious technique name (2-4 words)",
+"subtitle": "One sentence: the problem this solves in the pipeline",
+"file": "file where this lives",
+"type": "class|function|module|algorithm",
+"description": "2-3 sentences: what the naive approach would do wrong, what this code does instead, and the non-obvious insight that makes it work.",
+"key_items": ["actual_method_1", "actual_method_2"],
+"depends_on": [0],
+"reading_order": 2,
+"ask": "Why was [naive approach] rejected in favour of this?"
 }}
 ]
 }}

 Rules:
 - 6-8 concepts total
-- reading_order=1: depends_on=[], file matches entry_point — same file, always
-- key_items: 2-4 actual method/function names from the source above, never invented
-- type: exactly one of class, function, module, algorithm
+- id=0 always: depends_on=[], type=module, reading_order=1, file=entry_point
+- ids 1-7: depends_on must be non-empty (no orphaned parallel nodes)
+- key_items: 2-4 real method/function names from the source, never invented
 """

 _JSON_PROMPTS = {
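
The doubled braces in `_TOUR_PROMPT` suggest the template is filled with `str.format`, where `{{` and `}}` render as literal braces and only `{repo}` and `{chunk_summary}` are substituted. That call site is not part of this diff, so the sketch below is an assumption: `TEMPLATE` is a shortened stand-in for the real prompt, and `render_prompt` and `parse_tour` are hypothetical helper names.

```python
import json

# Hypothetical, shortened stand-in for _TOUR_PROMPT: same placeholders,
# same brace-escaping style, far less text.
TEMPLATE = """\
Repository: {repo}

{chunk_summary}

Return ONLY this JSON (no markdown, no extra text):
{{
"summary": "...",
"concepts": [
{{"id": 0, "depends_on": [], "reading_order": 1}}
]
}}
"""


def render_prompt(repo: str, chunk_summary: str) -> str:
    """Fill the two placeholders; str.format turns '{{' and '}}' into '{' and '}'."""
    # Braces inside chunk_summary are safe: str.format parses only the
    # template, never the substituted values.
    return TEMPLATE.format(repo=repo, chunk_summary=chunk_summary)


def parse_tour(raw: str) -> dict:
    """Parse a model reply that is expected to be a bare JSON object."""
    tour = json.loads(raw)
    if not isinstance(tour, dict) or "concepts" not in tour:
        raise ValueError("reply is not a tour object with a 'concepts' list")
    return tour
```

Because the system prompt asks for JSON with no markdown fences, `parse_tour` deliberately does no fence stripping; a reply wrapped in fences would raise `json.JSONDecodeError`, which a caller may prefer to surface rather than silently repair.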
ui/src/components/ExploreView.jsx CHANGED
@@ -554,11 +554,11 @@ export default function ExploreView({ repo, onAskAbout, onRegenerateRef }) {
   aria-hidden="true"
 >
   <defs>
-    {/* Default arrowhead — warm sienna */}
+    {/* Default arrowhead */}
     <marker id="ec-arrow" markerWidth="7" markerHeight="5" refX="7" refY="2.5" orient="auto">
-      <polygon points="0 0, 7 2.5, 0 5" fill="rgba(91,143,249,0.35)" />
+      <polygon points="0 0, 7 2.5, 0 5" fill="rgba(91,143,249,0.55)" />
     </marker>
-    {/* Highlighted arrowhead */}
+    {/* Highlighted arrowhead — brighter, with glow colour */}
     <marker id="ec-arrow-hi" markerWidth="7" markerHeight="5" refX="7" refY="2.5" orient="auto">
       <polygon points="0 0, 7 2.5, 0 5" fill="#7DABFF" />
     </marker>
@@ -570,20 +570,25 @@ export default function ExploreView({ repo, onAskAbout, onRegenerateRef }) {
 const to = positions[c.id];
 if (!from || !to) return null;

-const isHi = connectedIds?.has(c.id) && connectedIds?.has(depId);
+const isHi = connectedIds?.has(c.id) && connectedIds?.has(depId);
 const isDim = connectedIds && !isHi;

 return (
   <path
     key={`${depId}→${c.id}`}
     d={bezierPath(from, to)}
-    stroke={isHi ? "#7DABFF" : "rgba(91,143,249,0.22)"}
-    strokeWidth={isHi ? 2 : 1.5}
-    strokeDasharray={isHi ? undefined : "none"}
+    stroke={isHi ? "#7DABFF" : "rgba(91,143,249,0.45)"}
+    strokeWidth={isHi ? 2.5 : 1.5}
+    // dasharray 12 6: the marching-ant animation advances the offset by
+    // one period (18px) per cycle, making the dashes appear to travel
+    // from source to target — a directional flow cue without extra UI.
+    strokeDasharray="12 6"
     fill="none"
     markerEnd={isHi ? "url(#ec-arrow-hi)" : "url(#ec-arrow)"}
     style={{
-      opacity: isDim ? 0.12 : 1,
+      opacity: isDim ? 0.1 : 1,
+      animation: "ec-flow 1.8s linear infinite",
+      filter: isHi ? "drop-shadow(0 0 4px rgba(125,171,255,0.7))" : undefined,
       transition: "opacity 0.15s, stroke 0.15s, stroke-width 0.15s",
     }}
   />
ui/src/index.css CHANGED
@@ -3419,6 +3419,14 @@ textarea:focus-visible {
   100% { opacity: 0; box-shadow: none; }
 }

+/* ── Concept tour arrow flow animation ───────────────────────────────────── */
+/* stroke-dashoffset marching toward the target creates a directional "flow" cue.
+   dasharray=12 6 means 12px dash, 6px gap — the offset advances by one period (18px)
+   per cycle so the pattern appears to travel forward along the path. */
+@keyframes ec-flow {
+  to { stroke-dashoffset: -18; }
+}
+
 /* ══════════════════════════════════════════════════════════
    FEATURE 3: SESSION SEARCH + EDITABLE TITLE
    ══════════════════════════════════════════════════════════ */
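
The numbers in the `ec-flow` comment can be checked with a little arithmetic: a `12 6` dash array repeats every 18 units, so moving the offset by exactly 18 per cycle makes the last frame identical to the first and the loop seamless, and at 1.8s per cycle the dashes drift at about 10 units per second. The sketch below only illustrates that calculation; `dash_period` and `flow_speed` are hypothetical names, and it assumes one user unit equals one CSS pixel (no `viewBox` scaling).

```python
def dash_period(dasharray: str) -> float:
    """Length of one repeat of the dash pattern, e.g. "12 6" gives 18.0."""
    parts = [float(p) for p in dasharray.replace(",", " ").split()]
    # SVG repeats an odd-length list once to make it even, doubling the period.
    if len(parts) % 2 == 1:
        parts *= 2
    return sum(parts)


def flow_speed(dasharray: str, offset_per_cycle: float, duration_s: float) -> float:
    """Apparent dash speed in units per second for a looping offset animation."""
    period = dash_period(dasharray)
    cycles = abs(offset_per_cycle) / period
    # A seamless loop needs the offset to move by a whole number of periods;
    # anything else makes the dashes jump when the animation restarts.
    if round(cycles) == 0 or abs(cycles - round(cycles)) > 1e-9:
        raise ValueError("offset per cycle must be a non-zero multiple of the dash period")
    return abs(offset_per_cycle) / duration_s
```

With the values from this commit, `flow_speed("12 6", -18, 1.8)` comes out at roughly 10, while an offset of -12 would be rejected because the pattern would not line up at the loop point.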