ritishshrirao committed on
Commit
9e6be29
·
1 Parent(s): d6fbf54

Add fixed seeds for dataset, update dashboard

artifacts/osint_dashboard.html CHANGED
The diff for this file is too large to render. See raw diff
 
datasets/fixed_levels/README.md ADDED
@@ -0,0 +1,64 @@
+ # Fixed Levels Submission Dataset
+
+ This folder contains a fixed three-level OSINT benchmark set built on one shared base graph.
+
+ ## Files
+
+ - `seed_fixed_levels.json`: master fixed seed with canonical nodes, edges, and 15 fixed questions.
+ - `fixed_graph_questions.json`: extracted fixed dataset snapshot for submission packaging.
+ - `shared_config_fixed_levels.json`: run config used for generation and evaluation.
+ - `complete_dataset_qwen_generated.json`: full dataset after Qwen (`qwen3:2b` via Ollama) expands the graph.
+ - `qwen_swarm_eval_fixed_levels.json`: Qwen swarm evaluation summary on this set.
+ - `qwen_swarm_benchmark_fixed_levels.json`: benchmark output with record and summary.
+ - `leaderboard_fixed_levels.json`: leaderboard file for this dataset.
+ - `dashboard_fixed_levels.html`: interactive dashboard generated from the benchmark run.
+
+ ## Difficulty Design
+
+ - Easy: 5 questions, mostly direct alias, org, location, and event lookup.
+ - Mid: 5 questions, 2-hop linking across alias plus org or event relations.
+ - High: 5 questions, multi-hop cross-platform traces with implicit collaboration context.
+
+ All 15 questions are fixed and share the same seeded graph.
+
+ ## Regenerate Artifacts
+
+ ```bash
+ source ~/arl/bin/activate
+ cd /home/ritish/test1
+ PYTHONPATH=src python scripts/build_fixed_levels_dataset.py \
+ --seed-file datasets/fixed_levels/seed_fixed_levels.json \
+ --shared-config datasets/fixed_levels/shared_config_fixed_levels.json \
+ --output-dir datasets/fixed_levels
+ ```
+
+ ## Evaluate Qwen Swarm
+
+ ```bash
+ source ~/arl/bin/activate
+ cd /home/ritish/test1
+ PYTHONPATH=src osint-env eval \
+ --config datasets/fixed_levels/shared_config_fixed_levels.json \
+ --seed-file datasets/fixed_levels/seed_fixed_levels.json \
+ --agent-mode swarm \
+ --llm-provider ollama \
+ --llm-model qwen3:2b \
+ --episodes 15
+ ```
+
+ ## Benchmark + Dashboard
+
+ ```bash
+ source ~/arl/bin/activate
+ cd /home/ritish/test1
+ PYTHONPATH=src osint-env benchmark \
+ --config datasets/fixed_levels/shared_config_fixed_levels.json \
+ --seed-file datasets/fixed_levels/seed_fixed_levels.json \
+ --agent-mode swarm \
+ --llm-provider ollama \
+ --llm-model qwen3:2b \
+ --episodes 15 \
+ --name fixed_levels_qwen_swarm \
+ --leaderboard datasets/fixed_levels/leaderboard_fixed_levels.json \
+ --dashboard datasets/fixed_levels/dashboard_fixed_levels.html
+ ```
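
Reviewer note (not part of the commit): after regenerating the artifacts, a quick consistency check on `complete_dataset_qwen_generated.json` may be useful. The sketch below assumes only the JSON layout visible in this diff (a `canonical_graph` object with `node_count`, `edge_count`, `nodes`, and `edges`, where each edge carries `src`, `dst`, and `rel`) and that it is run from the repository root. The function names `check_canonical_graph` and `relation_histogram` are hypothetical helpers, not part of `osint-env`.

```python
import json
from collections import Counter
from pathlib import Path


def check_canonical_graph(dataset: dict) -> list[str]:
    """Return human-readable consistency problems (empty list if none)."""
    graph = dataset["canonical_graph"]
    nodes, edges = graph["nodes"], graph["edges"]
    problems = []

    # The declared counts should match the lengths of the listed arrays.
    if graph["node_count"] != len(nodes):
        problems.append(
            f"node_count={graph['node_count']} but {len(nodes)} nodes listed"
        )
    if graph["edge_count"] != len(edges):
        problems.append(
            f"edge_count={graph['edge_count']} but {len(edges)} edges listed"
        )

    # Every edge endpoint should refer to a declared node_id.
    known = {node["node_id"] for node in nodes}
    for edge in edges:
        for end in ("src", "dst"):
            if edge[end] not in known:
                problems.append(
                    f"edge {edge['src']} -[{edge['rel']}]-> {edge['dst']}: "
                    f"unknown {end} node"
                )
    return problems


def relation_histogram(dataset: dict) -> Counter:
    """Count edges per relation type, e.g. to spot unexpected labels."""
    return Counter(edge["rel"] for edge in dataset["canonical_graph"]["edges"])


if __name__ == "__main__":
    # Assumed location, taken from the file path shown in this diff.
    path = Path("datasets/fixed_levels/complete_dataset_qwen_generated.json")
    if path.exists():
        data = json.loads(path.read_text(encoding="utf-8"))
        for problem in check_canonical_graph(data):
            print("PROBLEM:", problem)
        for rel, count in relation_histogram(data).most_common():
            print(f"{rel}: {count}")
```

The histogram is a cheap way to notice relation labels that the LLM expansion step may have introduced beyond those in the seed file.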
datasets/fixed_levels/complete_dataset_qwen_generated.json ADDED
@@ -0,0 +1,1930 @@
+ {
+   "canonical_graph": {
+     "edge_count": 109,
+     "edges": [
+       {
+         "confidence": 1.0,
+         "dst": "org_apex_dynamics",
+         "rel": "works_at",
+         "src": "user_0"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_hyderabad",
+         "rel": "located_in",
+         "src": "user_0"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "org_northbridge",
+         "rel": "works_at",
+         "src": "user_1"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_bengaluru",
+         "rel": "located_in",
+         "src": "user_1"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "org_northbridge",
+         "rel": "works_at",
+         "src": "user_2"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_delhi",
+         "rel": "located_in",
+         "src": "user_2"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "org_northbridge",
+         "rel": "works_at",
+         "src": "user_3"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_delhi",
+         "rel": "located_in",
+         "src": "user_3"
+       },
+       {
+         "confidence": 0.8,
+         "dst": "user_0",
+         "rel": "connected_to",
+         "src": "user_3"
+       },
+       {
+         "confidence": 0.8,
+         "dst": "user_2",
+         "rel": "connected_to",
+         "src": "user_0"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "user_ivy",
+         "rel": "alias_of",
+         "src": "alias_orchidfox"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "user_bharat",
+         "rel": "alias_of",
+         "src": "alias_steelquill"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "user_diya",
+         "rel": "alias_of",
+         "src": "alias_monsoonbyte"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "user_faris",
+         "rel": "alias_of",
+         "src": "alias_nightrelay"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "user_elin",
+         "rel": "alias_of",
+         "src": "alias_mapleghost"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "user_hiro",
+         "rel": "alias_of",
+         "src": "alias_docksparrow"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "user_cyrus",
+         "rel": "alias_of",
+         "src": "alias_quartzlotus"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "org_helios_labs",
+         "rel": "works_at",
+         "src": "user_aria"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "org_northbridge_logistics",
+         "rel": "works_at",
+         "src": "user_bharat"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "org_apex_dynamics",
+         "rel": "works_at",
+         "src": "user_cyrus"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "org_blueharbor_media",
+         "rel": "works_at",
+         "src": "user_diya"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "org_helios_labs",
+         "rel": "works_at",
+         "src": "user_elin"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "org_tidewatch_ops",
+         "rel": "works_at",
+         "src": "user_faris"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "org_apex_dynamics",
+         "rel": "works_at",
+         "src": "user_gita"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "org_northbridge_logistics",
+         "rel": "works_at",
+         "src": "user_hiro"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "org_kestrel_works",
+         "rel": "works_at",
+         "src": "user_ivy"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "org_blueharbor_media",
+         "rel": "works_at",
+         "src": "user_jules"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_sector9",
+         "rel": "located_in",
+         "src": "user_aria"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_dockyard17",
+         "rel": "located_in",
+         "src": "user_bharat"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_old_town",
+         "rel": "located_in",
+         "src": "user_cyrus"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_old_town",
+         "rel": "located_in",
+         "src": "user_diya"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_sector9",
+         "rel": "located_in",
+         "src": "user_elin"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_rivergate",
+         "rel": "located_in",
+         "src": "user_faris"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_old_town",
+         "rel": "located_in",
+         "src": "user_gita"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_dockyard17",
+         "rel": "located_in",
+         "src": "user_hiro"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_rivergate",
+         "rel": "located_in",
+         "src": "user_ivy"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_old_town",
+         "rel": "located_in",
+         "src": "user_jules"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_sector9",
+         "rel": "operates_in",
+         "src": "org_helios_labs"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_dockyard17",
+         "rel": "operates_in",
+         "src": "org_northbridge_logistics"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_old_town",
+         "rel": "operates_in",
+         "src": "org_apex_dynamics"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_old_town",
+         "rel": "operates_in",
+         "src": "org_blueharbor_media"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_rivergate",
+         "rel": "operates_in",
+         "src": "org_tidewatch_ops"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_rivergate",
+         "rel": "operates_in",
+         "src": "org_kestrel_works"
+       },
+       {
+         "confidence": 0.95,
+         "dst": "user_bharat",
+         "rel": "connected_to",
+         "src": "user_ivy"
+       },
+       {
+         "confidence": 0.95,
+         "dst": "user_hiro",
+         "rel": "connected_to",
+         "src": "user_bharat"
+       },
+       {
+         "confidence": 0.9,
+         "dst": "user_faris",
+         "rel": "connected_to",
+         "src": "user_hiro"
+       },
+       {
+         "confidence": 0.9,
+         "dst": "user_diya",
+         "rel": "connected_to",
+         "src": "user_faris"
+       },
+       {
+         "confidence": 0.88,
+         "dst": "user_elin",
+         "rel": "connected_to",
+         "src": "user_diya"
+       },
+       {
+         "confidence": 0.85,
+         "dst": "user_aria",
+         "rel": "connected_to",
+         "src": "user_elin"
+       },
+       {
+         "confidence": 0.82,
+         "dst": "user_cyrus",
+         "rel": "connected_to",
+         "src": "user_aria"
+       },
+       {
+         "confidence": 0.82,
+         "dst": "user_gita",
+         "rel": "connected_to",
+         "src": "user_cyrus"
+       },
+       {
+         "confidence": 0.8,
+         "dst": "user_jules",
+         "rel": "connected_to",
+         "src": "user_gita"
+       },
+       {
+         "confidence": 0.8,
+         "dst": "user_bharat",
+         "rel": "connected_to",
+         "src": "user_jules"
+       },
+       {
+         "confidence": 0.9,
+         "dst": "user_ivy",
+         "rel": "connected_to",
+         "src": "user_diya"
+       },
+       {
+         "confidence": 0.86,
+         "dst": "user_elin",
+         "rel": "connected_to",
+         "src": "user_ivy"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "post_midnight_manifest",
+         "rel": "authored_post",
+         "src": "alias_orchidfox"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "post_shift_roster",
+         "rel": "authored_post",
+         "src": "alias_docksparrow"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "post_sat_phone_ping",
+         "rel": "authored_post",
+         "src": "alias_nightrelay"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "post_drone_parts",
+         "rel": "authored_post",
+         "src": "alias_monsoonbyte"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "post_relay_schedule",
+         "rel": "authored_post",
+         "src": "alias_steelquill"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_dockyard17",
+         "rel": "references",
+         "src": "post_midnight_manifest"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "event_project_lantern",
+         "rel": "references",
+         "src": "post_midnight_manifest"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_dockyard17",
+         "rel": "references",
+         "src": "post_shift_roster"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "loc_rivergate",
+         "rel": "references",
+         "src": "post_sat_phone_ping"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "event_black_kite",
+         "rel": "references",
+         "src": "post_drone_parts"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "event_project_lantern",
+         "rel": "references",
+         "src": "post_relay_schedule"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "thr_supply_leak",
+         "rel": "authored_thread",
+         "src": "user_diya"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "thr_port_audit",
+         "rel": "authored_thread",
+         "src": "user_jules"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "event_project_lantern",
+         "rel": "discusses",
+         "src": "thr_supply_leak"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "org_northbridge_logistics",
+         "rel": "references",
+         "src": "thr_supply_leak"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "event_black_kite",
+         "rel": "discusses",
+         "src": "thr_port_audit"
+       },
+       {
+         "confidence": 1.0,
+         "dst": "org_kestrel_works",
+         "rel": "references",
+         "src": "thr_port_audit"
+       },
+       {
+         "confidence": 0.95,
+         "dst": "event_project_lantern",
+         "rel": "collaborates_on",
+         "src": "user_bharat"
+       },
+       {
+         "confidence": 0.95,
+         "dst": "event_project_lantern",
+         "rel": "collaborates_on",
+         "src": "user_hiro"
+       },
+       {
+         "confidence": 0.9,
+         "dst": "event_project_lantern",
+         "rel": "collaborates_on",
+         "src": "user_faris"
+       },
+       {
+         "confidence": 0.9,
+         "dst": "event_project_lantern",
+         "rel": "investigates",
+         "src": "user_diya"
+       },
+       {
+         "confidence": 0.94,
+         "dst": "event_black_kite",
+         "rel": "collaborates_on",
+         "src": "user_ivy"
+       },
+       {
+         "confidence": 0.9,
+         "dst": "event_black_kite",
+         "rel": "collaborates_on",
+         "src": "user_cyrus"
+       },
+       {
+         "confidence": 0.88,
+         "dst": "event_black_kite",
+         "rel": "investigates",
+         "src": "user_elin"
+       },
+       {
+         "confidence": 0.86,
+         "dst": "event_silent_current",
+         "rel": "monitors",
+         "src": "user_gita"
+       },
+       {
+         "confidence": 0.86,
+         "dst": "event_silent_current",
+         "rel": "reports_on",
+         "src": "user_jules"
+       },
+       {
+         "confidence": 0.8,
+         "dst": "event_black_kite",
+         "rel": "connected_to",
+         "src": "user_ivy"
+       },
+       {
+         "confidence": 0.8,
+         "dst": "event_project_lantern",
+         "rel": "connected_to",
+         "src": "user_bharat"
+       },
+       {
+         "confidence": 0.8,
+         "dst": "event_project_lantern",
+         "rel": "connected_to",
+         "src": "user_hiro"
+       },
+       {
+         "confidence": 0.8,
+         "dst": "event_project_lantern",
+         "rel": "connected_to",
+         "src": "user_faris"
+       },
+       {
+         "confidence": 0.8,
+         "dst": "event_project_lantern",
+         "rel": "connected_to",
+         "src": "user_diya"
+       },
+       {
+         "confidence": 0.8,
+         "dst": "event_black_kite",
+         "rel": "connected_to",
+         "src": "user_elin"
+       },
+       {
+         "confidence": 0.8,
+         "dst": "event_silent_current",
+         "rel": "connected_to",
+         "src": "user_gita"
+       },
+       {
+         "confidence": 0.5,
+         "dst": "loc_rivergate",
+         "rel": "connected_to",
+         "src": "loc_old_town"
+       },
+       {
+         "confidence": 0.5,
+         "dst": "event_silent_current",
+         "rel": "connected_to",
+         "src": "loc_old_town"
+       },
+       {
+         "confidence": 0.5,
+         "dst": "event_project_lantern",
+         "rel": "connected_to",
+         "src": "loc_dockyard17"
+       },
+       {
+         "confidence": 0.5,
+         "dst": "event_black_kite",
+         "rel": "connected_to",
+         "src": "loc_sector9"
+       },
+       {
+         "confidence": 0.5,
+         "dst": "loc_dockyard17",
+         "rel": "connected_to",
+         "src": "loc_hyderabad"
+       },
+       {
+         "confidence": 0.9,
+         "dst": "org_tidewatch_ops",
+         "rel": "operates_in",
+         "src": "loc_rivergate"
+       },
+       {
+         "confidence": 0.9,
+         "dst": "org_northbridge_logistics",
+         "rel": "operates_in",
+         "src": "loc_dockyard17"
+       },
+       {
+         "confidence": 0.9,
+         "dst": "org_apex_dynamics",
+         "rel": "operates_in",
+         "src": "loc_old_town"
+       },
+       {
+         "confidence": 0.9,
+         "dst": "org_helios_labs",
+         "rel": "operates_in",
+         "src": "loc_sector9"
+       },
+       {
+         "confidence": 0.9,
+         "dst": "user_0",
+         "rel": "located_in",
+         "src": "loc_hyderabad"
+       },
+       {
+         "confidence": 0.9,
+         "dst": "user_faris",
+         "rel": "located_in",
+         "src": "loc_rivergate"
+       },
+       {
+         "confidence": 0.9,
+         "dst": "user_bharat",
+         "rel": "located_in",
+         "src": "loc_dockyard17"
+       },
+       {
+         "confidence": 0.8,
+         "dst": "loc_bengaluru",
+         "rel": "to",
+         "src": "loc_hyderabad"
+       },
+       {
+         "confidence": 0.8,
+         "dst": "loc_dockyard17",
+         "rel": "to",
+         "src": "post_midnight_manifest"
+       },
+       {
+         "confidence": 0.8,
+         "dst": "event_project_lantern",
+         "rel": "to",
+         "src": "post_midnight_manifest"
+       },
+       {
+         "confidence": 0.8,
+         "dst": "loc_dockyard17",
+         "rel": "to",
+         "src": "post_shift_roster"
+       },
+       {
+         "confidence": 0.8,
+         "dst": "loc_rivergate",
+         "rel": "to",
+         "src": "post_sat_phone_ping"
+       },
+       {
+         "confidence": 0.8,
+         "dst": "event_black_kite",
+         "rel": "to",
+         "src": "post_drone_parts"
+       },
+       {
+         "confidence": 0.8,
+         "dst": "event_project_lantern",
+         "rel": "to",
+         "src": "post_relay_schedule"
+       },
+       {
+         "confidence": 0.9,
+         "dst": "thr_supply_leak",
+         "rel": "to",
+         "src": "user_diya"
+       },
+       {
+         "confidence": 0.9,
+         "dst": "thr_port_audit",
+         "rel": "to",
+         "src": "user_jules"
+       }
+     ],
+     "node_count": 45,
+     "nodes": [
+       {
+         "attrs": {
+           "handle": "@docksparrow"
+         },
+         "node_id": "alias_docksparrow",
+         "node_type": "alias"
+       },
+       {
+         "attrs": {
+           "handle": "@mapleghost"
+         },
+         "node_id": "alias_mapleghost",
+         "node_type": "alias"
+       },
+       {
+         "attrs": {
+           "handle": "@monsoonbyte"
+         },
+         "node_id": "alias_monsoonbyte",
+         "node_type": "alias"
+       },
+       {
+         "attrs": {
+           "handle": "@nightrelay"
+         },
+         "node_id": "alias_nightrelay",
+         "node_type": "alias"
+       },
+       {
+         "attrs": {
+           "handle": "@orchidfox"
+         },
+         "node_id": "alias_orchidfox",
+         "node_type": "alias"
+       },
+       {
+         "attrs": {
+           "handle": "@quartzlotus"
+         },
+         "node_id": "alias_quartzlotus",
+         "node_type": "alias"
+       },
+       {
+         "attrs": {
+           "handle": "@steelquill"
+         },
+         "node_id": "alias_steelquill",
+         "node_type": "alias"
+       },
+       {
+         "attrs": {
+           "name": "Black Kite"
+         },
+         "node_id": "event_black_kite",
+         "node_type": "event"
+       },
+       {
+         "attrs": {
+           "name": "Project Lantern"
+         },
+         "node_id": "event_project_lantern",
+         "node_type": "event"
+       },
+       {
+         "attrs": {
+           "name": "Silent Current"
+         },
+         "node_id": "event_silent_current",
+         "node_type": "event"
+       },
+       {
+         "attrs": {
+           "name": "Bengaluru"
+         },
+         "node_id": "loc_bengaluru",
+         "node_type": "location"
+       },
+       {
+         "attrs": {
+           "name": "Delhi"
+         },
+         "node_id": "loc_delhi",
+         "node_type": "location"
+       },
+       {
+         "attrs": {
+           "name": "Dockyard 17"
+         },
+         "node_id": "loc_dockyard17",
+         "node_type": "location"
+       },
+       {
+         "attrs": {
+           "name": "Hyderabad"
+         },
+         "node_id": "loc_hyderabad",
+         "node_type": "location"
+       },
+       {
+         "attrs": {
+           "name": "Old Town"
+         },
+         "node_id": "loc_old_town",
+         "node_type": "location"
+       },
+       {
+         "attrs": {
+           "name": "Rivergate"
+         },
+         "node_id": "loc_rivergate",
+         "node_type": "location"
+       },
+       {
+         "attrs": {
+           "name": "Sector 9"
+         },
+         "node_id": "loc_sector9",
+         "node_type": "location"
+       },
+       {
+         "attrs": {
+           "name": "Apex Dynamics"
+         },
+         "node_id": "org_apex_dynamics",
+         "node_type": "org"
+       },
+       {
+         "attrs": {
+           "name": "Blueharbor Media"
+         },
+         "node_id": "org_blueharbor_media",
+         "node_type": "org"
+       },
+       {
+         "attrs": {
+           "name": "Helios Labs"
+         },
+         "node_id": "org_helios_labs",
+         "node_type": "org"
+       },
+       {
+         "attrs": {
+           "name": "Kestrel Works"
+         },
+         "node_id": "org_kestrel_works",
+         "node_type": "org"
+       },
+       {
+         "attrs": {
+           "name": "Northbridge"
+         },
+         "node_id": "org_northbridge",
+         "node_type": "org"
+       },
+       {
+         "attrs": {
+           "name": "Northbridge Logistics"
+         },
+         "node_id": "org_northbridge_logistics",
+         "node_type": "org"
+       },
+       {
+         "attrs": {
+           "name": "Tidewatch Ops"
+         },
+         "node_id": "org_tidewatch_ops",
+         "node_type": "org"
+       },
+       {
+         "attrs": {
+           "channel": "microblog"
+         },
+         "node_id": "post_drone_parts",
+         "node_type": "post"
+       },
+       {
+         "attrs": {
+           "channel": "microblog"
+         },
+         "node_id": "post_midnight_manifest",
+         "node_type": "post"
+       },
+       {
+         "attrs": {
+           "channel": "microblog"
+         },
+         "node_id": "post_relay_schedule",
+         "node_type": "post"
+       },
+       {
+         "attrs": {
+           "channel": "microblog"
+         },
+         "node_id": "post_sat_phone_ping",
+         "node_type": "post"
+       },
+       {
+         "attrs": {
+           "channel": "microblog"
+         },
+         "node_id": "post_shift_roster",
+         "node_type": "post"
+       },
+       {
+         "attrs": {
+           "topic": "port_audit"
+         },
+         "node_id": "thr_port_audit",
+         "node_type": "thread"
+       },
+       {
+         "attrs": {
+           "topic": "supply_chain"
+         },
+         "node_id": "thr_supply_leak",
+         "node_type": "thread"
+       },
+       {
+         "attrs": {
+           "location": "Hyderabad",
+           "name": "Person 0",
+           "org": "Apex Dynamics"
+         },
+         "node_id": "user_0",
+         "node_type": "user"
+       },
+       {
+         "attrs": {
+           "location": "Bengaluru",
+           "name": "Person 1",
+           "org": "Northbridge"
+         },
+         "node_id": "user_1",
+         "node_type": "user"
+       },
+       {
+         "attrs": {
+           "location": "Delhi",
+           "name": "Person 2",
+           "org": "Northbridge"
+         },
+         "node_id": "user_2",
+         "node_type": "user"
+       },
+       {
+         "attrs": {
+           "location": "Delhi",
+           "name": "Person 3",
+           "org": "Northbridge"
+         },
+         "node_id": "user_3",
+         "node_type": "user"
+       },
+       {
+         "attrs": {
+           "location": "Sector 9",
+           "name": "Aria Sen",
+           "org": "Helios Labs"
+         },
+         "node_id": "user_aria",
+         "node_type": "user"
+       },
+       {
+         "attrs": {
+           "location": "Dockyard 17",
+           "name": "Bharat Kulkarni",
+           "org": "Northbridge Logistics"
+         },
+         "node_id": "user_bharat",
+         "node_type": "user"
+       },
+       {
+         "attrs": {
+           "location": "Old Town",
+           "name": "Cyrus Mehta",
+           "org": "Apex Dynamics"
+         },
+         "node_id": "user_cyrus",
+         "node_type": "user"
+       },
+       {
+         "attrs": {
+           "location": "Old Town",
+           "name": "Diya Roy",
+           "org": "Blueharbor Media"
+         },
+         "node_id": "user_diya",
+         "node_type": "user"
+       },
+       {
+         "attrs": {
+           "location": "Sector 9",
+           "name": "Elin Das",
+           "org": "Helios Labs"
+         },
+         "node_id": "user_elin",
+         "node_type": "user"
+       },
+       {
+         "attrs": {
+           "location": "Rivergate",
+           "name": "Faris Noor",
+           "org": "Tidewatch Ops"
+         },
+         "node_id": "user_faris",
+         "node_type": "user"
+       },
+       {
+         "attrs": {
+           "location": "Old Town",
+           "name": "Gita Pradhan",
+           "org": "Apex Dynamics"
+         },
+         "node_id": "user_gita",
+         "node_type": "user"
+       },
+       {
+         "attrs": {
+           "location": "Dockyard 17",
+           "name": "Hiro Tan",
+           "org": "Northbridge Logistics"
+         },
+         "node_id": "user_hiro",
+         "node_type": "user"
+       },
+       {
+         "attrs": {
+           "location": "Rivergate",
+           "name": "Ivy Kapoor",
+           "org": "Kestrel Works"
+         },
+         "node_id": "user_ivy",
+         "node_type": "user"
+       },
+       {
+         "attrs": {
+           "location": "Old Town",
+           "name": "Jules Banerjee",
+           "org": "Blueharbor Media"
+         },
+         "node_id": "user_jules",
+         "node_type": "user"
+       }
+     ]
+   },
+   "dataset_name": "fixed_levels_submission_set",
+   "difficulty_counts": {
+     "easy": 5,
+     "high": 5,
+     "mid": 5
+   },
+   "environment": {
+     "alias_density": 0.0,
+     "n_users": 4,
+     "noise_level": 0.08,
+     "red_herring_rate": 0.04,
+     "seed": 2026
+   },
+   "generation_mode": "llm_expanded",
+   "llm": {
+     "max_tokens": 384,
+     "model": "qwen3:2b",
+     "ollama_base_url": "http://127.0.0.1:11434",
+     "openai_api_key": "",
+     "openai_api_key_env": "OPENAI_API_KEY",
+     "openai_base_url": "https://api.openai.com/v1",
+     "provider": "ollama",
+     "temperature": 0.05,
+     "timeout_seconds": 240
+   },
+   "platform_views": {
+     "counts": {
+       "forum_threads": 8,
+       "microblog_posts": 14,
+       "profiles": 14
+     },
+     "forum_threads": [
+       {
+         "author_id": "user_diya",
+         "comments": [
+           {
+             "text": "Following this.",
+             "user_id": "user_jules"
+           },
+           {
+             "text": "Interesting link.",
+             "user_id": "user_1"
+           }
+         ],
+         "thread_id": "thr_0",
+         "topic": "startup"
+       },
+       {
+         "author_id": "user_gita",
+         "comments": [
+           {
+             "text": "Following this.",
+             "user_id": "user_gita"
+           },
+           {
+             "text": "Interesting link.",
+             "user_id": "user_diya"
+           }
+         ],
+         "thread_id": "thr_1",
+         "topic": "infra"
+       },
+       {
+         "author_id": "user_hiro",
+         "comments": [
+           {
+             "text": "Following this.",
+             "user_id": "user_elin"
+           },
+           {
+             "text": "Interesting link.",
+             "user_id": "user_cyrus"
+           }
+         ],
+         "thread_id": "thr_2",
+         "topic": "ai"
+       },
+       {
+         "author_id": "user_gita",
+         "comments": [
+           {
+             "text": "Following this.",
+             "user_id": "user_elin"
+           },
+           {
+             "text": "Interesting link.",
+             "user_id": "user_ivy"
+           }
+         ],
+         "thread_id": "thr_3",
+         "topic": "ai"
+       },
+       {
+         "author_id": "user_gita",
+         "comments": [
+           {
+             "text": "Following this.",
+             "user_id": "user_ivy"
+           },
+           {
+             "text": "Interesting link.",
+             "user_id": "user_cyrus"
+           }
+         ],
+         "thread_id": "thr_4",
+         "topic": "infra"
+       },
+       {
+         "author_id": "user_cyrus",
+         "comments": [
+           {
+             "text": "Following this.",
+             "user_id": "user_aria"
+           },
+           {
+             "text": "Interesting link.",
+             "user_id": "user_jules"
+           }
+         ],
+         "thread_id": "thr_5",
+         "topic": "startup"
+       },
+       {
+         "author_id": "user_aria",
+         "comments": [
+           {
+             "text": "Following this.",
+             "user_id": "user_0"
+           },
+           {
+             "text": "Interesting link.",
+             "user_id": "user_2"
+           }
+         ],
+         "thread_id": "thr_6",
+         "topic": "security"
+       },
+       {
+         "author_id": "user_gita",
+         "comments": [
+           {
+             "text": "Following this.",
+             "user_id": "user_ivy"
+           },
+           {
+             "text": "Interesting link.",
+             "user_id": "user_elin"
+           }
+         ],
+         "thread_id": "thr_7",
+         "topic": "infra"
+       }
+     ],
+     "microblog_posts": [
+       {
+         "canonical_user": "user_diya",
+         "mentions": [
+           "user_3"
+         ],
+         "post_id": "post_0",
+         "text": "Update 0 from Apex Dynamics #hyderabad",
+         "timestamp": 1000,
+         "user_id": "alias_monsoonbyte"
+       },
+       {
+         "canonical_user": "user_hiro",
+         "mentions": [
+           "user_2"
+         ],
+         "post_id": "post_1",
+         "text": "Update 1 from Northbridge #bengaluru",
+         "timestamp": 1001,
+         "user_id": "alias_docksparrow"
+       },
+       {
+         "canonical_user": "user_diya",
+         "mentions": [
+           "user_2"
+         ],
+         "post_id": "post_2",
+         "text": "Update 2 from Northbridge #delhi",
+         "timestamp": 1002,
+         "user_id": "alias_monsoonbyte"
+       },
+       {
+         "canonical_user": "user_3",
+         "mentions": [
+           "user_0"
+         ],
+         "post_id": "post_3",
+         "text": "Update 3 from Northbridge #delhi",
+         "timestamp": 1003,
+         "user_id": "user_3"
+       },
+       {
+         "canonical_user": "user_aria",
+         "mentions": [
+           "user_2"
+         ],
+         "post_id": "post_4",
+         "text": "Update 4 from Helios Labs #sector 9",
+         "timestamp": 1004,
+         "user_id": "user_aria"
+       },
+       {
+         "canonical_user": "user_bharat",
+         "mentions": [
+           "user_2"
+         ],
+         "post_id": "post_5",
+         "text": "Update 5 from Northbridge Logistics #dockyard 17",
+         "timestamp": 1005,
+         "user_id": "alias_steelquill"
+       },
+       {
+         "canonical_user": "user_hiro",
+         "mentions": [
+           "user_3"
+         ],
+         "post_id": "post_6",
+         "text": "Update 6 from Apex Dynamics #old town",
+         "timestamp": 1006,
+         "user_id": "alias_docksparrow"
+       },
+       {
+         "canonical_user": "user_faris",
+         "mentions": [
+           "user_3"
+         ],
+         "post_id": "post_7",
+         "text": "Update 7 from Blueharbor Media #old town",
+         "timestamp": 1007,
+         "user_id": "alias_nightrelay"
+       },
+       {
+         "canonical_user": "user_elin",
+         "mentions": [
+           "user_3"
+         ],
+         "post_id": "post_8",
+         "text": "Update 8 from Helios Labs #sector 9",
+         "timestamp": 1008,
+         "user_id": "user_elin"
+       },
+       {
+         "canonical_user": "user_elin",
+         "mentions": [
+           "user_3"
+         ],
+         "post_id": "post_9",
+         "text": "Update 9 from Tidewatch Ops #rivergate",
+         "timestamp": 1009,
+         "user_id": "alias_mapleghost"
+       },
+       {
+         "canonical_user": "user_gita",
+         "mentions": [
+           "user_0"
+         ],
+         "post_id": "post_10",
+         "text": "Update 10 from Apex Dynamics #old town",
+         "timestamp": 1010,
+         "user_id": "user_gita"
+       },
+       {
+         "canonical_user": "user_hiro",
+         "mentions": [
1274
+ "user_0"
1275
+ ],
1276
+ "post_id": "post_11",
1277
+ "text": "Update 11 from Northbridge Logistics #dockyard 17",
1278
+ "timestamp": 1011,
1279
+ "user_id": "user_hiro"
1280
+ },
1281
+ {
1282
+ "canonical_user": "user_ivy",
1283
+ "mentions": [
1284
+ "user_0"
1285
+ ],
1286
+ "post_id": "post_12",
1287
+ "text": "Update 12 from Kestrel Works #rivergate",
1288
+ "timestamp": 1012,
1289
+ "user_id": "user_ivy"
1290
+ },
1291
+ {
1292
+ "canonical_user": "user_jules",
1293
+ "mentions": [
1294
+ "user_1"
1295
+ ],
1296
+ "post_id": "post_13",
1297
+ "text": "Update 13 from Blueharbor Media #old town",
1298
+ "timestamp": 1013,
1299
+ "user_id": "user_jules"
1300
+ }
1301
+ ],
1302
+ "profiles": [
1303
+ {
1304
+ "connections": [
1305
+ "user_2"
1306
+ ],
1307
+ "location": "Hyderabad",
1308
+ "name": "Person 0",
1309
+ "org": "Apex Dynamics",
1310
+ "user_id": "user_0",
1311
+ "work_history": [
1312
+ "Apex Dynamics"
1313
+ ]
1314
+ },
1315
+ {
1316
+ "connections": [],
1317
+ "location": "Bengaluru",
1318
+ "name": "Person 1",
1319
+ "org": "Northbridge",
1320
+ "user_id": "user_1",
1321
+ "work_history": [
1322
+ "Northbridge"
1323
+ ]
1324
+ },
1325
+ {
1326
+ "connections": [],
1327
+ "location": "Delhi",
1328
+ "name": "Person 2",
1329
+ "org": "Northbridge",
1330
+ "user_id": "user_2",
1331
+ "work_history": [
1332
+ "Northbridge"
1333
+ ]
1334
+ },
1335
+ {
1336
+ "connections": [
1337
+ "user_0"
1338
+ ],
1339
+ "location": "Delhi",
1340
+ "name": "Person 3",
1341
+ "org": "Northbridge",
1342
+ "user_id": "user_3",
1343
+ "work_history": [
1344
+ "Northbridge"
1345
+ ]
1346
+ },
1347
+ {
1348
+ "connections": [
1349
+ "user_cyrus"
1350
+ ],
1351
+ "location": "Sector 9",
1352
+ "name": "Aria Sen",
1353
+ "org": "Helios Labs",
1354
+ "user_id": "user_aria",
1355
+ "work_history": [
1356
+ "Helios Labs"
1357
+ ]
1358
+ },
1359
+ {
1360
+ "connections": [
1361
+ "user_hiro",
1362
+ "event_project_lantern"
1363
+ ],
1364
+ "location": "Dockyard 17",
1365
+ "name": "Bharat Kulkarni",
1366
+ "org": "Northbridge Logistics",
1367
+ "user_id": "user_bharat",
1368
+ "work_history": [
1369
+ "Northbridge Logistics"
1370
+ ]
1371
+ },
1372
+ {
1373
+ "connections": [
1374
+ "user_gita"
1375
+ ],
1376
+ "location": "Old Town",
1377
+ "name": "Cyrus Mehta",
1378
+ "org": "Apex Dynamics",
1379
+ "user_id": "user_cyrus",
1380
+ "work_history": [
1381
+ "Apex Dynamics"
1382
+ ]
1383
+ },
1384
+ {
1385
+ "connections": [
1386
+ "user_elin",
1387
+ "user_ivy",
1388
+ "event_project_lantern"
1389
+ ],
1390
+ "location": "Old Town",
1391
+ "name": "Diya Roy",
1392
+ "org": "Blueharbor Media",
1393
+ "user_id": "user_diya",
1394
+ "work_history": [
1395
+ "Blueharbor Media"
1396
+ ]
1397
+ },
1398
+ {
1399
+ "connections": [
1400
+ "user_aria",
1401
+ "event_black_kite"
1402
+ ],
1403
+ "location": "Sector 9",
1404
+ "name": "Elin Das",
1405
+ "org": "Helios Labs",
1406
+ "user_id": "user_elin",
1407
+ "work_history": [
1408
+ "Helios Labs"
1409
+ ]
1410
+ },
1411
+ {
1412
+ "connections": [
1413
+ "user_diya",
1414
+ "event_project_lantern"
1415
+ ],
1416
+ "location": "Rivergate",
1417
+ "name": "Faris Noor",
1418
+ "org": "Tidewatch Ops",
1419
+ "user_id": "user_faris",
1420
+ "work_history": [
1421
+ "Tidewatch Ops"
1422
+ ]
1423
+ },
1424
+ {
1425
+ "connections": [
1426
+ "user_jules",
1427
+ "event_silent_current"
1428
+ ],
1429
+ "location": "Old Town",
1430
+ "name": "Gita Pradhan",
1431
+ "org": "Apex Dynamics",
1432
+ "user_id": "user_gita",
1433
+ "work_history": [
1434
+ "Apex Dynamics"
1435
+ ]
1436
+ },
1437
+ {
1438
+ "connections": [
1439
+ "user_faris",
1440
+ "event_project_lantern"
1441
+ ],
1442
+ "location": "Dockyard 17",
1443
+ "name": "Hiro Tan",
1444
+ "org": "Northbridge Logistics",
1445
+ "user_id": "user_hiro",
1446
+ "work_history": [
1447
+ "Northbridge Logistics"
1448
+ ]
1449
+ },
1450
+ {
1451
+ "connections": [
1452
+ "user_bharat",
1453
+ "user_elin",
1454
+ "event_black_kite"
1455
+ ],
1456
+ "location": "Rivergate",
1457
+ "name": "Ivy Kapoor",
1458
+ "org": "Kestrel Works",
1459
+ "user_id": "user_ivy",
1460
+ "work_history": [
1461
+ "Kestrel Works"
1462
+ ]
1463
+ },
1464
+ {
1465
+ "connections": [
1466
+ "user_bharat"
1467
+ ],
1468
+ "location": "Old Town",
1469
+ "name": "Jules Banerjee",
1470
+ "org": "Blueharbor Media",
1471
+ "user_id": "user_jules",
1472
+ "work_history": [
1473
+ "Blueharbor Media"
1474
+ ]
1475
+ }
1476
+ ]
1477
+ },
1478
+ "seed_file": "datasets/fixed_levels/seed_fixed_levels.json",
1479
+ "shared_config": "datasets/fixed_levels/shared_config_fixed_levels.json",
1480
+ "task_count": 15,
1481
+ "tasks": [
1482
+ {
1483
+ "answer": "user_ivy",
1484
+ "metadata": {
1485
+ "difficulty": "easy",
1486
+ "difficulty_level": 1,
1487
+ "question_id": "easy_01"
1488
+ },
1489
+ "question": "Which canonical user owns alias alias_orchidfox?",
1490
+ "supporting_edges": [
1491
+ {
1492
+ "confidence": 1.0,
1493
+ "dst": "user_ivy",
1494
+ "rel": "alias_of",
1495
+ "src": "alias_orchidfox"
1496
+ }
1497
+ ],
1498
+ "task_id": "seed_task_0",
1499
+ "task_type": "identity_resolution"
1500
+ },
1501
+ {
1502
+ "answer": "org_northbridge_logistics",
1503
+ "metadata": {
1504
+ "difficulty": "easy",
1505
+ "difficulty_level": 1,
1506
+ "question_id": "easy_02"
1507
+ },
1508
+ "question": "Which organization does user_bharat work at?",
1509
+ "supporting_edges": [
1510
+ {
1511
+ "confidence": 1.0,
1512
+ "dst": "org_northbridge_logistics",
1513
+ "rel": "works_at",
1514
+ "src": "user_bharat"
1515
+ }
1516
+ ],
1517
+ "task_id": "seed_task_1",
1518
+ "task_type": "entity_lookup"
1519
+ },
1520
+ {
1521
+ "answer": "user_bharat",
1522
+ "metadata": {
1523
+ "difficulty": "easy",
1524
+ "difficulty_level": 1,
1525
+ "question_id": "easy_03"
1526
+ },
1527
+ "question": "Who is directly connected to user_ivy in the logistics chain?",
1528
+ "supporting_edges": [
1529
+ {
1530
+ "confidence": 1.0,
1531
+ "dst": "user_bharat",
1532
+ "rel": "connected_to",
1533
+ "src": "user_ivy"
1534
+ }
1535
+ ],
1536
+ "task_id": "seed_task_2",
1537
+ "task_type": "network_discovery"
1538
+ },
1539
+ {
1540
+ "answer": "event_project_lantern",
1541
+ "metadata": {
1542
+ "difficulty": "easy",
1543
+ "difficulty_level": 1,
1544
+ "question_id": "easy_04"
1545
+ },
1546
+ "question": "Which event is discussed in thread thr_supply_leak?",
1547
+ "supporting_edges": [
1548
+ {
1549
+ "confidence": 1.0,
1550
+ "dst": "event_project_lantern",
1551
+ "rel": "discusses",
1552
+ "src": "thr_supply_leak"
1553
+ }
1554
+ ],
1555
+ "task_id": "seed_task_3",
1556
+ "task_type": "event_tracing"
1557
+ },
1558
+ {
1559
+ "answer": "loc_dockyard17",
1560
+ "metadata": {
1561
+ "difficulty": "easy",
1562
+ "difficulty_level": 1,
1563
+ "question_id": "easy_05"
1564
+ },
1565
+ "question": "Which location is referenced by post post_shift_roster?",
1566
+ "supporting_edges": [
1567
+ {
1568
+ "confidence": 1.0,
1569
+ "dst": "loc_dockyard17",
1570
+ "rel": "references",
1571
+ "src": "post_shift_roster"
1572
+ }
1573
+ ],
1574
+ "task_id": "seed_task_4",
1575
+ "task_type": "location_trace"
1576
+ },
1577
+ {
1578
+ "answer": "org_blueharbor_media",
1579
+ "metadata": {
1580
+ "difficulty": "mid",
1581
+ "difficulty_level": 2,
1582
+ "question_id": "mid_01"
1583
+ },
1584
+ "question": "Alias alias_monsoonbyte belongs to which user and where does that user work? Return only the organization node id.",
1585
+ "supporting_edges": [
1586
+ {
1587
+ "confidence": 1.0,
1588
+ "dst": "user_diya",
1589
+ "rel": "alias_of",
1590
+ "src": "alias_monsoonbyte"
1591
+ },
1592
+ {
1593
+ "confidence": 1.0,
1594
+ "dst": "org_blueharbor_media",
1595
+ "rel": "works_at",
1596
+ "src": "user_diya"
1597
+ }
1598
+ ],
1599
+ "task_id": "seed_task_5",
1600
+ "task_type": "identity_resolution"
1601
+ },
1602
+ {
1603
+ "answer": "user_diya",
1604
+ "metadata": {
1605
+ "difficulty": "mid",
1606
+ "difficulty_level": 2,
1607
+ "question_id": "mid_02"
1608
+ },
1609
+ "question": "Which user both authored thread thr_supply_leak and investigates event_project_lantern?",
1610
+ "supporting_edges": [
1611
+ {
1612
+ "confidence": 1.0,
1613
+ "dst": "thr_supply_leak",
1614
+ "rel": "authored_thread",
1615
+ "src": "user_diya"
1616
+ },
1617
+ {
1618
+ "confidence": 1.0,
1619
+ "dst": "event_project_lantern",
1620
+ "rel": "investigates",
1621
+ "src": "user_diya"
1622
+ }
1623
+ ],
1624
+ "task_id": "seed_task_6",
1625
+ "task_type": "event_tracing"
1626
+ },
1627
+ {
1628
+ "answer": "org_northbridge_logistics",
1629
+ "metadata": {
1630
+ "difficulty": "mid",
1631
+ "difficulty_level": 2,
1632
+ "question_id": "mid_03"
1633
+ },
1634
+ "question": "Which organization operates in the location referenced by post_midnight_manifest?",
1635
+ "supporting_edges": [
1636
+ {
1637
+ "confidence": 1.0,
1638
+ "dst": "loc_dockyard17",
1639
+ "rel": "references",
1640
+ "src": "post_midnight_manifest"
1641
+ },
1642
+ {
1643
+ "confidence": 1.0,
1644
+ "dst": "loc_dockyard17",
1645
+ "rel": "operates_in",
1646
+ "src": "org_northbridge_logistics"
1647
+ }
1648
+ ],
1649
+ "task_id": "seed_task_7",
1650
+ "task_type": "cross_platform_linking"
1651
+ },
1652
+ {
1653
+ "answer": "user_bharat",
1654
+ "metadata": {
1655
+ "difficulty": "mid",
1656
+ "difficulty_level": 2,
1657
+ "question_id": "mid_04"
1658
+ },
1659
+ "question": "user_ivy is directly connected to which collaborator on event_project_lantern?",
1660
+ "supporting_edges": [
1661
+ {
1662
+ "confidence": 1.0,
1663
+ "dst": "user_bharat",
1664
+ "rel": "connected_to",
1665
+ "src": "user_ivy"
1666
+ },
1667
+ {
1668
+ "confidence": 1.0,
1669
+ "dst": "event_project_lantern",
1670
+ "rel": "collaborates_on",
1671
+ "src": "user_bharat"
1672
+ }
1673
+ ],
1674
+ "task_id": "seed_task_8",
1675
+ "task_type": "network_discovery"
1676
+ },
1677
+ {
1678
+ "answer": "user_hiro",
1679
+ "metadata": {
1680
+ "difficulty": "mid",
1681
+ "difficulty_level": 2,
1682
+ "question_id": "mid_05"
1683
+ },
1684
+ "question": "Which canonical user is behind alias_docksparrow and collaborates on event_project_lantern?",
1685
+ "supporting_edges": [
1686
+ {
1687
+ "confidence": 1.0,
1688
+ "dst": "user_hiro",
1689
+ "rel": "alias_of",
1690
+ "src": "alias_docksparrow"
1691
+ },
1692
+ {
1693
+ "confidence": 1.0,
1694
+ "dst": "event_project_lantern",
1695
+ "rel": "collaborates_on",
1696
+ "src": "user_hiro"
1697
+ }
1698
+ ],
1699
+ "task_id": "seed_task_9",
1700
+ "task_type": "deanonymization"
1701
+ },
1702
+ {
1703
+ "answer": "user_bharat",
1704
+ "metadata": {
1705
+ "difficulty": "high",
1706
+ "difficulty_level": 3,
1707
+ "question_id": "high_01"
1708
+ },
1709
+ "question": "An alias authored post_midnight_manifest referencing loc_dockyard17; through a direct connection from that alias owner, which user collaborates on event_project_lantern?",
1710
+ "supporting_edges": [
1711
+ {
1712
+ "confidence": 1.0,
1713
+ "dst": "user_ivy",
1714
+ "rel": "alias_of",
1715
+ "src": "alias_orchidfox"
1716
+ },
1717
+ {
1718
+ "confidence": 1.0,
1719
+ "dst": "post_midnight_manifest",
1720
+ "rel": "authored_post",
1721
+ "src": "alias_orchidfox"
1722
+ },
1723
+ {
1724
+ "confidence": 1.0,
1725
+ "dst": "loc_dockyard17",
1726
+ "rel": "references",
1727
+ "src": "post_midnight_manifest"
1728
+ },
1729
+ {
1730
+ "confidence": 1.0,
1731
+ "dst": "user_bharat",
1732
+ "rel": "connected_to",
1733
+ "src": "user_ivy"
1734
+ },
1735
+ {
1736
+ "confidence": 1.0,
1737
+ "dst": "event_project_lantern",
1738
+ "rel": "collaborates_on",
1739
+ "src": "user_bharat"
1740
+ }
1741
+ ],
1742
+ "task_id": "seed_task_10",
1743
+ "task_type": "convoluted_trace"
1744
+ },
1745
+ {
1746
+ "answer": "user_hiro",
1747
+ "metadata": {
1748
+ "difficulty": "high",
1749
+ "difficulty_level": 3,
1750
+ "question_id": "high_02"
1751
+ },
1752
+ "question": "Thread thr_supply_leak references org_northbridge_logistics. Identify the canonical user behind alias_docksparrow who works there and collaborates on event_project_lantern.",
1753
+ "supporting_edges": [
1754
+ {
1755
+ "confidence": 1.0,
1756
+ "dst": "org_northbridge_logistics",
1757
+ "rel": "references",
1758
+ "src": "thr_supply_leak"
1759
+ },
1760
+ {
1761
+ "confidence": 1.0,
1762
+ "dst": "user_hiro",
1763
+ "rel": "alias_of",
1764
+ "src": "alias_docksparrow"
1765
+ },
1766
+ {
1767
+ "confidence": 1.0,
1768
+ "dst": "org_northbridge_logistics",
1769
+ "rel": "works_at",
1770
+ "src": "user_hiro"
1771
+ },
1772
+ {
1773
+ "confidence": 1.0,
1774
+ "dst": "event_project_lantern",
1775
+ "rel": "collaborates_on",
1776
+ "src": "user_hiro"
1777
+ }
1778
+ ],
1779
+ "task_id": "seed_task_11",
1780
+ "task_type": "convoluted_trace"
1781
+ },
1782
+ {
1783
+ "answer": "user_diya",
1784
+ "metadata": {
1785
+ "difficulty": "high",
1786
+ "difficulty_level": 3,
1787
+ "question_id": "high_03"
1788
+ },
1789
+ "question": "Cross-platform Black Kite linkage: alias_monsoonbyte authored post_drone_parts referencing event_black_kite. Which canonical user behind that alias is directly connected to the Kestrel Works collaborator on the same event?",
1790
+ "supporting_edges": [
1791
+ {
1792
+ "confidence": 1.0,
1793
+ "dst": "user_diya",
1794
+ "rel": "alias_of",
1795
+ "src": "alias_monsoonbyte"
1796
+ },
1797
+ {
1798
+ "confidence": 1.0,
1799
+ "dst": "post_drone_parts",
1800
+ "rel": "authored_post",
1801
+ "src": "alias_monsoonbyte"
1802
+ },
1803
+ {
1804
+ "confidence": 1.0,
1805
+ "dst": "event_black_kite",
1806
+ "rel": "references",
1807
+ "src": "post_drone_parts"
1808
+ },
1809
+ {
1810
+ "confidence": 1.0,
1811
+ "dst": "org_kestrel_works",
1812
+ "rel": "works_at",
1813
+ "src": "user_ivy"
1814
+ },
1815
+ {
1816
+ "confidence": 1.0,
1817
+ "dst": "event_black_kite",
1818
+ "rel": "collaborates_on",
1819
+ "src": "user_ivy"
1820
+ },
1821
+ {
1822
+ "confidence": 1.0,
1823
+ "dst": "user_ivy",
1824
+ "rel": "connected_to",
1825
+ "src": "user_diya"
1826
+ }
1827
+ ],
1828
+ "task_id": "seed_task_12",
1829
+ "task_type": "convoluted_trace"
1830
+ },
1831
+ {
1832
+ "answer": "user_faris",
1833
+ "metadata": {
1834
+ "difficulty": "high",
1835
+ "difficulty_level": 3,
1836
+ "question_id": "high_04"
1837
+ },
1838
+ "question": "A sat-phone ping alias references loc_rivergate. Which canonical user behind alias_nightrelay works at an organization operating there and collaborates on event_project_lantern?",
1839
+ "supporting_edges": [
1840
+ {
1841
+ "confidence": 1.0,
1842
+ "dst": "user_faris",
1843
+ "rel": "alias_of",
1844
+ "src": "alias_nightrelay"
1845
+ },
1846
+ {
1847
+ "confidence": 1.0,
1848
+ "dst": "post_sat_phone_ping",
1849
+ "rel": "authored_post",
1850
+ "src": "alias_nightrelay"
1851
+ },
1852
+ {
1853
+ "confidence": 1.0,
1854
+ "dst": "loc_rivergate",
1855
+ "rel": "references",
1856
+ "src": "post_sat_phone_ping"
1857
+ },
1858
+ {
1859
+ "confidence": 1.0,
1860
+ "dst": "org_tidewatch_ops",
1861
+ "rel": "works_at",
1862
+ "src": "user_faris"
1863
+ },
1864
+ {
1865
+ "confidence": 1.0,
1866
+ "dst": "loc_rivergate",
1867
+ "rel": "operates_in",
1868
+ "src": "org_tidewatch_ops"
1869
+ },
1870
+ {
1871
+ "confidence": 1.0,
1872
+ "dst": "event_project_lantern",
1873
+ "rel": "collaborates_on",
1874
+ "src": "user_faris"
1875
+ }
1876
+ ],
1877
+ "task_id": "seed_task_13",
1878
+ "task_type": "convoluted_trace"
1879
+ },
1880
+ {
1881
+ "answer": "user_ivy",
1882
+ "metadata": {
1883
+ "difficulty": "high",
1884
+ "difficulty_level": 3,
1885
+ "question_id": "high_05"
1886
+ },
1887
+ "question": "From thread thr_port_audit discussing event_black_kite and referencing org_kestrel_works, identify the canonical user whose alias authored post_midnight_manifest and who collaborates on event_black_kite.",
1888
+ "supporting_edges": [
1889
+ {
1890
+ "confidence": 1.0,
1891
+ "dst": "event_black_kite",
1892
+ "rel": "discusses",
1893
+ "src": "thr_port_audit"
1894
+ },
1895
+ {
1896
+ "confidence": 1.0,
1897
+ "dst": "org_kestrel_works",
1898
+ "rel": "references",
1899
+ "src": "thr_port_audit"
1900
+ },
1901
+ {
1902
+ "confidence": 1.0,
1903
+ "dst": "user_ivy",
1904
+ "rel": "alias_of",
1905
+ "src": "alias_orchidfox"
1906
+ },
1907
+ {
1908
+ "confidence": 1.0,
1909
+ "dst": "post_midnight_manifest",
1910
+ "rel": "authored_post",
1911
+ "src": "alias_orchidfox"
1912
+ },
1913
+ {
1914
+ "confidence": 1.0,
1915
+ "dst": "org_kestrel_works",
1916
+ "rel": "works_at",
1917
+ "src": "user_ivy"
1918
+ },
1919
+ {
1920
+ "confidence": 1.0,
1921
+ "dst": "event_black_kite",
1922
+ "rel": "collaborates_on",
1923
+ "src": "user_ivy"
1924
+ }
1925
+ ],
1926
+ "task_id": "seed_task_14",
1927
+ "task_type": "convoluted_trace"
1928
+ }
1929
+ ]
1930
+ }
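A minimal sketch of how the `tasks` array above could be sanity-checked, assuming each task's `answer` should appear as an endpoint (`src` or `dst`) of at least one of its `supporting_edges`. The two records are copied from the file above and trimmed to the fields used, so `task_count` is 2 here rather than 15. The helper names `check_tasks` and `difficulty_counts` are hypothetical and are not part of the repository's scripts.

```python
import json
from collections import Counter

# Two task records copied from the "tasks" array above, trimmed to the
# fields this sketch reads. task_count is set to 2 to match the sample.
SAMPLE = json.loads("""
{
  "task_count": 2,
  "tasks": [
    {
      "answer": "user_ivy",
      "metadata": {"difficulty": "easy", "difficulty_level": 1, "question_id": "easy_01"},
      "supporting_edges": [
        {"confidence": 1.0, "dst": "user_ivy", "rel": "alias_of", "src": "alias_orchidfox"}
      ],
      "task_id": "seed_task_0"
    },
    {
      "answer": "org_blueharbor_media",
      "metadata": {"difficulty": "mid", "difficulty_level": 2, "question_id": "mid_01"},
      "supporting_edges": [
        {"confidence": 1.0, "dst": "user_diya", "rel": "alias_of", "src": "alias_monsoonbyte"},
        {"confidence": 1.0, "dst": "org_blueharbor_media", "rel": "works_at", "src": "user_diya"}
      ],
      "task_id": "seed_task_5"
    }
  ]
}
""")


def check_tasks(dataset):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    tasks = dataset["tasks"]
    # The declared count should match the number of task records.
    if dataset.get("task_count") != len(tasks):
        problems.append("task_count does not match len(tasks)")
    for task in tasks:
        # Collect every node id that appears on either side of a supporting edge.
        endpoints = set()
        for edge in task["supporting_edges"]:
            endpoints.add(edge["src"])
            endpoints.add(edge["dst"])
        # Assumption: a well-formed task names its answer in the evidence.
        if task["answer"] not in endpoints:
            problems.append(f"{task['task_id']}: answer not in supporting edges")
    return problems


def difficulty_counts(dataset):
    """Count tasks per metadata.difficulty label."""
    return dict(Counter(t["metadata"]["difficulty"] for t in dataset["tasks"]))


if __name__ == "__main__":
    print(check_tasks(SAMPLE))
    print(difficulty_counts(SAMPLE))
```

To run the same checks on the full file, `SAMPLE` could be replaced with the result of `json.load` on the dataset path; whether the real records pass is not asserted here.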
datasets/fixed_levels/fixed_graph_questions.json ADDED
@@ -0,0 +1,1172 @@
1
+ {
2
+ "dataset_name": "fixed_levels_submission_set",
3
+ "difficulty_counts": {
4
+ "easy": 5,
5
+ "high": 5,
6
+ "mid": 5
7
+ },
8
+ "graph": {
9
+ "edge_count": 71,
10
+ "edges": [
11
+ {
12
+ "confidence": 1.0,
13
+ "dst": "user_ivy",
14
+ "rel": "alias_of",
15
+ "src": "alias_orchidfox"
16
+ },
17
+ {
18
+ "confidence": 1.0,
19
+ "dst": "user_bharat",
20
+ "rel": "alias_of",
21
+ "src": "alias_steelquill"
22
+ },
23
+ {
24
+ "confidence": 1.0,
25
+ "dst": "user_diya",
26
+ "rel": "alias_of",
27
+ "src": "alias_monsoonbyte"
28
+ },
29
+ {
30
+ "confidence": 1.0,
31
+ "dst": "user_faris",
32
+ "rel": "alias_of",
33
+ "src": "alias_nightrelay"
34
+ },
35
+ {
36
+ "confidence": 1.0,
37
+ "dst": "user_elin",
38
+ "rel": "alias_of",
39
+ "src": "alias_mapleghost"
40
+ },
41
+ {
42
+ "confidence": 1.0,
43
+ "dst": "user_hiro",
44
+ "rel": "alias_of",
45
+ "src": "alias_docksparrow"
46
+ },
47
+ {
48
+ "confidence": 1.0,
49
+ "dst": "user_cyrus",
50
+ "rel": "alias_of",
51
+ "src": "alias_quartzlotus"
52
+ },
53
+ {
54
+ "confidence": 1.0,
55
+ "dst": "org_helios_labs",
56
+ "rel": "works_at",
57
+ "src": "user_aria"
58
+ },
59
+ {
60
+ "confidence": 1.0,
61
+ "dst": "org_northbridge_logistics",
62
+ "rel": "works_at",
63
+ "src": "user_bharat"
64
+ },
65
+ {
66
+ "confidence": 1.0,
67
+ "dst": "org_apex_dynamics",
68
+ "rel": "works_at",
69
+ "src": "user_cyrus"
70
+ },
71
+ {
72
+ "confidence": 1.0,
73
+ "dst": "org_blueharbor_media",
74
+ "rel": "works_at",
75
+ "src": "user_diya"
76
+ },
77
+ {
78
+ "confidence": 1.0,
79
+ "dst": "org_helios_labs",
80
+ "rel": "works_at",
81
+ "src": "user_elin"
82
+ },
83
+ {
84
+ "confidence": 1.0,
85
+ "dst": "org_tidewatch_ops",
86
+ "rel": "works_at",
87
+ "src": "user_faris"
88
+ },
89
+ {
90
+ "confidence": 1.0,
91
+ "dst": "org_apex_dynamics",
92
+ "rel": "works_at",
93
+ "src": "user_gita"
94
+ },
95
+ {
96
+ "confidence": 1.0,
97
+ "dst": "org_northbridge_logistics",
98
+ "rel": "works_at",
99
+ "src": "user_hiro"
100
+ },
101
+ {
102
+ "confidence": 1.0,
103
+ "dst": "org_kestrel_works",
104
+ "rel": "works_at",
105
+ "src": "user_ivy"
106
+ },
107
+ {
108
+ "confidence": 1.0,
109
+ "dst": "org_blueharbor_media",
110
+ "rel": "works_at",
111
+ "src": "user_jules"
112
+ },
113
+ {
114
+ "confidence": 1.0,
115
+ "dst": "loc_sector9",
116
+ "rel": "located_in",
117
+ "src": "user_aria"
118
+ },
119
+ {
120
+ "confidence": 1.0,
121
+ "dst": "loc_dockyard17",
122
+ "rel": "located_in",
123
+ "src": "user_bharat"
124
+ },
125
+ {
126
+ "confidence": 1.0,
127
+ "dst": "loc_old_town",
128
+ "rel": "located_in",
129
+ "src": "user_cyrus"
130
+ },
131
+ {
132
+ "confidence": 1.0,
133
+ "dst": "loc_old_town",
134
+ "rel": "located_in",
135
+ "src": "user_diya"
136
+ },
137
+ {
138
+ "confidence": 1.0,
139
+ "dst": "loc_sector9",
140
+ "rel": "located_in",
141
+ "src": "user_elin"
142
+ },
143
+ {
144
+ "confidence": 1.0,
145
+ "dst": "loc_rivergate",
146
+ "rel": "located_in",
147
+ "src": "user_faris"
148
+ },
149
+ {
150
+ "confidence": 1.0,
151
+ "dst": "loc_old_town",
152
+ "rel": "located_in",
153
+ "src": "user_gita"
154
+ },
155
+ {
156
+ "confidence": 1.0,
157
+ "dst": "loc_dockyard17",
158
+ "rel": "located_in",
159
+ "src": "user_hiro"
160
+ },
161
+ {
162
+ "confidence": 1.0,
163
+ "dst": "loc_rivergate",
164
+ "rel": "located_in",
165
+ "src": "user_ivy"
166
+ },
167
+ {
168
+ "confidence": 1.0,
169
+ "dst": "loc_old_town",
170
+ "rel": "located_in",
171
+ "src": "user_jules"
172
+ },
173
+ {
174
+ "confidence": 1.0,
175
+ "dst": "loc_sector9",
176
+ "rel": "operates_in",
177
+ "src": "org_helios_labs"
178
+ },
179
+ {
180
+ "confidence": 1.0,
181
+ "dst": "loc_dockyard17",
182
+ "rel": "operates_in",
183
+ "src": "org_northbridge_logistics"
184
+ },
185
+ {
186
+ "confidence": 1.0,
187
+ "dst": "loc_old_town",
188
+ "rel": "operates_in",
189
+ "src": "org_apex_dynamics"
190
+ },
191
+ {
192
+ "confidence": 1.0,
193
+ "dst": "loc_old_town",
194
+ "rel": "operates_in",
195
+ "src": "org_blueharbor_media"
196
+ },
197
+ {
198
+ "confidence": 1.0,
199
+ "dst": "loc_rivergate",
200
+ "rel": "operates_in",
201
+ "src": "org_tidewatch_ops"
202
+ },
203
+ {
204
+ "confidence": 1.0,
205
+ "dst": "loc_rivergate",
206
+ "rel": "operates_in",
207
+ "src": "org_kestrel_works"
208
+ },
209
+ {
210
+ "confidence": 0.95,
211
+ "dst": "user_bharat",
212
+ "rel": "connected_to",
213
+ "src": "user_ivy"
214
+ },
215
+ {
216
+ "confidence": 0.95,
217
+ "dst": "user_hiro",
218
+ "rel": "connected_to",
219
+ "src": "user_bharat"
220
+ },
221
+ {
222
+ "confidence": 0.9,
223
+ "dst": "user_faris",
224
+ "rel": "connected_to",
225
+ "src": "user_hiro"
226
+ },
227
+ {
228
+ "confidence": 0.9,
229
+ "dst": "user_diya",
230
+ "rel": "connected_to",
231
+ "src": "user_faris"
232
+ },
233
+ {
234
+ "confidence": 0.88,
235
+ "dst": "user_elin",
236
+ "rel": "connected_to",
237
+ "src": "user_diya"
238
+ },
239
+ {
240
+ "confidence": 0.85,
241
+ "dst": "user_aria",
242
+ "rel": "connected_to",
243
+ "src": "user_elin"
244
+ },
245
+ {
246
+ "confidence": 0.82,
247
+ "dst": "user_cyrus",
248
+ "rel": "connected_to",
249
+ "src": "user_aria"
250
+ },
251
+ {
252
+ "confidence": 0.82,
253
+ "dst": "user_gita",
254
+ "rel": "connected_to",
255
+ "src": "user_cyrus"
256
+ },
257
+ {
258
+ "confidence": 0.8,
259
+ "dst": "user_jules",
260
+ "rel": "connected_to",
261
+ "src": "user_gita"
262
+ },
263
+ {
264
+ "confidence": 0.8,
265
+ "dst": "user_bharat",
266
+ "rel": "connected_to",
267
+ "src": "user_jules"
268
+ },
269
+ {
270
+ "confidence": 0.9,
271
+ "dst": "user_ivy",
272
+ "rel": "connected_to",
273
+ "src": "user_diya"
274
+ },
275
+ {
276
+ "confidence": 0.86,
277
+ "dst": "user_elin",
278
+ "rel": "connected_to",
279
+ "src": "user_ivy"
280
+ },
281
+ {
282
+ "confidence": 1.0,
283
+ "dst": "post_midnight_manifest",
284
+ "rel": "authored_post",
285
+ "src": "alias_orchidfox"
286
+ },
287
+ {
288
+ "confidence": 1.0,
289
+ "dst": "post_shift_roster",
290
+ "rel": "authored_post",
291
+ "src": "alias_docksparrow"
292
+ },
293
+ {
294
+ "confidence": 1.0,
295
+ "dst": "post_sat_phone_ping",
296
+ "rel": "authored_post",
297
+ "src": "alias_nightrelay"
298
+ },
299
+ {
300
+ "confidence": 1.0,
301
+ "dst": "post_drone_parts",
302
+ "rel": "authored_post",
303
+ "src": "alias_monsoonbyte"
304
+ },
305
+ {
306
+ "confidence": 1.0,
307
+ "dst": "post_relay_schedule",
308
+ "rel": "authored_post",
309
+ "src": "alias_steelquill"
310
+ },
311
+ {
312
+ "confidence": 1.0,
313
+ "dst": "loc_dockyard17",
314
+ "rel": "references",
315
+ "src": "post_midnight_manifest"
316
+ },
317
+ {
318
+ "confidence": 1.0,
319
+ "dst": "event_project_lantern",
320
+ "rel": "references",
321
+ "src": "post_midnight_manifest"
322
+ },
323
+ {
324
+ "confidence": 1.0,
325
+ "dst": "loc_dockyard17",
326
+ "rel": "references",
327
+ "src": "post_shift_roster"
328
+ },
329
+ {
330
+ "confidence": 1.0,
331
+ "dst": "loc_rivergate",
332
+ "rel": "references",
333
+ "src": "post_sat_phone_ping"
334
+ },
335
+ {
336
+ "confidence": 1.0,
337
+ "dst": "event_black_kite",
338
+ "rel": "references",
339
+ "src": "post_drone_parts"
340
+ },
341
+ {
342
+ "confidence": 1.0,
343
+ "dst": "event_project_lantern",
344
+ "rel": "references",
345
+ "src": "post_relay_schedule"
346
+ },
347
+ {
348
+ "confidence": 1.0,
349
+ "dst": "thr_supply_leak",
350
+ "rel": "authored_thread",
351
+ "src": "user_diya"
352
+ },
353
+ {
354
+ "confidence": 1.0,
355
+ "dst": "thr_port_audit",
356
+ "rel": "authored_thread",
357
+ "src": "user_jules"
358
+ },
359
+ {
360
+ "confidence": 1.0,
361
+ "dst": "event_project_lantern",
362
+ "rel": "discusses",
363
+ "src": "thr_supply_leak"
364
+ },
365
+ {
366
+ "confidence": 1.0,
367
+ "dst": "org_northbridge_logistics",
368
+ "rel": "references",
369
+ "src": "thr_supply_leak"
370
+ },
371
+ {
372
+ "confidence": 1.0,
373
+ "dst": "event_black_kite",
374
+ "rel": "discusses",
375
+ "src": "thr_port_audit"
376
+ },
377
+ {
378
+ "confidence": 1.0,
379
+ "dst": "org_kestrel_works",
380
+ "rel": "references",
381
+ "src": "thr_port_audit"
382
+ },
383
+ {
384
+ "confidence": 0.95,
385
+ "dst": "event_project_lantern",
386
+ "rel": "collaborates_on",
387
+ "src": "user_bharat"
388
+ },
389
+ {
390
+ "confidence": 0.95,
391
+ "dst": "event_project_lantern",
392
+ "rel": "collaborates_on",
393
+ "src": "user_hiro"
394
+ },
395
+ {
396
+ "confidence": 0.9,
397
+ "dst": "event_project_lantern",
398
+ "rel": "collaborates_on",
399
+ "src": "user_faris"
400
+ },
401
+ {
402
+ "confidence": 0.9,
403
+ "dst": "event_project_lantern",
404
+ "rel": "investigates",
405
+ "src": "user_diya"
406
+ },
407
+ {
408
+ "confidence": 0.94,
409
+ "dst": "event_black_kite",
410
+ "rel": "collaborates_on",
411
+ "src": "user_ivy"
412
+ },
413
+ {
414
+ "confidence": 0.9,
415
+ "dst": "event_black_kite",
416
+ "rel": "collaborates_on",
417
+ "src": "user_cyrus"
418
+ },
419
+ {
420
+ "confidence": 0.88,
421
+ "dst": "event_black_kite",
422
+ "rel": "investigates",
423
+ "src": "user_elin"
424
+ },
425
+ {
426
+ "confidence": 0.86,
427
+ "dst": "event_silent_current",
428
+ "rel": "monitors",
429
+ "src": "user_gita"
430
+ },
431
+ {
432
+ "confidence": 0.86,
433
+ "dst": "event_silent_current",
434
+ "rel": "reports_on",
435
+ "src": "user_jules"
436
+ }
437
+ ],
438
+ "node_count": 37,
439
+ "nodes": [
440
+ {
441
+ "attrs": {
442
+ "location": "Sector 9",
443
+ "name": "Aria Sen",
444
+ "org": "Helios Labs"
445
+ },
446
+ "node_id": "user_aria",
447
+ "node_type": "user"
448
+ },
449
+ {
450
+ "attrs": {
451
+ "location": "Dockyard 17",
452
+ "name": "Bharat Kulkarni",
453
+ "org": "Northbridge Logistics"
454
+ },
455
+ "node_id": "user_bharat",
456
+ "node_type": "user"
457
+ },
458
+ {
459
+ "attrs": {
460
+ "location": "Old Town",
461
+ "name": "Cyrus Mehta",
462
+ "org": "Apex Dynamics"
463
+ },
464
+ "node_id": "user_cyrus",
465
+ "node_type": "user"
466
+ },
467
+ {
468
+ "attrs": {
469
+ "location": "Old Town",
470
+ "name": "Diya Roy",
471
+ "org": "Blueharbor Media"
472
+ },
473
+ "node_id": "user_diya",
474
+ "node_type": "user"
475
+ },
476
+ {
477
+ "attrs": {
478
+ "location": "Sector 9",
479
+ "name": "Elin Das",
480
+ "org": "Helios Labs"
481
+ },
482
+ "node_id": "user_elin",
483
+ "node_type": "user"
484
+ },
485
+ {
486
+ "attrs": {
487
+ "location": "Rivergate",
488
+ "name": "Faris Noor",
489
+ "org": "Tidewatch Ops"
490
+ },
491
+ "node_id": "user_faris",
492
+ "node_type": "user"
493
+ },
494
+ {
495
+ "attrs": {
496
+ "location": "Old Town",
497
+ "name": "Gita Pradhan",
498
+ "org": "Apex Dynamics"
499
+ },
500
+ "node_id": "user_gita",
501
+ "node_type": "user"
502
+ },
503
+ {
504
+ "attrs": {
505
+ "location": "Dockyard 17",
506
+ "name": "Hiro Tan",
507
+ "org": "Northbridge Logistics"
508
+ },
509
+ "node_id": "user_hiro",
510
+ "node_type": "user"
511
+ },
512
+ {
513
+ "attrs": {
514
+ "location": "Rivergate",
515
+ "name": "Ivy Kapoor",
516
+ "org": "Kestrel Works"
517
+ },
518
+ "node_id": "user_ivy",
519
+ "node_type": "user"
520
+ },
521
+ {
522
+ "attrs": {
523
+ "location": "Old Town",
524
+ "name": "Jules Banerjee",
525
+ "org": "Blueharbor Media"
526
+ },
527
+ "node_id": "user_jules",
528
+ "node_type": "user"
529
+ },
530
+ {
531
+ "attrs": {
532
+ "handle": "@orchidfox"
533
+ },
534
+ "node_id": "alias_orchidfox",
535
+ "node_type": "alias"
536
+ },
537
+ {
538
+ "attrs": {
539
+ "handle": "@steelquill"
540
+ },
541
+ "node_id": "alias_steelquill",
542
+ "node_type": "alias"
543
+ },
544
+ {
545
+ "attrs": {
546
+ "handle": "@monsoonbyte"
547
+ },
548
+ "node_id": "alias_monsoonbyte",
549
+ "node_type": "alias"
550
+ },
551
+ {
552
+ "attrs": {
553
+ "handle": "@nightrelay"
554
+ },
555
+ "node_id": "alias_nightrelay",
556
+ "node_type": "alias"
557
+ },
558
+ {
559
+ "attrs": {
560
+ "handle": "@mapleghost"
561
+ },
562
+ "node_id": "alias_mapleghost",
563
+ "node_type": "alias"
564
+ },
565
+ {
566
+ "attrs": {
567
+ "handle": "@docksparrow"
568
+ },
569
+ "node_id": "alias_docksparrow",
570
+ "node_type": "alias"
571
+ },
572
+ {
573
+ "attrs": {
574
+ "handle": "@quartzlotus"
575
+ },
576
+ "node_id": "alias_quartzlotus",
577
+ "node_type": "alias"
578
+ },
579
+ {
580
+ "attrs": {
581
+ "name": "Helios Labs"
582
+ },
583
+ "node_id": "org_helios_labs",
584
+ "node_type": "org"
585
+ },
586
+ {
587
+ "attrs": {
588
+ "name": "Northbridge Logistics"
589
+ },
590
+ "node_id": "org_northbridge_logistics",
591
+ "node_type": "org"
592
+ },
593
+ {
594
+ "attrs": {
595
+ "name": "Apex Dynamics"
596
+ },
597
+ "node_id": "org_apex_dynamics",
598
+ "node_type": "org"
599
+ },
600
+ {
601
+ "attrs": {
602
+ "name": "Blueharbor Media"
603
+ },
604
+ "node_id": "org_blueharbor_media",
605
+ "node_type": "org"
606
+ },
607
+ {
608
+ "attrs": {
609
+ "name": "Tidewatch Ops"
610
+ },
611
+ "node_id": "org_tidewatch_ops",
612
+ "node_type": "org"
613
+ },
614
+ {
615
+ "attrs": {
616
+ "name": "Kestrel Works"
617
+ },
618
+ "node_id": "org_kestrel_works",
619
+ "node_type": "org"
620
+ },
621
+ {
622
+ "attrs": {
623
+ "name": "Dockyard 17"
624
+ },
625
+ "node_id": "loc_dockyard17",
626
+ "node_type": "location"
627
+ },
628
+ {
629
+ "attrs": {
630
+ "name": "Sector 9"
631
+ },
632
+ "node_id": "loc_sector9",
633
+ "node_type": "location"
634
+ },
635
+ {
636
+ "attrs": {
637
+ "name": "Old Town"
638
+ },
639
+ "node_id": "loc_old_town",
640
+ "node_type": "location"
641
+ },
642
+ {
643
+ "attrs": {
644
+ "name": "Rivergate"
645
+ },
646
+ "node_id": "loc_rivergate",
647
+ "node_type": "location"
648
+ },
649
+ {
650
+ "attrs": {
651
+ "name": "Project Lantern"
652
+ },
653
+ "node_id": "event_project_lantern",
654
+ "node_type": "event"
655
+ },
656
+ {
657
+ "attrs": {
658
+ "name": "Black Kite"
659
+ },
660
+ "node_id": "event_black_kite",
661
+ "node_type": "event"
662
+ },
663
+ {
664
+ "attrs": {
665
+ "name": "Silent Current"
666
+ },
667
+ "node_id": "event_silent_current",
668
+ "node_type": "event"
669
+ },
670
+ {
671
+ "attrs": {
672
+ "topic": "supply_chain"
673
+ },
674
+ "node_id": "thr_supply_leak",
675
+ "node_type": "thread"
676
+ },
677
+ {
678
+ "attrs": {
679
+ "topic": "port_audit"
680
+ },
681
+ "node_id": "thr_port_audit",
682
+ "node_type": "thread"
683
+ },
684
+ {
685
+ "attrs": {
686
+ "channel": "microblog"
687
+ },
688
+ "node_id": "post_shift_roster",
689
+ "node_type": "post"
690
+ },
691
+ {
692
+ "attrs": {
693
+ "channel": "microblog"
694
+ },
695
+ "node_id": "post_midnight_manifest",
696
+ "node_type": "post"
697
+ },
698
+ {
699
+ "attrs": {
700
+ "channel": "microblog"
701
+ },
702
+ "node_id": "post_sat_phone_ping",
703
+ "node_type": "post"
704
+ },
705
+ {
706
+ "attrs": {
707
+ "channel": "microblog"
708
+ },
709
+ "node_id": "post_drone_parts",
710
+ "node_type": "post"
711
+ },
712
+ {
713
+ "attrs": {
714
+ "channel": "microblog"
715
+ },
716
+ "node_id": "post_relay_schedule",
717
+ "node_type": "post"
718
+ }
719
+ ]
720
+ },
721
+ "question_count": 15,
722
+ "questions": [
723
+ {
724
+ "answer": "user_ivy",
725
+ "metadata": {
726
+ "difficulty": "easy",
727
+ "difficulty_level": 1,
728
+ "question_id": "easy_01"
729
+ },
730
+ "question": "Which canonical user owns alias alias_orchidfox?",
731
+ "supporting_edges": [
732
+ {
733
+ "confidence": 1.0,
734
+ "dst": "user_ivy",
735
+ "rel": "alias_of",
736
+ "src": "alias_orchidfox"
737
+ }
738
+ ],
739
+ "task_id": "fixed_task_00",
740
+ "task_type": "identity_resolution"
741
+ },
742
+ {
743
+ "answer": "org_northbridge_logistics",
744
+ "metadata": {
745
+ "difficulty": "easy",
746
+ "difficulty_level": 1,
747
+ "question_id": "easy_02"
748
+ },
749
+ "question": "Which organization does user_bharat work at?",
750
+ "supporting_edges": [
751
+ {
752
+ "confidence": 1.0,
753
+ "dst": "org_northbridge_logistics",
754
+ "rel": "works_at",
755
+ "src": "user_bharat"
756
+ }
757
+ ],
758
+ "task_id": "fixed_task_01",
759
+ "task_type": "entity_lookup"
760
+ },
761
+ {
762
+ "answer": "user_bharat",
763
+ "metadata": {
764
+ "difficulty": "easy",
765
+ "difficulty_level": 1,
766
+ "question_id": "easy_03"
767
+ },
768
+ "question": "Who is directly connected to user_ivy in the logistics chain?",
769
+ "supporting_edges": [
770
+ {
771
+ "confidence": 1.0,
772
+ "dst": "user_bharat",
773
+ "rel": "connected_to",
774
+ "src": "user_ivy"
775
+ }
776
+ ],
777
+ "task_id": "fixed_task_02",
778
+ "task_type": "network_discovery"
779
+ },
780
+ {
781
+ "answer": "event_project_lantern",
782
+ "metadata": {
783
+ "difficulty": "easy",
784
+ "difficulty_level": 1,
785
+ "question_id": "easy_04"
786
+ },
787
+ "question": "Which event is discussed in thread thr_supply_leak?",
788
+ "supporting_edges": [
789
+ {
790
+ "confidence": 1.0,
791
+ "dst": "event_project_lantern",
792
+ "rel": "discusses",
793
+ "src": "thr_supply_leak"
794
+ }
795
+ ],
796
+ "task_id": "fixed_task_03",
797
+ "task_type": "event_tracing"
798
+ },
799
+ {
800
+ "answer": "loc_dockyard17",
801
+ "metadata": {
802
+ "difficulty": "easy",
803
+ "difficulty_level": 1,
804
+ "question_id": "easy_05"
805
+ },
806
+ "question": "Which location is referenced by post post_shift_roster?",
807
+ "supporting_edges": [
808
+ {
809
+ "confidence": 1.0,
810
+ "dst": "loc_dockyard17",
811
+ "rel": "references",
812
+ "src": "post_shift_roster"
813
+ }
814
+ ],
815
+ "task_id": "fixed_task_04",
816
+ "task_type": "location_trace"
817
+ },
818
+ {
819
+ "answer": "org_blueharbor_media",
820
+ "metadata": {
821
+ "difficulty": "mid",
822
+ "difficulty_level": 2,
823
+ "question_id": "mid_01"
824
+ },
825
+ "question": "Alias alias_monsoonbyte belongs to which user and where does that user work? Return only the organization node id.",
826
+ "supporting_edges": [
827
+ {
828
+ "confidence": 1.0,
829
+ "dst": "user_diya",
830
+ "rel": "alias_of",
831
+ "src": "alias_monsoonbyte"
832
+ },
833
+ {
834
+ "confidence": 1.0,
835
+ "dst": "org_blueharbor_media",
836
+ "rel": "works_at",
837
+ "src": "user_diya"
838
+ }
839
+ ],
840
+ "task_id": "fixed_task_05",
841
+ "task_type": "identity_resolution"
842
+ },
843
+ {
844
+ "answer": "user_diya",
845
+ "metadata": {
846
+ "difficulty": "mid",
847
+ "difficulty_level": 2,
848
+ "question_id": "mid_02"
849
+ },
850
+ "question": "Which user both authored thread thr_supply_leak and investigates event_project_lantern?",
851
+ "supporting_edges": [
852
+ {
853
+ "confidence": 1.0,
854
+ "dst": "thr_supply_leak",
855
+ "rel": "authored_thread",
856
+ "src": "user_diya"
857
+ },
858
+ {
859
+ "confidence": 1.0,
860
+ "dst": "event_project_lantern",
861
+ "rel": "investigates",
862
+ "src": "user_diya"
863
+ }
864
+ ],
865
+ "task_id": "fixed_task_06",
866
+ "task_type": "event_tracing"
867
+ },
868
+ {
869
+ "answer": "org_northbridge_logistics",
870
+ "metadata": {
871
+ "difficulty": "mid",
872
+ "difficulty_level": 2,
873
+ "question_id": "mid_03"
874
+ },
875
+ "question": "Which organization operates in the location referenced by post_midnight_manifest?",
876
+ "supporting_edges": [
877
+ {
878
+ "confidence": 1.0,
879
+ "dst": "loc_dockyard17",
880
+ "rel": "references",
881
+ "src": "post_midnight_manifest"
882
+ },
883
+ {
884
+ "confidence": 1.0,
885
+ "dst": "loc_dockyard17",
886
+ "rel": "operates_in",
887
+ "src": "org_northbridge_logistics"
888
+ }
889
+ ],
890
+ "task_id": "fixed_task_07",
891
+ "task_type": "cross_platform_linking"
892
+ },
893
+ {
894
+ "answer": "user_bharat",
895
+ "metadata": {
896
+ "difficulty": "mid",
897
+ "difficulty_level": 2,
898
+ "question_id": "mid_04"
899
+ },
900
+ "question": "user_ivy is directly connected to which collaborator on event_project_lantern?",
901
+ "supporting_edges": [
902
+ {
903
+ "confidence": 1.0,
904
+ "dst": "user_bharat",
905
+ "rel": "connected_to",
906
+ "src": "user_ivy"
907
+ },
908
+ {
909
+ "confidence": 1.0,
910
+ "dst": "event_project_lantern",
911
+ "rel": "collaborates_on",
912
+ "src": "user_bharat"
913
+ }
914
+ ],
915
+ "task_id": "fixed_task_08",
916
+ "task_type": "network_discovery"
917
+ },
918
+ {
919
+ "answer": "user_hiro",
920
+ "metadata": {
921
+ "difficulty": "mid",
922
+ "difficulty_level": 2,
923
+ "question_id": "mid_05"
924
+ },
925
+ "question": "Which canonical user is behind alias_docksparrow and collaborates on event_project_lantern?",
926
+ "supporting_edges": [
927
+ {
928
+ "confidence": 1.0,
929
+ "dst": "user_hiro",
930
+ "rel": "alias_of",
931
+ "src": "alias_docksparrow"
932
+ },
933
+ {
934
+ "confidence": 1.0,
935
+ "dst": "event_project_lantern",
936
+ "rel": "collaborates_on",
937
+ "src": "user_hiro"
938
+ }
939
+ ],
940
+ "task_id": "fixed_task_09",
941
+ "task_type": "deanonymization"
942
+ },
943
+ {
944
+ "answer": "user_bharat",
945
+ "metadata": {
946
+ "difficulty": "high",
947
+ "difficulty_level": 3,
948
+ "question_id": "high_01"
949
+ },
950
+ "question": "An alias authored post_midnight_manifest referencing loc_dockyard17; through a direct connection from that alias owner, which user collaborates on event_project_lantern?",
951
+ "supporting_edges": [
952
+ {
953
+ "confidence": 1.0,
954
+ "dst": "user_ivy",
955
+ "rel": "alias_of",
956
+ "src": "alias_orchidfox"
957
+ },
958
+ {
959
+ "confidence": 1.0,
960
+ "dst": "post_midnight_manifest",
961
+ "rel": "authored_post",
962
+ "src": "alias_orchidfox"
963
+ },
964
+ {
965
+ "confidence": 1.0,
966
+ "dst": "loc_dockyard17",
967
+ "rel": "references",
968
+ "src": "post_midnight_manifest"
969
+ },
970
+ {
971
+ "confidence": 1.0,
972
+ "dst": "user_bharat",
973
+ "rel": "connected_to",
974
+ "src": "user_ivy"
975
+ },
976
+ {
977
+ "confidence": 1.0,
978
+ "dst": "event_project_lantern",
979
+ "rel": "collaborates_on",
980
+ "src": "user_bharat"
981
+ }
982
+ ],
983
+ "task_id": "fixed_task_10",
984
+ "task_type": "convoluted_trace"
985
+ },
986
+ {
987
+ "answer": "user_hiro",
988
+ "metadata": {
989
+ "difficulty": "high",
990
+ "difficulty_level": 3,
991
+ "question_id": "high_02"
992
+ },
993
+ "question": "Thread thr_supply_leak references org_northbridge_logistics. Identify the canonical user behind alias_docksparrow who works there and collaborates on event_project_lantern.",
994
+ "supporting_edges": [
995
+ {
996
+ "confidence": 1.0,
997
+ "dst": "org_northbridge_logistics",
998
+ "rel": "references",
999
+ "src": "thr_supply_leak"
1000
+ },
1001
+ {
1002
+ "confidence": 1.0,
1003
+ "dst": "user_hiro",
1004
+ "rel": "alias_of",
1005
+ "src": "alias_docksparrow"
1006
+ },
1007
+ {
1008
+ "confidence": 1.0,
1009
+ "dst": "org_northbridge_logistics",
1010
+ "rel": "works_at",
1011
+ "src": "user_hiro"
1012
+ },
1013
+ {
1014
+ "confidence": 1.0,
1015
+ "dst": "event_project_lantern",
1016
+ "rel": "collaborates_on",
1017
+ "src": "user_hiro"
1018
+ }
1019
+ ],
1020
+ "task_id": "fixed_task_11",
1021
+ "task_type": "convoluted_trace"
1022
+ },
1023
+ {
1024
+ "answer": "user_diya",
1025
+ "metadata": {
1026
+ "difficulty": "high",
1027
+ "difficulty_level": 3,
1028
+ "question_id": "high_03"
1029
+ },
1030
+ "question": "Cross-platform Black Kite linkage: alias_monsoonbyte authored post_drone_parts referencing event_black_kite. Which canonical user behind that alias is directly connected to the Kestrel Works collaborator on the same event?",
1031
+ "supporting_edges": [
1032
+ {
1033
+ "confidence": 1.0,
1034
+ "dst": "user_diya",
1035
+ "rel": "alias_of",
1036
+ "src": "alias_monsoonbyte"
1037
+ },
1038
+ {
1039
+ "confidence": 1.0,
1040
+ "dst": "post_drone_parts",
1041
+ "rel": "authored_post",
1042
+ "src": "alias_monsoonbyte"
1043
+ },
1044
+ {
1045
+ "confidence": 1.0,
1046
+ "dst": "event_black_kite",
1047
+ "rel": "references",
1048
+ "src": "post_drone_parts"
1049
+ },
1050
+ {
1051
+ "confidence": 1.0,
1052
+ "dst": "org_kestrel_works",
1053
+ "rel": "works_at",
1054
+ "src": "user_ivy"
1055
+ },
1056
+ {
1057
+ "confidence": 1.0,
1058
+ "dst": "event_black_kite",
1059
+ "rel": "collaborates_on",
1060
+ "src": "user_ivy"
1061
+ },
1062
+ {
1063
+ "confidence": 1.0,
1064
+ "dst": "user_ivy",
1065
+ "rel": "connected_to",
1066
+ "src": "user_diya"
1067
+ }
1068
+ ],
1069
+ "task_id": "fixed_task_12",
1070
+ "task_type": "convoluted_trace"
1071
+ },
1072
+ {
1073
+ "answer": "user_faris",
1074
+ "metadata": {
1075
+ "difficulty": "high",
1076
+ "difficulty_level": 3,
1077
+ "question_id": "high_04"
1078
+ },
1079
+ "question": "A sat-phone ping alias references loc_rivergate. Which canonical user behind alias_nightrelay works at an organization operating there and collaborates on event_project_lantern?",
1080
+ "supporting_edges": [
1081
+ {
1082
+ "confidence": 1.0,
1083
+ "dst": "user_faris",
1084
+ "rel": "alias_of",
1085
+ "src": "alias_nightrelay"
1086
+ },
1087
+ {
1088
+ "confidence": 1.0,
1089
+ "dst": "post_sat_phone_ping",
1090
+ "rel": "authored_post",
1091
+ "src": "alias_nightrelay"
1092
+ },
1093
+ {
1094
+ "confidence": 1.0,
1095
+ "dst": "loc_rivergate",
1096
+ "rel": "references",
1097
+ "src": "post_sat_phone_ping"
1098
+ },
1099
+ {
1100
+ "confidence": 1.0,
1101
+ "dst": "org_tidewatch_ops",
1102
+ "rel": "works_at",
1103
+ "src": "user_faris"
1104
+ },
1105
+ {
1106
+ "confidence": 1.0,
1107
+ "dst": "loc_rivergate",
1108
+ "rel": "operates_in",
1109
+ "src": "org_tidewatch_ops"
1110
+ },
1111
+ {
1112
+ "confidence": 1.0,
1113
+ "dst": "event_project_lantern",
1114
+ "rel": "collaborates_on",
1115
+ "src": "user_faris"
1116
+ }
1117
+ ],
1118
+ "task_id": "fixed_task_13",
1119
+ "task_type": "convoluted_trace"
1120
+ },
1121
+ {
1122
+ "answer": "user_ivy",
1123
+ "metadata": {
1124
+ "difficulty": "high",
1125
+ "difficulty_level": 3,
1126
+ "question_id": "high_05"
1127
+ },
1128
+ "question": "From thread thr_port_audit discussing event_black_kite and referencing org_kestrel_works, identify the canonical user whose alias authored post_midnight_manifest and who collaborates on event_black_kite.",
1129
+ "supporting_edges": [
1130
+ {
1131
+ "confidence": 1.0,
1132
+ "dst": "event_black_kite",
1133
+ "rel": "discusses",
1134
+ "src": "thr_port_audit"
1135
+ },
1136
+ {
1137
+ "confidence": 1.0,
1138
+ "dst": "org_kestrel_works",
1139
+ "rel": "references",
1140
+ "src": "thr_port_audit"
1141
+ },
1142
+ {
1143
+ "confidence": 1.0,
1144
+ "dst": "user_ivy",
1145
+ "rel": "alias_of",
1146
+ "src": "alias_orchidfox"
1147
+ },
1148
+ {
1149
+ "confidence": 1.0,
1150
+ "dst": "post_midnight_manifest",
1151
+ "rel": "authored_post",
1152
+ "src": "alias_orchidfox"
1153
+ },
1154
+ {
1155
+ "confidence": 1.0,
1156
+ "dst": "org_kestrel_works",
1157
+ "rel": "works_at",
1158
+ "src": "user_ivy"
1159
+ },
1160
+ {
1161
+ "confidence": 1.0,
1162
+ "dst": "event_black_kite",
1163
+ "rel": "collaborates_on",
1164
+ "src": "user_ivy"
1165
+ }
1166
+ ],
1167
+ "task_id": "fixed_task_14",
1168
+ "task_type": "convoluted_trace"
1169
+ }
1170
+ ],
1171
+ "source_seed": "datasets/fixed_levels/seed_fixed_levels.json"
1172
+ }
datasets/fixed_levels/leaderboard_fixed_levels.json ADDED
@@ -0,0 +1,43 @@
1
+ [
2
+ {
3
+ "config": {
4
+ "max_agents": 3,
5
+ "max_breadth": 2,
6
+ "max_depth": 2,
7
+ "max_steps": 20,
8
+ "max_width": 2,
9
+ "seed": 2026,
10
+ "seeded_questions": 15,
11
+ "swarm_enabled": true
12
+ },
13
+ "created_at": "2026-04-01T18:48:39+00:00",
14
+ "episodes": 15,
15
+ "metrics": {
16
+ "avg_compactness_reward": 0.0,
17
+ "avg_connectivity_gain_reward": 0.16666666666666666,
18
+ "avg_connectivity_reward": 0.16999999999999998,
19
+ "avg_diversity_reward": 0.1157777777777778,
20
+ "avg_entity_informativeness_reward": -0.08858065677817137,
21
+ "avg_format_reward": 0.14999999999999997,
22
+ "avg_graph_f1": 0.8492063492063492,
23
+ "avg_knowledge_carrier_reward": 0.5,
24
+ "avg_knowledge_indexing_reward": 0.052000000000000005,
25
+ "avg_relation_informativeness_reward": 0.07135858524047924,
26
+ "avg_reward": 4.197526826881651,
27
+ "avg_soft_shaping_reward": 0.24999999999999994,
28
+ "avg_spawn_count": 4.0,
29
+ "avg_spawn_critical_steps": 6.0,
30
+ "avg_steps_to_solution": 9.0,
31
+ "deanonymization_accuracy": 1.0,
32
+ "leaderboard_score": 0.8543934355282199,
33
+ "retrieval_signal": 0.6932,
34
+ "spawn_completion_rate": 1.0,
35
+ "spawn_signal": 0.6666666666666666,
36
+ "structural_signal": 0.5730889190257948,
37
+ "task_success_rate": 1.0,
38
+ "tool_efficiency": 0.5
39
+ },
40
+ "run_id": "run_0001",
41
+ "run_name": "fixed_levels_qwen_swarm"
42
+ }
43
+ ]
datasets/fixed_levels/qwen_swarm_benchmark_fixed_levels.json ADDED
@@ -0,0 +1,69 @@
1
+ {
2
+ "dashboard": "datasets/fixed_levels/dashboard_fixed_levels.html",
3
+ "record": {
4
+ "config": {
5
+ "max_agents": 3,
6
+ "max_breadth": 2,
7
+ "max_depth": 2,
8
+ "max_steps": 20,
9
+ "max_width": 2,
10
+ "seed": 2026,
11
+ "seeded_questions": 15,
12
+ "swarm_enabled": true
13
+ },
14
+ "created_at": "2026-04-01T18:48:39+00:00",
15
+ "episodes": 15,
16
+ "metrics": {
17
+ "avg_compactness_reward": 0.0,
18
+ "avg_connectivity_gain_reward": 0.16666666666666666,
19
+ "avg_connectivity_reward": 0.16999999999999998,
20
+ "avg_diversity_reward": 0.1157777777777778,
21
+ "avg_entity_informativeness_reward": -0.08858065677817137,
22
+ "avg_format_reward": 0.14999999999999997,
23
+ "avg_graph_f1": 0.8492063492063492,
24
+ "avg_knowledge_carrier_reward": 0.5,
25
+ "avg_knowledge_indexing_reward": 0.052000000000000005,
26
+ "avg_relation_informativeness_reward": 0.07135858524047924,
27
+ "avg_reward": 4.197526826881651,
28
+ "avg_soft_shaping_reward": 0.24999999999999994,
29
+ "avg_spawn_count": 4.0,
30
+ "avg_spawn_critical_steps": 6.0,
31
+ "avg_steps_to_solution": 9.0,
32
+ "deanonymization_accuracy": 1.0,
33
+ "leaderboard_score": 0.8543934355282199,
34
+ "retrieval_signal": 0.6932,
35
+ "spawn_completion_rate": 1.0,
36
+ "spawn_signal": 0.6666666666666666,
37
+ "structural_signal": 0.5730889190257948,
38
+ "task_success_rate": 1.0,
39
+ "tool_efficiency": 0.5
40
+ },
41
+ "run_id": "run_0001",
42
+ "run_name": "fixed_levels_qwen_swarm"
43
+ },
44
+ "summary": {
45
+ "avg_compactness_reward": 0.0,
46
+ "avg_connectivity_gain_reward": 0.16666666666666666,
47
+ "avg_connectivity_reward": 0.16999999999999998,
48
+ "avg_diversity_reward": 0.1157777777777778,
49
+ "avg_entity_informativeness_reward": -0.08858065677817137,
50
+ "avg_format_reward": 0.14999999999999997,
51
+ "avg_graph_f1": 0.8492063492063492,
52
+ "avg_knowledge_carrier_reward": 0.5,
53
+ "avg_knowledge_indexing_reward": 0.052000000000000005,
54
+ "avg_relation_informativeness_reward": 0.07135858524047924,
55
+ "avg_reward": 4.197526826881651,
56
+ "avg_soft_shaping_reward": 0.24999999999999994,
57
+ "avg_spawn_count": 4.0,
58
+ "avg_spawn_critical_steps": 6.0,
59
+ "avg_steps_to_solution": 9.0,
60
+ "deanonymization_accuracy": 1.0,
61
+ "leaderboard_score": 0.8543934355282199,
62
+ "retrieval_signal": 0.6932,
63
+ "spawn_completion_rate": 1.0,
64
+ "spawn_signal": 0.6666666666666666,
65
+ "structural_signal": 0.5730889190257948,
66
+ "task_success_rate": 1.0,
67
+ "tool_efficiency": 0.5
68
+ }
69
+ }
datasets/fixed_levels/qwen_swarm_eval_by_difficulty.json ADDED
@@ -0,0 +1,53 @@
1
+ {
2
+ "by_difficulty": {
3
+ "easy": {
4
+ "avg_graph_f1": 1.0,
5
+ "avg_reward": 3.610490808845623,
6
+ "avg_steps": 9.0,
7
+ "avg_tool_calls": 4.0,
8
+ "episodes": 5,
9
+ "task_success_rate": 1.0
10
+ },
11
+ "high": {
12
+ "avg_graph_f1": 0.5476190476190477,
13
+ "avg_reward": 4.207102815893519,
14
+ "avg_steps": 9.0,
15
+ "avg_tool_calls": 4.0,
16
+ "episodes": 5,
17
+ "task_success_rate": 1.0
18
+ },
19
+ "mid": {
20
+ "avg_graph_f1": 1.0,
21
+ "avg_reward": 4.822687547070801,
22
+ "avg_steps": 9.0,
23
+ "avg_tool_calls": 4.0,
24
+ "episodes": 5,
25
+ "task_success_rate": 1.0
26
+ }
27
+ },
28
+ "overall": {
29
+ "avg_compactness_reward": 0.0,
30
+ "avg_connectivity_gain_reward": 0.16666666666666666,
31
+ "avg_connectivity_reward": 0.16999999999999998,
32
+ "avg_diversity_reward": 0.1157777777777778,
33
+ "avg_entity_informativeness_reward": -0.07289878447762359,
34
+ "avg_format_reward": 0.14999999999999997,
35
+ "avg_graph_f1": 0.8492063492063492,
36
+ "avg_knowledge_carrier_reward": 0.5,
37
+ "avg_knowledge_indexing_reward": 0.052000000000000005,
38
+ "avg_relation_informativeness_reward": 0.07157694332826091,
39
+ "avg_reward": 4.213427057269981,
40
+ "avg_soft_shaping_reward": 0.24999999999999994,
41
+ "avg_spawn_count": 4.0,
42
+ "avg_spawn_critical_steps": 6.0,
43
+ "avg_steps_to_solution": 9.0,
44
+ "deanonymization_accuracy": 1.0,
45
+ "leaderboard_score": 0.8546911504342771,
46
+ "retrieval_signal": 0.6932,
47
+ "spawn_completion_rate": 1.0,
48
+ "spawn_signal": 0.6666666666666666,
49
+ "structural_signal": 0.5762689651034608,
50
+ "task_success_rate": 1.0,
51
+ "tool_efficiency": 0.5
52
+ }
53
+ }
datasets/fixed_levels/qwen_swarm_eval_fixed_levels.json ADDED
@@ -0,0 +1,25 @@
1
+ {
2
+ "avg_compactness_reward": 0.0,
3
+ "avg_connectivity_gain_reward": 0.16666666666666666,
4
+ "avg_connectivity_reward": 0.16999999999999998,
5
+ "avg_diversity_reward": 0.1157777777777778,
6
+ "avg_entity_informativeness_reward": -0.02824631570420193,
7
+ "avg_format_reward": 0.14999999999999997,
8
+ "avg_graph_f1": 0.8492063492063492,
9
+ "avg_knowledge_carrier_reward": 0.5,
10
+ "avg_knowledge_indexing_reward": 0.07400000000000001,
11
+ "avg_relation_informativeness_reward": 0.06905976285357758,
12
+ "avg_reward": 4.285384567790942,
13
+ "avg_soft_shaping_reward": 0.24999999999999994,
14
+ "avg_spawn_count": 4.0,
15
+ "avg_spawn_critical_steps": 6.0,
16
+ "avg_steps_to_solution": 9.0,
17
+ "deanonymization_accuracy": 1.0,
18
+ "leaderboard_score": 0.8565775118852701,
19
+ "retrieval_signal": 0.7009000000000001,
20
+ "spawn_completion_rate": 1.0,
21
+ "spawn_signal": 0.6666666666666666,
22
+ "structural_signal": 0.5846960227632085,
23
+ "task_success_rate": 1.0,
24
+ "tool_efficiency": 0.5
25
+ }
datasets/fixed_levels/seed_fixed_levels.json ADDED
@@ -0,0 +1,1113 @@
1
+ {
2
+ "seeding": {
3
+ "seeded_nodes": [
4
+ {
5
+ "node_id": "user_aria",
6
+ "node_type": "user",
7
+ "attrs": {
8
+ "name": "Aria Sen",
9
+ "org": "Helios Labs",
10
+ "location": "Sector 9"
11
+ }
12
+ },
13
+ {
14
+ "node_id": "user_bharat",
15
+ "node_type": "user",
16
+ "attrs": {
17
+ "name": "Bharat Kulkarni",
18
+ "org": "Northbridge Logistics",
19
+ "location": "Dockyard 17"
20
+ }
21
+ },
22
+ {
23
+ "node_id": "user_cyrus",
24
+ "node_type": "user",
25
+ "attrs": {
26
+ "name": "Cyrus Mehta",
27
+ "org": "Apex Dynamics",
28
+ "location": "Old Town"
29
+ }
30
+ },
31
+ {
32
+ "node_id": "user_diya",
33
+ "node_type": "user",
34
+ "attrs": {
35
+ "name": "Diya Roy",
36
+ "org": "Blueharbor Media",
37
+ "location": "Old Town"
38
+ }
39
+ },
40
+ {
41
+ "node_id": "user_elin",
42
+ "node_type": "user",
43
+ "attrs": {
44
+ "name": "Elin Das",
45
+ "org": "Helios Labs",
46
+ "location": "Sector 9"
47
+ }
48
+ },
49
+ {
50
+ "node_id": "user_faris",
51
+ "node_type": "user",
52
+ "attrs": {
53
+ "name": "Faris Noor",
54
+ "org": "Tidewatch Ops",
55
+ "location": "Rivergate"
56
+ }
57
+ },
58
+ {
59
+ "node_id": "user_gita",
60
+ "node_type": "user",
61
+ "attrs": {
62
+ "name": "Gita Pradhan",
63
+ "org": "Apex Dynamics",
64
+ "location": "Old Town"
65
+ }
66
+ },
67
+ {
68
+ "node_id": "user_hiro",
69
+ "node_type": "user",
70
+ "attrs": {
71
+ "name": "Hiro Tan",
72
+ "org": "Northbridge Logistics",
73
+ "location": "Dockyard 17"
74
+ }
75
+ },
76
+ {
77
+ "node_id": "user_ivy",
78
+ "node_type": "user",
79
+ "attrs": {
80
+ "name": "Ivy Kapoor",
81
+ "org": "Kestrel Works",
82
+ "location": "Rivergate"
83
+ }
84
+ },
85
+ {
86
+ "node_id": "user_jules",
87
+ "node_type": "user",
88
+ "attrs": {
89
+ "name": "Jules Banerjee",
90
+ "org": "Blueharbor Media",
91
+ "location": "Old Town"
92
+ }
93
+ },
94
+ {
95
+ "node_id": "alias_orchidfox",
96
+ "node_type": "alias",
97
+ "attrs": {
98
+ "handle": "@orchidfox"
99
+ }
100
+ },
101
+ {
102
+ "node_id": "alias_steelquill",
103
+ "node_type": "alias",
104
+ "attrs": {
105
+ "handle": "@steelquill"
106
+ }
107
+ },
108
+ {
109
+ "node_id": "alias_monsoonbyte",
110
+ "node_type": "alias",
111
+ "attrs": {
112
+ "handle": "@monsoonbyte"
113
+ }
114
+ },
115
+ {
116
+ "node_id": "alias_nightrelay",
117
+ "node_type": "alias",
118
+ "attrs": {
119
+ "handle": "@nightrelay"
120
+ }
121
+ },
122
+ {
123
+ "node_id": "alias_mapleghost",
124
+ "node_type": "alias",
125
+ "attrs": {
126
+ "handle": "@mapleghost"
127
+ }
128
+ },
129
+ {
130
+ "node_id": "alias_docksparrow",
131
+ "node_type": "alias",
132
+ "attrs": {
133
+ "handle": "@docksparrow"
134
+ }
135
+ },
136
+ {
137
+ "node_id": "alias_quartzlotus",
138
+ "node_type": "alias",
139
+ "attrs": {
140
+ "handle": "@quartzlotus"
141
+ }
142
+ },
143
+ {
144
+ "node_id": "org_helios_labs",
145
+ "node_type": "org",
146
+ "attrs": {
147
+ "name": "Helios Labs"
148
+ }
149
+ },
150
+ {
151
+ "node_id": "org_northbridge_logistics",
152
+ "node_type": "org",
153
+ "attrs": {
154
+ "name": "Northbridge Logistics"
155
+ }
156
+ },
157
+ {
158
+ "node_id": "org_apex_dynamics",
159
+ "node_type": "org",
160
+ "attrs": {
161
+ "name": "Apex Dynamics"
162
+ }
163
+ },
164
+ {
165
+ "node_id": "org_blueharbor_media",
166
+ "node_type": "org",
167
+ "attrs": {
168
+ "name": "Blueharbor Media"
169
+ }
170
+ },
171
+ {
172
+ "node_id": "org_tidewatch_ops",
173
+ "node_type": "org",
174
+ "attrs": {
175
+ "name": "Tidewatch Ops"
176
+ }
177
+ },
178
+ {
179
+ "node_id": "org_kestrel_works",
180
+ "node_type": "org",
181
+ "attrs": {
182
+ "name": "Kestrel Works"
183
+ }
184
+ },
185
+ {
186
+ "node_id": "loc_dockyard17",
187
+ "node_type": "location",
188
+ "attrs": {
189
+ "name": "Dockyard 17"
190
+ }
191
+ },
192
+ {
193
+ "node_id": "loc_sector9",
194
+ "node_type": "location",
195
+ "attrs": {
196
+ "name": "Sector 9"
197
+ }
198
+ },
199
+ {
200
+ "node_id": "loc_old_town",
201
+ "node_type": "location",
202
+ "attrs": {
203
+ "name": "Old Town"
204
+ }
205
+ },
206
+ {
207
+ "node_id": "loc_rivergate",
208
+ "node_type": "location",
209
+ "attrs": {
210
+ "name": "Rivergate"
211
+ }
212
+ },
213
+ {
214
+ "node_id": "event_project_lantern",
215
+ "node_type": "event",
216
+ "attrs": {
217
+ "name": "Project Lantern"
218
+ }
219
+ },
220
+ {
221
+ "node_id": "event_black_kite",
222
+ "node_type": "event",
223
+ "attrs": {
224
+ "name": "Black Kite"
225
+ }
226
+ },
227
+ {
228
+ "node_id": "event_silent_current",
229
+ "node_type": "event",
230
+ "attrs": {
231
+ "name": "Silent Current"
232
+ }
233
+ },
234
+ {
235
+ "node_id": "thr_supply_leak",
236
+ "node_type": "thread",
237
+ "attrs": {
238
+ "topic": "supply_chain"
239
+ }
240
+ },
241
+ {
242
+ "node_id": "thr_port_audit",
243
+ "node_type": "thread",
244
+ "attrs": {
245
+ "topic": "port_audit"
246
+ }
247
+ },
248
+ {
249
+ "node_id": "post_shift_roster",
250
+ "node_type": "post",
251
+ "attrs": {
252
+ "channel": "microblog"
253
+ }
254
+ },
255
+ {
256
+ "node_id": "post_midnight_manifest",
257
+ "node_type": "post",
258
+ "attrs": {
259
+ "channel": "microblog"
260
+ }
261
+ },
262
+ {
263
+ "node_id": "post_sat_phone_ping",
264
+ "node_type": "post",
265
+ "attrs": {
266
+ "channel": "microblog"
267
+ }
268
+ },
269
+ {
270
+ "node_id": "post_drone_parts",
271
+ "node_type": "post",
272
+ "attrs": {
273
+ "channel": "microblog"
274
+ }
275
+ },
276
+ {
277
+ "node_id": "post_relay_schedule",
278
+ "node_type": "post",
279
+ "attrs": {
280
+ "channel": "microblog"
281
+ }
282
+ }
283
+ ],
284
+ "seeded_edges": [
285
+ {
286
+ "src": "alias_orchidfox",
287
+ "rel": "alias_of",
288
+ "dst": "user_ivy",
289
+ "confidence": 1.0
290
+ },
291
+ {
292
+ "src": "alias_steelquill",
293
+ "rel": "alias_of",
294
+ "dst": "user_bharat",
295
+ "confidence": 1.0
296
+ },
297
+ {
298
+ "src": "alias_monsoonbyte",
299
+ "rel": "alias_of",
300
+ "dst": "user_diya",
301
+ "confidence": 1.0
302
+ },
303
+ {
304
+ "src": "alias_nightrelay",
305
+ "rel": "alias_of",
306
+ "dst": "user_faris",
307
+ "confidence": 1.0
308
+ },
309
+ {
310
+ "src": "alias_mapleghost",
311
+ "rel": "alias_of",
312
+ "dst": "user_elin",
313
+ "confidence": 1.0
314
+ },
315
+ {
316
+ "src": "alias_docksparrow",
317
+ "rel": "alias_of",
318
+ "dst": "user_hiro",
319
+ "confidence": 1.0
320
+ },
321
+ {
322
+ "src": "alias_quartzlotus",
323
+ "rel": "alias_of",
324
+ "dst": "user_cyrus",
325
+ "confidence": 1.0
326
+ },
327
+ {
328
+ "src": "user_aria",
329
+ "rel": "works_at",
330
+ "dst": "org_helios_labs",
331
+ "confidence": 1.0
332
+ },
333
+ {
334
+ "src": "user_bharat",
335
+ "rel": "works_at",
336
+ "dst": "org_northbridge_logistics",
337
+ "confidence": 1.0
338
+ },
339
+ {
340
+ "src": "user_cyrus",
341
+ "rel": "works_at",
342
+ "dst": "org_apex_dynamics",
343
+ "confidence": 1.0
344
+ },
345
+ {
346
+ "src": "user_diya",
347
+ "rel": "works_at",
348
+ "dst": "org_blueharbor_media",
349
+ "confidence": 1.0
350
+ },
351
+ {
352
+ "src": "user_elin",
353
+ "rel": "works_at",
354
+ "dst": "org_helios_labs",
355
+ "confidence": 1.0
356
+ },
357
+ {
358
+ "src": "user_faris",
359
+ "rel": "works_at",
360
+ "dst": "org_tidewatch_ops",
361
+ "confidence": 1.0
362
+ },
363
+ {
364
+ "src": "user_gita",
365
+ "rel": "works_at",
366
+ "dst": "org_apex_dynamics",
367
+ "confidence": 1.0
368
+ },
369
+ {
370
+ "src": "user_hiro",
371
+ "rel": "works_at",
372
+ "dst": "org_northbridge_logistics",
373
+ "confidence": 1.0
374
+ },
375
+ {
376
+ "src": "user_ivy",
377
+ "rel": "works_at",
378
+ "dst": "org_kestrel_works",
379
+ "confidence": 1.0
380
+ },
381
+ {
382
+ "src": "user_jules",
383
+ "rel": "works_at",
384
+ "dst": "org_blueharbor_media",
385
+ "confidence": 1.0
386
+ },
387
+ {
388
+ "src": "user_aria",
389
+ "rel": "located_in",
390
+ "dst": "loc_sector9",
391
+ "confidence": 1.0
392
+ },
393
+ {
394
+ "src": "user_bharat",
395
+ "rel": "located_in",
396
+ "dst": "loc_dockyard17",
397
+ "confidence": 1.0
398
+ },
399
+ {
400
+ "src": "user_cyrus",
401
+ "rel": "located_in",
402
+ "dst": "loc_old_town",
403
+ "confidence": 1.0
404
+ },
405
+ {
406
+ "src": "user_diya",
407
+ "rel": "located_in",
408
+ "dst": "loc_old_town",
409
+ "confidence": 1.0
410
+ },
411
+ {
412
+ "src": "user_elin",
413
+ "rel": "located_in",
414
+ "dst": "loc_sector9",
415
+ "confidence": 1.0
416
+ },
417
+ {
418
+ "src": "user_faris",
419
+ "rel": "located_in",
420
+ "dst": "loc_rivergate",
421
+ "confidence": 1.0
422
+ },
423
+ {
424
+ "src": "user_gita",
425
+ "rel": "located_in",
426
+ "dst": "loc_old_town",
427
+ "confidence": 1.0
428
+ },
429
+ {
430
+ "src": "user_hiro",
431
+ "rel": "located_in",
432
+ "dst": "loc_dockyard17",
433
+ "confidence": 1.0
434
+ },
435
+ {
436
+ "src": "user_ivy",
437
+ "rel": "located_in",
438
+ "dst": "loc_rivergate",
439
+ "confidence": 1.0
440
+ },
441
+ {
442
+ "src": "user_jules",
443
+ "rel": "located_in",
444
+ "dst": "loc_old_town",
445
+ "confidence": 1.0
446
+ },
447
+ {
448
+ "src": "org_helios_labs",
449
+ "rel": "operates_in",
450
+ "dst": "loc_sector9",
451
+ "confidence": 1.0
452
+ },
453
+ {
454
+ "src": "org_northbridge_logistics",
455
+ "rel": "operates_in",
456
+ "dst": "loc_dockyard17",
457
+ "confidence": 1.0
458
+ },
459
+ {
460
+ "src": "org_apex_dynamics",
461
+ "rel": "operates_in",
462
+ "dst": "loc_old_town",
463
+ "confidence": 1.0
464
+ },
465
+ {
466
+ "src": "org_blueharbor_media",
467
+ "rel": "operates_in",
468
+ "dst": "loc_old_town",
469
+ "confidence": 1.0
470
+ },
471
+ {
472
+ "src": "org_tidewatch_ops",
473
+ "rel": "operates_in",
474
+ "dst": "loc_rivergate",
475
+ "confidence": 1.0
476
+ },
477
+ {
478
+ "src": "org_kestrel_works",
479
+ "rel": "operates_in",
480
+ "dst": "loc_rivergate",
481
+ "confidence": 1.0
482
+ },
483
+ {
484
+ "src": "user_ivy",
485
+ "rel": "connected_to",
486
+ "dst": "user_bharat",
487
+ "confidence": 0.95
488
+ },
489
+ {
490
+ "src": "user_bharat",
491
+ "rel": "connected_to",
492
+ "dst": "user_hiro",
493
+ "confidence": 0.95
494
+ },
495
+ {
496
+ "src": "user_hiro",
497
+ "rel": "connected_to",
498
+ "dst": "user_faris",
499
+ "confidence": 0.9
500
+ },
501
+ {
502
+ "src": "user_faris",
503
+ "rel": "connected_to",
504
+ "dst": "user_diya",
505
+ "confidence": 0.9
506
+ },
507
+ {
508
+ "src": "user_diya",
509
+ "rel": "connected_to",
510
+ "dst": "user_elin",
511
+ "confidence": 0.88
512
+ },
513
+ {
514
+ "src": "user_elin",
515
+ "rel": "connected_to",
516
+ "dst": "user_aria",
517
+ "confidence": 0.85
518
+ },
519
+ {
520
+ "src": "user_aria",
521
+ "rel": "connected_to",
522
+ "dst": "user_cyrus",
523
+ "confidence": 0.82
524
+ },
525
+ {
526
+ "src": "user_cyrus",
527
+ "rel": "connected_to",
528
+ "dst": "user_gita",
529
+ "confidence": 0.82
530
+ },
531
+ {
532
+ "src": "user_gita",
533
+ "rel": "connected_to",
534
+ "dst": "user_jules",
535
+ "confidence": 0.8
536
+ },
537
+ {
538
+ "src": "user_jules",
539
+ "rel": "connected_to",
540
+ "dst": "user_bharat",
541
+ "confidence": 0.8
542
+ },
543
+ {
544
+ "src": "user_diya",
545
+ "rel": "connected_to",
546
+ "dst": "user_ivy",
547
+ "confidence": 0.9
548
+ },
549
+ {
550
+ "src": "user_ivy",
551
+ "rel": "connected_to",
552
+ "dst": "user_elin",
553
+ "confidence": 0.86
554
+ },
555
+ {
556
+ "src": "alias_orchidfox",
557
+ "rel": "authored_post",
558
+ "dst": "post_midnight_manifest",
559
+ "confidence": 1.0
560
+ },
561
+ {
562
+ "src": "alias_docksparrow",
563
+ "rel": "authored_post",
564
+ "dst": "post_shift_roster",
565
+ "confidence": 1.0
566
+ },
567
+ {
568
+ "src": "alias_nightrelay",
569
+ "rel": "authored_post",
570
+ "dst": "post_sat_phone_ping",
571
+ "confidence": 1.0
572
+ },
573
+ {
574
+ "src": "alias_monsoonbyte",
575
+ "rel": "authored_post",
576
+ "dst": "post_drone_parts",
577
+ "confidence": 1.0
578
+ },
579
+ {
580
+ "src": "alias_steelquill",
581
+ "rel": "authored_post",
582
+ "dst": "post_relay_schedule",
583
+ "confidence": 1.0
584
+ },
585
+ {
586
+ "src": "post_midnight_manifest",
587
+ "rel": "references",
588
+ "dst": "loc_dockyard17",
589
+ "confidence": 1.0
590
+ },
591
+ {
592
+ "src": "post_midnight_manifest",
593
+ "rel": "references",
594
+ "dst": "event_project_lantern",
595
+ "confidence": 1.0
596
+ },
597
+ {
598
+ "src": "post_shift_roster",
599
+ "rel": "references",
600
+ "dst": "loc_dockyard17",
601
+ "confidence": 1.0
602
+ },
603
+ {
604
+ "src": "post_sat_phone_ping",
605
+ "rel": "references",
606
+ "dst": "loc_rivergate",
607
+ "confidence": 1.0
608
+ },
609
+ {
610
+ "src": "post_drone_parts",
611
+ "rel": "references",
612
+ "dst": "event_black_kite",
613
+ "confidence": 1.0
614
+ },
615
+ {
616
+ "src": "post_relay_schedule",
617
+ "rel": "references",
618
+ "dst": "event_project_lantern",
619
+ "confidence": 1.0
620
+ },
621
+ {
622
+ "src": "user_diya",
623
+ "rel": "authored_thread",
624
+ "dst": "thr_supply_leak",
625
+ "confidence": 1.0
626
+ },
627
+ {
628
+ "src": "user_jules",
629
+ "rel": "authored_thread",
630
+ "dst": "thr_port_audit",
631
+ "confidence": 1.0
632
+ },
633
+ {
634
+ "src": "thr_supply_leak",
635
+ "rel": "discusses",
636
+ "dst": "event_project_lantern",
637
+ "confidence": 1.0
638
+ },
639
+ {
640
+ "src": "thr_supply_leak",
641
+ "rel": "references",
642
+ "dst": "org_northbridge_logistics",
643
+ "confidence": 1.0
644
+ },
645
+ {
646
+ "src": "thr_port_audit",
647
+ "rel": "discusses",
648
+ "dst": "event_black_kite",
649
+ "confidence": 1.0
650
+ },
651
+ {
652
+ "src": "thr_port_audit",
653
+ "rel": "references",
654
+ "dst": "org_kestrel_works",
655
+ "confidence": 1.0
656
+ },
657
+ {
658
+ "src": "user_bharat",
659
+ "rel": "collaborates_on",
660
+ "dst": "event_project_lantern",
661
+ "confidence": 0.95
662
+ },
663
+ {
664
+ "src": "user_hiro",
665
+ "rel": "collaborates_on",
666
+ "dst": "event_project_lantern",
667
+ "confidence": 0.95
668
+ },
669
+ {
670
+ "src": "user_faris",
671
+ "rel": "collaborates_on",
672
+ "dst": "event_project_lantern",
673
+ "confidence": 0.9
674
+ },
675
+ {
676
+ "src": "user_diya",
677
+ "rel": "investigates",
678
+ "dst": "event_project_lantern",
679
+ "confidence": 0.9
680
+ },
681
+ {
682
+ "src": "user_ivy",
683
+ "rel": "collaborates_on",
684
+ "dst": "event_black_kite",
685
+ "confidence": 0.94
686
+ },
687
+ {
688
+ "src": "user_cyrus",
689
+ "rel": "collaborates_on",
690
+ "dst": "event_black_kite",
691
+ "confidence": 0.9
692
+ },
693
+ {
694
+ "src": "user_elin",
695
+ "rel": "investigates",
696
+ "dst": "event_black_kite",
697
+ "confidence": 0.88
698
+ },
699
+ {
700
+ "src": "user_gita",
701
+ "rel": "monitors",
702
+ "dst": "event_silent_current",
703
+ "confidence": 0.86
704
+ },
705
+ {
706
+ "src": "user_jules",
707
+ "rel": "reports_on",
708
+ "dst": "event_silent_current",
709
+ "confidence": 0.86
710
+ }
711
+ ],
712
+ "seeded_questions": [
713
+ {
714
+ "task_type": "identity_resolution",
715
+ "question": "Which canonical user owns alias alias_orchidfox?",
716
+ "answer": "user_ivy",
717
+ "supporting_edges": [
718
+ {
719
+ "src": "alias_orchidfox",
720
+ "rel": "alias_of",
721
+ "dst": "user_ivy"
722
+ }
723
+ ],
724
+ "metadata": {
725
+ "difficulty": "easy",
726
+ "difficulty_level": 1,
727
+ "question_id": "easy_01"
728
+ }
729
+ },
730
+ {
731
+ "task_type": "entity_lookup",
732
+ "question": "Which organization does user_bharat work at?",
733
+ "answer": "org_northbridge_logistics",
734
+ "supporting_edges": [
735
+ {
736
+ "src": "user_bharat",
737
+ "rel": "works_at",
738
+ "dst": "org_northbridge_logistics"
739
+ }
740
+ ],
741
+ "metadata": {
742
+ "difficulty": "easy",
743
+ "difficulty_level": 1,
744
+ "question_id": "easy_02"
745
+ }
746
+ },
747
+ {
748
+ "task_type": "network_discovery",
749
+ "question": "Who is directly connected to user_ivy in the logistics chain?",
750
+ "answer": "user_bharat",
751
+ "supporting_edges": [
752
+ {
753
+ "src": "user_ivy",
754
+ "rel": "connected_to",
755
+ "dst": "user_bharat"
756
+ }
757
+ ],
758
+ "metadata": {
759
+ "difficulty": "easy",
760
+ "difficulty_level": 1,
761
+ "question_id": "easy_03"
762
+ }
763
+ },
764
+ {
765
+ "task_type": "event_tracing",
766
+ "question": "Which event is discussed in thread thr_supply_leak?",
767
+ "answer": "event_project_lantern",
768
+ "supporting_edges": [
769
+ {
770
+ "src": "thr_supply_leak",
771
+ "rel": "discusses",
772
+ "dst": "event_project_lantern"
773
+ }
774
+ ],
775
+ "metadata": {
776
+ "difficulty": "easy",
777
+ "difficulty_level": 1,
778
+ "question_id": "easy_04"
779
+ }
780
+ },
781
+ {
782
+ "task_type": "location_trace",
783
+ "question": "Which location is referenced by post post_shift_roster?",
784
+ "answer": "loc_dockyard17",
785
+ "supporting_edges": [
786
+ {
787
+ "src": "post_shift_roster",
788
+ "rel": "references",
789
+ "dst": "loc_dockyard17"
790
+ }
791
+ ],
792
+ "metadata": {
793
+ "difficulty": "easy",
794
+ "difficulty_level": 1,
795
+ "question_id": "easy_05"
796
+ }
797
+ },
798
+ {
799
+ "task_type": "identity_resolution",
800
+ "question": "Alias alias_monsoonbyte belongs to which user and where does that user work? Return only the organization node id.",
801
+ "answer": "org_blueharbor_media",
802
+ "supporting_edges": [
803
+ {
804
+ "src": "alias_monsoonbyte",
805
+ "rel": "alias_of",
806
+ "dst": "user_diya"
807
+ },
808
+ {
809
+ "src": "user_diya",
810
+ "rel": "works_at",
811
+ "dst": "org_blueharbor_media"
812
+ }
813
+ ],
814
+ "metadata": {
815
+ "difficulty": "mid",
816
+ "difficulty_level": 2,
817
+ "question_id": "mid_01"
818
+ }
819
+ },
820
+ {
821
+ "task_type": "event_tracing",
822
+ "question": "Which user both authored thread thr_supply_leak and investigates event_project_lantern?",
823
+ "answer": "user_diya",
824
+ "supporting_edges": [
825
+ {
826
+ "src": "user_diya",
827
+ "rel": "authored_thread",
828
+ "dst": "thr_supply_leak"
829
+ },
830
+ {
831
+ "src": "user_diya",
832
+ "rel": "investigates",
833
+ "dst": "event_project_lantern"
834
+ }
835
+ ],
836
+ "metadata": {
837
+ "difficulty": "mid",
838
+ "difficulty_level": 2,
839
+ "question_id": "mid_02"
840
+ }
841
+ },
842
+ {
843
+ "task_type": "cross_platform_linking",
844
+ "question": "Which organization operates in the location referenced by post_midnight_manifest?",
845
+ "answer": "org_northbridge_logistics",
846
+ "supporting_edges": [
847
+ {
848
+ "src": "post_midnight_manifest",
849
+ "rel": "references",
850
+ "dst": "loc_dockyard17"
851
+ },
852
+ {
853
+ "src": "org_northbridge_logistics",
854
+ "rel": "operates_in",
855
+ "dst": "loc_dockyard17"
856
+ }
857
+ ],
858
+ "metadata": {
859
+ "difficulty": "mid",
860
+ "difficulty_level": 2,
861
+ "question_id": "mid_03"
862
+ }
863
+ },
864
+ {
865
+ "task_type": "network_discovery",
866
+ "question": "user_ivy is directly connected to which collaborator on event_project_lantern?",
867
+ "answer": "user_bharat",
868
+ "supporting_edges": [
869
+ {
870
+ "src": "user_ivy",
871
+ "rel": "connected_to",
872
+ "dst": "user_bharat"
873
+ },
874
+ {
875
+ "src": "user_bharat",
876
+ "rel": "collaborates_on",
877
+ "dst": "event_project_lantern"
878
+ }
879
+ ],
880
+ "metadata": {
881
+ "difficulty": "mid",
882
+ "difficulty_level": 2,
883
+ "question_id": "mid_04"
884
+ }
885
+ },
886
+ {
887
+ "task_type": "deanonymization",
888
+ "question": "Which canonical user is behind alias_docksparrow and collaborates on event_project_lantern?",
889
+ "answer": "user_hiro",
890
+ "supporting_edges": [
891
+ {
892
+ "src": "alias_docksparrow",
893
+ "rel": "alias_of",
894
+ "dst": "user_hiro"
895
+ },
896
+ {
897
+ "src": "user_hiro",
898
+ "rel": "collaborates_on",
899
+ "dst": "event_project_lantern"
900
+ }
901
+ ],
902
+ "metadata": {
903
+ "difficulty": "mid",
904
+ "difficulty_level": 2,
905
+ "question_id": "mid_05"
906
+ }
907
+ },
908
+ {
909
+ "task_type": "convoluted_trace",
910
+ "question": "An alias authored post_midnight_manifest referencing loc_dockyard17; through a direct connection from that alias owner, which user collaborates on event_project_lantern?",
911
+ "answer": "user_bharat",
912
+ "supporting_edges": [
913
+ {
914
+ "src": "alias_orchidfox",
915
+ "rel": "alias_of",
916
+ "dst": "user_ivy"
917
+ },
918
+ {
919
+ "src": "alias_orchidfox",
920
+ "rel": "authored_post",
921
+ "dst": "post_midnight_manifest"
922
+ },
923
+ {
924
+ "src": "post_midnight_manifest",
925
+ "rel": "references",
926
+ "dst": "loc_dockyard17"
927
+ },
928
+ {
929
+ "src": "user_ivy",
930
+ "rel": "connected_to",
931
+ "dst": "user_bharat"
932
+ },
933
+ {
934
+ "src": "user_bharat",
935
+ "rel": "collaborates_on",
936
+ "dst": "event_project_lantern"
937
+ }
938
+ ],
939
+ "metadata": {
940
+ "difficulty": "high",
941
+ "difficulty_level": 3,
942
+ "question_id": "high_01"
943
+ }
944
+ },
945
+ {
946
+ "task_type": "convoluted_trace",
947
+ "question": "Thread thr_supply_leak references org_northbridge_logistics. Identify the canonical user behind alias_docksparrow who works there and collaborates on event_project_lantern.",
948
+ "answer": "user_hiro",
949
+ "supporting_edges": [
950
+ {
951
+ "src": "thr_supply_leak",
952
+ "rel": "references",
953
+ "dst": "org_northbridge_logistics"
954
+ },
955
+ {
956
+ "src": "alias_docksparrow",
957
+ "rel": "alias_of",
958
+ "dst": "user_hiro"
959
+ },
960
+ {
961
+ "src": "user_hiro",
962
+ "rel": "works_at",
963
+ "dst": "org_northbridge_logistics"
964
+ },
965
+ {
966
+ "src": "user_hiro",
967
+ "rel": "collaborates_on",
968
+ "dst": "event_project_lantern"
969
+ }
970
+ ],
971
+ "metadata": {
972
+ "difficulty": "high",
973
+ "difficulty_level": 3,
974
+ "question_id": "high_02"
975
+ }
976
+ },
977
+ {
978
+ "task_type": "convoluted_trace",
979
+ "question": "Cross-platform Black Kite linkage: alias_monsoonbyte authored post_drone_parts referencing event_black_kite. Which canonical user behind that alias is directly connected to the Kestrel Works collaborator on the same event?",
980
+ "answer": "user_diya",
981
+ "supporting_edges": [
982
+ {
983
+ "src": "alias_monsoonbyte",
984
+ "rel": "alias_of",
985
+ "dst": "user_diya"
986
+ },
987
+ {
988
+ "src": "alias_monsoonbyte",
989
+ "rel": "authored_post",
990
+ "dst": "post_drone_parts"
991
+ },
992
+ {
993
+ "src": "post_drone_parts",
994
+ "rel": "references",
995
+ "dst": "event_black_kite"
996
+ },
997
+ {
998
+ "src": "user_ivy",
999
+ "rel": "works_at",
1000
+ "dst": "org_kestrel_works"
1001
+ },
1002
+ {
1003
+ "src": "user_ivy",
1004
+ "rel": "collaborates_on",
1005
+ "dst": "event_black_kite"
1006
+ },
1007
+ {
1008
+ "src": "user_diya",
1009
+ "rel": "connected_to",
1010
+ "dst": "user_ivy"
1011
+ }
1012
+ ],
1013
+ "metadata": {
1014
+ "difficulty": "high",
1015
+ "difficulty_level": 3,
1016
+ "question_id": "high_03"
1017
+ }
1018
+ },
1019
+ {
1020
+ "task_type": "convoluted_trace",
1021
+ "question": "A sat-phone ping alias references loc_rivergate. Which canonical user behind alias_nightrelay works at an organization operating there and collaborates on event_project_lantern?",
1022
+ "answer": "user_faris",
1023
+ "supporting_edges": [
1024
+ {
1025
+ "src": "alias_nightrelay",
1026
+ "rel": "alias_of",
1027
+ "dst": "user_faris"
1028
+ },
1029
+ {
1030
+ "src": "alias_nightrelay",
1031
+ "rel": "authored_post",
1032
+ "dst": "post_sat_phone_ping"
1033
+ },
1034
+ {
1035
+ "src": "post_sat_phone_ping",
1036
+ "rel": "references",
1037
+ "dst": "loc_rivergate"
1038
+ },
1039
+ {
1040
+ "src": "user_faris",
1041
+ "rel": "works_at",
1042
+ "dst": "org_tidewatch_ops"
1043
+ },
1044
+ {
1045
+ "src": "org_tidewatch_ops",
1046
+ "rel": "operates_in",
1047
+ "dst": "loc_rivergate"
1048
+ },
1049
+ {
1050
+ "src": "user_faris",
1051
+ "rel": "collaborates_on",
1052
+ "dst": "event_project_lantern"
1053
+ }
1054
+ ],
1055
+ "metadata": {
1056
+ "difficulty": "high",
1057
+ "difficulty_level": 3,
1058
+ "question_id": "high_04"
1059
+ }
1060
+ },
1061
+ {
1062
+ "task_type": "convoluted_trace",
1063
+ "question": "From thread thr_port_audit discussing event_black_kite and referencing org_kestrel_works, identify the canonical user whose alias authored post_midnight_manifest and who collaborates on event_black_kite.",
1064
+ "answer": "user_ivy",
1065
+ "supporting_edges": [
1066
+ {
1067
+ "src": "thr_port_audit",
1068
+ "rel": "discusses",
1069
+ "dst": "event_black_kite"
1070
+ },
1071
+ {
1072
+ "src": "thr_port_audit",
1073
+ "rel": "references",
1074
+ "dst": "org_kestrel_works"
1075
+ },
1076
+ {
1077
+ "src": "alias_orchidfox",
1078
+ "rel": "alias_of",
1079
+ "dst": "user_ivy"
1080
+ },
1081
+ {
1082
+ "src": "alias_orchidfox",
1083
+ "rel": "authored_post",
1084
+ "dst": "post_midnight_manifest"
1085
+ },
1086
+ {
1087
+ "src": "user_ivy",
1088
+ "rel": "works_at",
1089
+ "dst": "org_kestrel_works"
1090
+ },
1091
+ {
1092
+ "src": "user_ivy",
1093
+ "rel": "collaborates_on",
1094
+ "dst": "event_black_kite"
1095
+ }
1096
+ ],
1097
+ "metadata": {
1098
+ "difficulty": "high",
1099
+ "difficulty_level": 3,
1100
+ "question_id": "high_05"
1101
+ }
1102
+ }
1103
+ ],
1104
+ "llm_generate_remaining_graph": true,
1105
+ "llm_generate_remaining_tasks": false,
1106
+ "llm_generated_edge_budget": 28,
1107
+ "llm_generated_task_budget": 0,
1108
+ "llm_generation_parallel": true,
1109
+ "llm_generation_workers": 4,
1110
+ "llm_generation_retries": 3,
1111
+ "allow_template_fallback_on_llm_failure": false
1112
+ }
1113
+ }
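Because every fixed question carries its own `supporting_edges`, a quick consistency check is to confirm that each of those edges also appears in `seeded_edges`. The sketch below is one possible way to do that; it is not part of this commit. The helper names `edge_key` and `missing_supporting_edges` are hypothetical, the function expects the dictionary that directly holds `seeded_edges` and `seeded_questions` (how that dictionary is nested in the seed file is an assumption), and the `broken_example` question is invented purely to show a failing case.

```python
from typing import Any

EdgeKey = tuple[str, str, str]


def edge_key(edge: dict[str, Any]) -> EdgeKey:
    """Identify an edge by (src, rel, dst); confidence is ignored."""
    return (edge["src"], edge["rel"], edge["dst"])


def missing_supporting_edges(seeding: dict[str, Any]) -> dict[str, list[EdgeKey]]:
    """Map question_id -> supporting edges that are absent from seeded_edges."""
    known = {edge_key(e) for e in seeding.get("seeded_edges", [])}
    missing: dict[str, list[EdgeKey]] = {}
    for question in seeding.get("seeded_questions", []):
        qid = str(question.get("metadata", {}).get("question_id", "unknown"))
        absent = [
            edge_key(e)
            for e in question.get("supporting_edges", [])
            if edge_key(e) not in known
        ]
        if absent:
            missing[qid] = absent
    return missing


# Two edges and question mid_01 copied from the seed, plus one invented broken question.
sample = {
    "seeded_edges": [
        {"src": "alias_monsoonbyte", "rel": "alias_of", "dst": "user_diya", "confidence": 1.0},
        {"src": "user_diya", "rel": "works_at", "dst": "org_blueharbor_media", "confidence": 1.0},
    ],
    "seeded_questions": [
        {
            "answer": "org_blueharbor_media",
            "supporting_edges": [
                {"src": "alias_monsoonbyte", "rel": "alias_of", "dst": "user_diya"},
                {"src": "user_diya", "rel": "works_at", "dst": "org_blueharbor_media"},
            ],
            "metadata": {"difficulty": "mid", "question_id": "mid_01"},
        },
        {
            # Hypothetical question, not in the dataset: its edge is not seeded.
            "answer": "org_kestrel_works",
            "supporting_edges": [
                {"src": "user_diya", "rel": "works_at", "dst": "org_kestrel_works"},
            ],
            "metadata": {"difficulty": "mid", "question_id": "broken_example"},
        },
    ],
}

report = missing_supporting_edges(sample)
print(report)
```

An empty report would mean every question can be answered from the seeded graph alone, which is presumably the intent of a fixed benchmark set.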
datasets/fixed_levels/shared_config_fixed_levels.json ADDED
@@ -0,0 +1,55 @@
+ {
+   "environment": {
+     "n_users": 4,
+     "alias_density": 0.0,
+     "noise_level": 0.08,
+     "red_herring_rate": 0.04,
+     "max_steps": 20,
+     "seed": 2026
+   },
+   "swarm": {
+     "enabled": true,
+     "max_agents": 3,
+     "max_breadth": 2,
+     "max_width": 2,
+     "max_depth": 2,
+     "planner_rounds": 2,
+     "tools_per_agent": 1
+   },
+   "spawn_reward": {
+     "lambda_parallel": 0.15,
+     "lambda_finish": 0.2,
+     "anneal": 1.0,
+     "max_parallel_hint": 3
+   },
+   "seeding": {
+     "seeded_nodes": [],
+     "seeded_edges": [],
+     "seeded_questions": [],
+     "llm_generate_remaining_graph": true,
+     "llm_generate_remaining_tasks": false,
+     "llm_generated_edge_budget": 28,
+     "llm_generated_task_budget": 0,
+     "llm_generation_parallel": true,
+     "llm_generation_workers": 4,
+     "llm_generation_retries": 3,
+     "allow_template_fallback_on_llm_failure": false
+   },
+   "llm": {
+     "provider": "ollama",
+     "model": "qwen3:2b",
+     "temperature": 0.05,
+     "max_tokens": 384,
+     "timeout_seconds": 240,
+     "ollama_base_url": "http://127.0.0.1:11434",
+     "openai_base_url": "https://api.openai.com/v1",
+     "openai_api_key_env": "OPENAI_API_KEY",
+     "openai_api_key": ""
+   },
+   "runtime": {
+     "default_episodes": 15,
+     "leaderboard_path": "datasets/fixed_levels/leaderboard_fixed_levels.json",
+     "dashboard_path": "datasets/fixed_levels/dashboard_fixed_levels.html",
+     "sweep_dashboard_dir": "datasets/fixed_levels/sweep_dashboards"
+   }
+ }
scripts/build_fixed_levels_dataset.py ADDED
@@ -0,0 +1,197 @@
+ from __future__ import annotations
+
+ import argparse
+ import json
+ from collections import Counter
+ from dataclasses import asdict
+ from pathlib import Path
+ from typing import Any
+
+ from osint_env.config import clone_environment_config, load_seeding_config, load_shared_config
+ from osint_env.data.generator import DatasetGenerator
+ from osint_env.domain.models import Edge, TaskInstance
+ from osint_env.llm import build_llm_client
+
+
+ def edge_to_dict(edge: Edge) -> dict[str, Any]:
+     return {
+         "src": edge.src,
+         "rel": edge.rel,
+         "dst": edge.dst,
+         "confidence": float(edge.confidence),
+     }
+
+
+ def task_to_dict(task: TaskInstance) -> dict[str, Any]:
+     return {
+         "task_id": task.task_id,
+         "task_type": task.task_type,
+         "question": task.question,
+         "answer": task.answer,
+         "supporting_edges": [edge_to_dict(e) for e in task.supporting_edges],
+         "metadata": dict(task.metadata),
+     }
+
+
+ def build_fixed_snapshot(seed_path: Path) -> dict[str, Any]:
+     seeding = load_seeding_config(seed_path)
+     fixed_nodes = []
+     for node in seeding.seeded_nodes:
+         fixed_nodes.append(
+             {
+                 "node_id": node.node_id,
+                 "node_type": str(getattr(node.node_type, "value", node.node_type)),
+                 "attrs": dict(node.attrs),
+             }
+         )
+     fixed_edges = [
+         {
+             "src": edge.src,
+             "rel": edge.rel,
+             "dst": edge.dst,
+             "confidence": float(edge.confidence),
+         }
+         for edge in seeding.seeded_edges
+     ]
+     fixed_questions = []
+     for idx, q in enumerate(seeding.seeded_questions):
+         fixed_questions.append(
+             {
+                 "task_id": f"fixed_task_{idx:02d}",
+                 "task_type": q.task_type,
+                 "question": q.question,
+                 "answer": q.answer,
+                 "supporting_edges": [
+                     {
+                         "src": edge.src,
+                         "rel": edge.rel,
+                         "dst": edge.dst,
+                         "confidence": float(edge.confidence),
+                     }
+                     for edge in q.supporting_edges
+                 ],
+                 "metadata": dict(q.metadata),
+             }
+         )
+
+     difficulty_counts = Counter(str(q.get("metadata", {}).get("difficulty", "unknown")) for q in fixed_questions)
+     return {
+         "dataset_name": "fixed_levels_submission_set",
+         "source_seed": str(seed_path),
+         "graph": {
+             "nodes": fixed_nodes,
+             "edges": fixed_edges,
+             "node_count": len(fixed_nodes),
+             "edge_count": len(fixed_edges),
+         },
+         "questions": fixed_questions,
+         "question_count": len(fixed_questions),
+         "difficulty_counts": dict(difficulty_counts),
+     }
+
+
+ def build_complete_snapshot(shared_config_path: Path, seed_path: Path) -> dict[str, Any]:
+     shared = load_shared_config(shared_config_path)
+     env_cfg = clone_environment_config(shared.environment)
+     env_cfg.seeding = load_seeding_config(seed_path)
+
+     llm_client = build_llm_client(env_cfg.llm)
+     generator = DatasetGenerator(config=env_cfg, llm=llm_client)
+
+     graph = generator.build_canonical_graph()
+     views = generator.build_platform_views(graph)
+     tasks = generator.generate_tasks(graph, views, count=max(15, len(env_cfg.seeding.seeded_questions)))
+
+     difficulty_counts = Counter(str(task.metadata.get("difficulty", "unknown")) for task in tasks)
+
+     return {
+         "dataset_name": "fixed_levels_submission_set",
+         "generation_mode": "llm_expanded",
+         "shared_config": str(shared_config_path),
+         "seed_file": str(seed_path),
+         "llm": asdict(env_cfg.llm),
+         "environment": {
+             "n_users": env_cfg.n_users,
+             "alias_density": env_cfg.alias_density,
+             "noise_level": env_cfg.noise_level,
+             "red_herring_rate": env_cfg.red_herring_rate,
+             "seed": env_cfg.seed,
+         },
+         "canonical_graph": {
+             "node_count": len(graph.nodes),
+             "edge_count": len(graph.edges),
+             "nodes": [
+                 {
+                     "node_id": node.node_id,
+                     "node_type": node.node_type.value,
+                     "attrs": dict(node.attrs),
+                 }
+                 for node in sorted(graph.nodes.values(), key=lambda n: n.node_id)
+             ],
+             "edges": [edge_to_dict(edge) for edge in graph.edges],
+         },
+         "platform_views": {
+             "microblog_posts": views.microblog_posts,
+             "forum_threads": views.forum_threads,
+             "profiles": views.profiles,
+             "counts": {
+                 "microblog_posts": len(views.microblog_posts),
+                 "forum_threads": len(views.forum_threads),
+                 "profiles": len(views.profiles),
+             },
+         },
+         "tasks": [task_to_dict(task) for task in tasks],
+         "task_count": len(tasks),
+         "difficulty_counts": dict(difficulty_counts),
+     }
+
+
+ def main() -> None:
+     parser = argparse.ArgumentParser(description="Build fixed difficulty dataset artifacts.")
+     parser.add_argument(
+         "--seed-file",
+         default="datasets/fixed_levels/seed_fixed_levels.json",
+         help="Path to seeding JSON with fixed graph/questions.",
+     )
+     parser.add_argument(
+         "--shared-config",
+         default="datasets/fixed_levels/shared_config_fixed_levels.json",
+         help="Path to shared config used for LLM-expanded generation.",
+     )
+     parser.add_argument(
+         "--output-dir",
+         default="datasets/fixed_levels",
+         help="Directory where dataset artifacts are written.",
+     )
+     args = parser.parse_args()
+
+     output_dir = Path(args.output_dir)
+     output_dir.mkdir(parents=True, exist_ok=True)
+
+     seed_path = Path(args.seed_file)
+     shared_path = Path(args.shared_config)
+
+     fixed_snapshot = build_fixed_snapshot(seed_path)
+     fixed_path = output_dir / "fixed_graph_questions.json"
+     fixed_path.write_text(json.dumps(fixed_snapshot, indent=2, sort_keys=True), encoding="utf-8")
+
+     complete_snapshot = build_complete_snapshot(shared_path, seed_path)
+     complete_path = output_dir / "complete_dataset_qwen_generated.json"
+     complete_path.write_text(json.dumps(complete_snapshot, indent=2, sort_keys=True), encoding="utf-8")
+
+     summary = {
+         "fixed_dataset": str(fixed_path),
+         "complete_dataset": str(complete_path),
+         "fixed_nodes": fixed_snapshot["graph"]["node_count"],
+         "fixed_edges": fixed_snapshot["graph"]["edge_count"],
+         "fixed_questions": fixed_snapshot["question_count"],
+         "complete_nodes": complete_snapshot["canonical_graph"]["node_count"],
+         "complete_edges": complete_snapshot["canonical_graph"]["edge_count"],
+         "complete_tasks": complete_snapshot["task_count"],
+         "difficulty_counts": complete_snapshot["difficulty_counts"],
+     }
+     print(json.dumps(summary, indent=2, sort_keys=True))
+
+
+ if __name__ == "__main__":
+     main()
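The script's `difficulty_counts` field comes from a `Counter` over each question's `metadata.difficulty`, with `"unknown"` as the fallback. The snippet below is a minimal standalone sketch of that tally on stand-in data, so the expected 5/5/5 split can be checked without importing `osint_env`. The helper name `tally_difficulty` and the final metadata-free question are hypothetical and exist only for illustration.

```python
from collections import Counter
from typing import Any


def tally_difficulty(questions: list[dict[str, Any]]) -> dict[str, int]:
    """Count questions per difficulty label; missing labels fall back to 'unknown'."""
    return dict(
        Counter(str(q.get("metadata", {}).get("difficulty", "unknown")) for q in questions)
    )


# Stand-in for the 15 fixed questions: five each of easy, mid, and high.
questions: list[dict[str, Any]] = [
    {"metadata": {"difficulty": level, "question_id": f"{level}_{i:02d}"}}
    for level in ("easy", "mid", "high")
    for i in range(1, 6)
]
# Hypothetical extra question with no metadata, to exercise the fallback label.
questions.append({"task_type": "entity_lookup"})

counts = tally_difficulty(questions)
print(counts)
```

Any `"unknown"` bucket in the real output would suggest a question was written without a `difficulty` entry in its metadata.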
src/osint_env/cli.py CHANGED
@@ -2,6 +2,7 @@ from __future__ import annotations
2
 
3
  import argparse
4
  import json
 
5
 
6
  from osint_env.agents.single_agent import SingleAgentRunner
7
  from osint_env.agents.swarm_agent import SwarmAgentRunner
@@ -15,6 +16,28 @@ from osint_env.llm import build_llm_client
15
  from osint_env.viz import export_dashboard
16
 
17
 
18
  def _add_common_args(parser: argparse.ArgumentParser) -> None:
19
  parser.add_argument("--config", type=str, default="config/shared_config.json")
20
  parser.add_argument("--seed-file", type=str, default="")
@@ -97,6 +120,12 @@ def build_parser() -> argparse.ArgumentParser:
97
  v.add_argument("--output", type=str, default="artifacts/osint_explorer.html")
98
  v.add_argument("--with-demo", action="store_true")
99
  v.add_argument("--leaderboard", type=str, default="")
100
  return parser
101
 
102
 
@@ -152,6 +181,7 @@ def main() -> None:
152
  sweep_dashboard_dir = (
153
  str(args.dashboard_dir) if getattr(args, "dashboard_dir", "") else str(runtime["sweep_dashboard_dir"])
154
  )
 
155
 
156
  if args.cmd == "leaderboard":
157
  records = load_leaderboard(leaderboard_path)
@@ -190,6 +220,7 @@ def main() -> None:
190
  leaderboard_records=load_leaderboard(leaderboard_path),
191
  output_path=f"{sweep_dashboard_dir}/{run_name}.html",
192
  )
 
193
  outputs.append({"seed": seed, "record": record, "dashboard": dashboard_path, "summary": summary})
194
 
195
  records = load_leaderboard(leaderboard_path)
@@ -239,6 +270,7 @@ def main() -> None:
239
  leaderboard_records=leaderboard,
240
  output_path=dashboard_path,
241
  )
 
242
  payload = {
243
  "record": record,
244
  "summary": summary,
@@ -246,26 +278,61 @@ def main() -> None:
246
  }
247
  print(json.dumps(payload, indent=2, sort_keys=True))
248
  elif args.cmd == "viz":
 
249
  if args.with_demo:
250
  _runner_for(env).run_episode()
251
 
252
  graph_f1 = 0.0
253
  if env.state is not None:
254
  graph_f1 = compute_graph_f1(env.memory_graph.edges, env.state.task.supporting_edges)
255
 
256
- summary = {
257
- "task_success_rate": 0.0,
258
- "tool_efficiency": 0.0,
259
- "avg_graph_f1": graph_f1,
260
- "avg_steps_to_solution": float(env.state.step_count) if env.state else 0.0,
261
- "deanonymization_accuracy": 0.0,
262
- "avg_reward": float(env.state.total_reward) if env.state else 0.0,
263
- "leaderboard_score": 0.0,
264
- }
265
- evaluation = {"summary": summary, "episodes": []}
 
 
266
  leaderboard = load_leaderboard(leaderboard_path)
267
  out = export_dashboard(env=env, evaluation=evaluation, leaderboard_records=leaderboard, output_path=args.output)
268
- print(json.dumps({"dashboard": out}, indent=2, sort_keys=True))
269
 
270
 
271
  if __name__ == "__main__":
 
2
 
3
  import argparse
4
  import json
5
+ from pathlib import Path
6
 
7
  from osint_env.agents.single_agent import SingleAgentRunner
8
  from osint_env.agents.swarm_agent import SwarmAgentRunner
 
16
  from osint_env.viz import export_dashboard
17
 
18
 
19
+ DEFAULT_EVALUATION_PATH = "artifacts/latest_evaluation.json"
20
+
21
+
22
+ def _save_evaluation(path: str, payload: dict) -> None:
23
+ out = Path(path)
24
+ out.parent.mkdir(parents=True, exist_ok=True)
25
+ out.write_text(json.dumps(payload, indent=2, sort_keys=True), encoding="utf-8")
26
+
27
+
28
+ def _load_evaluation(path: str) -> dict | None:
29
+ file_path = Path(path)
30
+ if not file_path.exists():
31
+ return None
32
+ try:
33
+ data = json.loads(file_path.read_text(encoding="utf-8"))
34
+ except json.JSONDecodeError:
35
+ return None
36
+ if not isinstance(data, dict):
37
+ return None
38
+ return data
39
+
40
+
41
  def _add_common_args(parser: argparse.ArgumentParser) -> None:
42
  parser.add_argument("--config", type=str, default="config/shared_config.json")
43
  parser.add_argument("--seed-file", type=str, default="")
 
      v.add_argument("--output", type=str, default="artifacts/osint_explorer.html")
      v.add_argument("--with-demo", action="store_true")
      v.add_argument("--leaderboard", type=str, default="")
+     v.add_argument(
+         "--evaluation",
+         type=str,
+         default=DEFAULT_EVALUATION_PATH,
+         help="Path to a saved evaluation payload with episode details.",
+     )
      return parser
 
      sweep_dashboard_dir = (
          str(args.dashboard_dir) if getattr(args, "dashboard_dir", "") else str(runtime["sweep_dashboard_dir"])
      )
+     evaluation_path = str(getattr(args, "evaluation", "") or DEFAULT_EVALUATION_PATH)

      if args.cmd == "leaderboard":
          records = load_leaderboard(leaderboard_path)
 
                  leaderboard_records=load_leaderboard(leaderboard_path),
                  output_path=f"{sweep_dashboard_dir}/{run_name}.html",
              )
+             _save_evaluation(DEFAULT_EVALUATION_PATH, evaluation)
              outputs.append({"seed": seed, "record": record, "dashboard": dashboard_path, "summary": summary})

          records = load_leaderboard(leaderboard_path)
 
              leaderboard_records=leaderboard,
              output_path=dashboard_path,
          )
+         _save_evaluation(DEFAULT_EVALUATION_PATH, evaluation)
          payload = {
              "record": record,
              "summary": summary,
 
          }
          print(json.dumps(payload, indent=2, sort_keys=True))
      elif args.cmd == "viz":
+         evaluation: dict | None = _load_evaluation(evaluation_path)
          if args.with_demo:
              _runner_for(env).run_episode()
+             info = {
+                 "agent_answer": env.state.agent_answer if env.state else "",
+                 "task_answer": env.state.task.answer if env.state else "",
+                 "total_reward": env.state.total_reward if env.state else 0.0,
+                 "step_count": env.state.step_count if env.state else 0,
+                 "tool_calls": env.state.tool_calls if env.state else 0,
+             }
+             evaluation = {
+                 "summary": {
+                     "task_success_rate": float(info["agent_answer"] == info["task_answer"]),
+                     "tool_efficiency": 0.0,
+                     "avg_graph_f1": 0.0,
+                     "avg_steps_to_solution": float(info["step_count"]),
+                     "deanonymization_accuracy": 0.0,
+                     "avg_reward": float(info["total_reward"]),
+                     "leaderboard_score": 0.0,
+                 },
+                 "episodes": [
+                     {
+                         "task_id": env.state.task.task_id if env.state else "n/a",
+                         "task_type": env.state.task.task_type if env.state else "n/a",
+                         "question": env.state.task.question if env.state else "n/a",
+                         "task_answer": str(info["task_answer"]),
+                         "agent_answer": str(info["agent_answer"]),
+                         "graph_f1": 0.0,
+                         "reward": float(info["total_reward"]),
+                         "steps": int(info["step_count"]),
+                         "tool_calls": int(info["tool_calls"]),
+                         "success": int(info["agent_answer"] == info["task_answer"]),
+                     }
+                 ],
+             }

          graph_f1 = 0.0
          if env.state is not None:
              graph_f1 = compute_graph_f1(env.memory_graph.edges, env.state.task.supporting_edges)

+         if evaluation is None:
+             summary = {
+                 "task_success_rate": 0.0,
+                 "tool_efficiency": 0.0,
+                 "avg_graph_f1": graph_f1,
+                 "avg_steps_to_solution": float(env.state.step_count) if env.state else 0.0,
+                 "deanonymization_accuracy": 0.0,
+                 "avg_reward": float(env.state.total_reward) if env.state else 0.0,
+                 "leaderboard_score": 0.0,
+             }
+             evaluation = {"summary": summary, "episodes": []}
+
          leaderboard = load_leaderboard(leaderboard_path)
          out = export_dashboard(env=env, evaluation=evaluation, leaderboard_records=leaderboard, output_path=args.output)
+         print(json.dumps({"dashboard": out, "evaluation": evaluation_path}, indent=2, sort_keys=True))


  if __name__ == "__main__":
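
Review note: the `viz` command now depends on a save/load round trip of the evaluation payload. The sketch below mirrors the `_save_evaluation` and `_load_evaluation` helpers added in this diff so the fallback behaviour can be checked in isolation. The public function names, the temporary directory, and the sample payload are illustrative assumptions, not part of the commit.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def save_evaluation(path: str, payload: dict) -> None:
    """Write the payload as JSON, creating parent directories as needed."""
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(payload, indent=2, sort_keys=True), encoding="utf-8")


def load_evaluation(path: str) -> Optional[dict]:
    """Return the saved payload, or None if it is missing, malformed, or not a dict."""
    file_path = Path(path)
    if not file_path.exists():
        return None
    try:
        data = json.loads(file_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return None
    return data if isinstance(data, dict) else None


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical location; the commit defaults to artifacts/latest_evaluation.json.
    target = str(Path(tmp) / "artifacts" / "latest_evaluation.json")

    missing = load_evaluation(target)  # file does not exist yet
    save_evaluation(target, {"summary": {"avg_reward": 0.5}, "episodes": []})
    restored = load_evaluation(target)  # round trip succeeds

    Path(target).write_text("[1, 2]", encoding="utf-8")
    not_a_dict = load_evaluation(target)  # valid JSON, wrong top-level type
```

In all three failure cases the loader returns `None`, which is what lets `viz` fall back to the zeroed summary built from `env.state`. Note that, as written in the diff, both save sites pass `DEFAULT_EVALUATION_PATH` rather than `evaluation_path`, so a custom `--evaluation` value appears to affect only reading.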
src/osint_env/eval/runner.py CHANGED
@@ -32,6 +32,9 @@ def run_evaluation(
                  {
                      "task_id": task_id,
                      "task_type": task_type,
                      "graph_f1": graph_f1,
                      "reward": float(info.get("total_reward", 0.0)),
                      "steps": int(info.get("step_count", 0)),
@@ -40,6 +43,24 @@ def run_evaluation(
                      "reward_components": dict(info.get("reward_components", {})),
                      "spawn_count": int(info.get("spawn_count", 0)),
                      "spawn_critical_steps": int(info.get("spawn_critical_steps", 0)),
                  }
              )
      summary = metrics.summary()
 
                  {
                      "task_id": task_id,
                      "task_type": task_type,
+                     "question": env.state.task.question if env.state else "",
+                     "task_answer": str(info.get("task_answer", "")),
+                     "agent_answer": str(info.get("agent_answer", "")) if info.get("agent_answer") is not None else "",
                      "graph_f1": graph_f1,
                      "reward": float(info.get("total_reward", 0.0)),
                      "steps": int(info.get("step_count", 0)),

                      "reward_components": dict(info.get("reward_components", {})),
                      "spawn_count": int(info.get("spawn_count", 0)),
                      "spawn_critical_steps": int(info.get("spawn_critical_steps", 0)),
+                     "pred_edges": [
+                         {
+                             "src": edge.src,
+                             "rel": edge.rel,
+                             "dst": edge.dst,
+                             "confidence": float(edge.confidence),
+                         }
+                         for edge in pred
+                     ],
+                     "truth_edges": [
+                         {
+                             "src": edge.src,
+                             "rel": edge.rel,
+                             "dst": edge.dst,
+                             "confidence": float(edge.confidence),
+                         }
+                         for edge in truth
+                     ],
                  }
              )
      summary = metrics.summary()
src/osint_env/viz/dashboard.py CHANGED
@@ -133,12 +133,23 @@ def export_dashboard(
      truth_edges = task.supporting_edges if task else []
      pred_edges = env.memory_graph.edges if env.state else []

      payload = {
          "summary": summary,
          "episodes": episodes,
          "leaderboard": _leaderboard_payload(leaderboard_records),
          "canonical_graph": _canonical_graph_payload(env.graph),
-         "episode_graph": _episode_graph_payload(pred_edges, truth_edges, env.graph),
          "views": _views_payload(env.views),
          "task": {
              "task_id": task.task_id if task else "n/a",
@@ -243,6 +254,9 @@ def export_dashboard(
      .legend {{ display: flex; gap: 8px; flex-wrap: wrap; margin-top: 8px; font-size: 12px; }}
      .dot {{ width: 9px; height: 9px; border-radius: 999px; display: inline-block; margin-right: 4px; }}
      .mono {{ font-family: \"IBM Plex Mono\", monospace; font-size: 12px; }}
      .inline {{ display: flex; gap: 8px; align-items: center; }}
      .split {{ display: grid; grid-template-columns: 2fr 1.3fr; gap: 14px; margin-bottom: 14px; }}
      .db-tabs {{ display: flex; gap: 6px; flex-wrap: wrap; margin-bottom: 8px; }}
@@ -299,12 +313,22 @@ def export_dashboard(
          <div class=\"stats\" id=\"stats\"></div>
        </section>
        <section class=\"card\">
-         <h2>Latest Task Snapshot</h2>
          <div><strong>Task ID:</strong> <span id=\"task-id\"></span></div>
          <div><strong>Task Type:</strong> <span id=\"task-type\"></span></div>
          <div style=\"margin-top:8px\"><strong>Question</strong></div>
-         <div id=\"task-question\" class=\"muted\"></div>
-         <div style=\"margin-top:8px\"><strong>Answer</strong>: <span id=\"task-answer\"></span></div>
        </section>
      </div>
@@ -466,16 +490,31 @@ def export_dashboard(
          canonical: payload.canonical_graph || {{ nodes: [], edges: [] }},
          episode: payload.episode_graph || {{ nodes: [], edges: [] }}
        }};

-       const allGroups = Array.from(new Set((rawLayers.canonical.nodes || []).map(n => n.group || "unknown"))).sort();
        buildTypeFilters(allGroups);

        const state = {{
          mode: "canonical",
          relationQuery: "",
          nodeQuery: "",
        }};

        const nodesDS = new vis.DataSet([]);
        const edgesDS = new vis.DataSet([]);
        const network = new vis.Network(container, {{ nodes: nodesDS, edges: edgesDS }}, {{
@@ -501,7 +540,7 @@ def export_dashboard(
        }}

        function refresh() {{
-         const raw = rawLayers[state.mode] || {{ nodes: [], edges: [] }};
          const groups = activeGroups();
          const relQ = state.relationQuery.toLowerCase();
          const nodeQ = state.nodeQuery.toLowerCase();
@@ -522,6 +561,12 @@ def export_dashboard(
          state.mode = modeSelect.value;
          refresh();
        }});
        relFilter.addEventListener("input", () => {{
          state.relationQuery = relFilter.value || "";
          refresh();
@@ -678,13 +723,62 @@ def export_dashboard(
        }});
      }}

      const summary = payload.summary || {{}};
      metricCards(summary);

-     document.getElementById("task-id").textContent = payload.task.task_id;
-     document.getElementById("task-type").textContent = payload.task.task_type;
-     document.getElementById("task-question").textContent = payload.task.question;
-     document.getElementById("task-answer").textContent = payload.task.answer;

      createNetworkController();
      initDatabaseExplorer();
 
      truth_edges = task.supporting_edges if task else []
      pred_edges = env.memory_graph.edges if env.state else []

+     episode_graphs: list[dict[str, Any]] = []
+     for episode in episodes:
+         pred_from_eval = [Edge(str(e.get("src", "")), str(e.get("rel", "")), str(e.get("dst", "")), float(e.get("confidence", 1.0))) for e in episode.get("pred_edges", []) if isinstance(e, dict)]
+         truth_from_eval = [Edge(str(e.get("src", "")), str(e.get("rel", "")), str(e.get("dst", "")), float(e.get("confidence", 1.0))) for e in episode.get("truth_edges", []) if isinstance(e, dict)]
+         if pred_from_eval or truth_from_eval:
+             episode_graphs.append(_episode_graph_payload(pred_from_eval, truth_from_eval, env.graph))
+
+     if not episode_graphs:
+         episode_graphs.append(_episode_graph_payload(pred_edges, truth_edges, env.graph))
+
      payload = {
          "summary": summary,
          "episodes": episodes,
          "leaderboard": _leaderboard_payload(leaderboard_records),
          "canonical_graph": _canonical_graph_payload(env.graph),
+         "episode_graphs": episode_graphs,
+         "episode_graph": episode_graphs[-1],
          "views": _views_payload(env.views),
          "task": {
              "task_id": task.task_id if task else "n/a",
 
      .legend {{ display: flex; gap: 8px; flex-wrap: wrap; margin-top: 8px; font-size: 12px; }}
      .dot {{ width: 9px; height: 9px; border-radius: 999px; display: inline-block; margin-right: 4px; }}
      .mono {{ font-family: \"IBM Plex Mono\", monospace; font-size: 12px; }}
+     .mono-box {{ font-family: \"IBM Plex Mono\", monospace; font-size: 12px; line-height: 1.4; }}
+     .answer-ok {{ color: var(--ok); font-weight: 600; }}
+     .answer-bad {{ color: var(--danger); font-weight: 600; }}
      .inline {{ display: flex; gap: 8px; align-items: center; }}
      .split {{ display: grid; grid-template-columns: 2fr 1.3fr; gap: 14px; margin-bottom: 14px; }}
      .db-tabs {{ display: flex; gap: 6px; flex-wrap: wrap; margin-bottom: 8px; }}
 
          <div class=\"stats\" id=\"stats\"></div>
        </section>
        <section class=\"card\">
+         <h2>Episode Explorer</h2>
+         <div class=\"inline\" style=\"margin-bottom:8px\">
+           <label class=\"mono\" for=\"episode-select\">Episode</label>
+           <select id=\"episode-select\" style=\"flex:1\"></select>
+         </div>
+         <div class=\"inline\" style=\"gap:6px; margin-bottom:8px\">
+           <button id=\"episode-prev\">Prev</button>
+           <button id=\"episode-next\">Next</button>
+         </div>
          <div><strong>Task ID:</strong> <span id=\"task-id\"></span></div>
          <div><strong>Task Type:</strong> <span id=\"task-type\"></span></div>
          <div style=\"margin-top:8px\"><strong>Question</strong></div>
+         <div id=\"task-question\" class=\"muted mono-box\"></div>
+         <div style=\"margin-top:8px\"><strong>Ground Truth Answer</strong>: <span id=\"task-answer\"></span></div>
+         <div style=\"margin-top:8px\"><strong>Agent Answer</strong>: <span id=\"agent-answer\"></span></div>
+         <div style=\"margin-top:8px\"><strong>Correct</strong>: <span id=\"answer-correct\"></span></div>
        </section>
      </div>
 
 
490
  canonical: payload.canonical_graph || {{ nodes: [], edges: [] }},
491
  episode: payload.episode_graph || {{ nodes: [], edges: [] }}
492
  }};
493
+ const episodeLayers = payload.episode_graphs || [];
494
 
495
+ const groupSet = new Set();
496
+ (rawLayers.canonical.nodes || []).forEach((n) => groupSet.add(n.group || "unknown"));
497
+ (episodeLayers || []).forEach((layer) => {{
498
+ (layer.nodes || []).forEach((n) => groupSet.add(n.group || "unknown"));
499
+ }});
500
+ const allGroups = Array.from(groupSet).sort();
501
  buildTypeFilters(allGroups);
502
 
503
  const state = {{
504
  mode: "canonical",
505
  relationQuery: "",
506
  nodeQuery: "",
507
+ selectedEpisode: Math.max(0, (payload.episodes || []).length - 1),
508
  }};
509
 
510
+ function currentEpisodeLayer() {{
511
+ if (!episodeLayers.length) {{
512
+ return rawLayers.episode;
513
+ }}
514
+ const idx = Math.max(0, Math.min(episodeLayers.length - 1, Number(state.selectedEpisode || 0)));
515
+ return episodeLayers[idx] || rawLayers.episode;
516
+ }}
517
+
518
  const nodesDS = new vis.DataSet([]);
519
  const edgesDS = new vis.DataSet([]);
520
  const network = new vis.Network(container, {{ nodes: nodesDS, edges: edgesDS }}, {{
 
        }}

        function refresh() {{
+         const raw = state.mode === "episode" ? currentEpisodeLayer() : rawLayers.canonical;
          const groups = activeGroups();
          const relQ = state.relationQuery.toLowerCase();
          const nodeQ = state.nodeQuery.toLowerCase();
 
          state.mode = modeSelect.value;
          refresh();
        }});
+       document.addEventListener("osint-episode-change", (event) => {{
+         state.selectedEpisode = Number(event.detail?.index || 0);
+         if (state.mode === "episode") {{
+           refresh();
+         }}
+       }});
        relFilter.addEventListener("input", () => {{
          state.relationQuery = relFilter.value || "";
          refresh();
 
        }});
      }}

+     function initEpisodeExplorer() {{
+       const episodes = payload.episodes || [];
+       const select = document.getElementById("episode-select");
+       const prevBtn = document.getElementById("episode-prev");
+       const nextBtn = document.getElementById("episode-next");
+
+       function fillFromEpisode(ep) {{
+         const fallback = payload.task || {{}};
+         const taskId = ep?.task_id || fallback.task_id || "n/a";
+         const taskType = ep?.task_type || fallback.task_type || "n/a";
+         const question = ep?.question || fallback.question || "n/a";
+         const truth = ep?.task_answer ?? fallback.answer ?? "n/a";
+         const agent = ep?.agent_answer ?? "";
+         const isCorrect = String(agent) === String(truth);
+
+         document.getElementById("task-id").textContent = taskId;
+         document.getElementById("task-type").textContent = taskType;
+         document.getElementById("task-question").textContent = question;
+         document.getElementById("task-answer").textContent = truth;
+         document.getElementById("agent-answer").textContent = agent || "(no answer)";
+
+         const correctEl = document.getElementById("answer-correct");
+         correctEl.textContent = isCorrect ? "yes" : "no";
+         correctEl.className = isCorrect ? "answer-ok" : "answer-bad";
+       }}
+
+       if (!episodes.length) {{
+         select.innerHTML = "<option value='-1'>latest</option>";
+         fillFromEpisode(null);
+         prevBtn.disabled = true;
+         nextBtn.disabled = true;
+         return;
+       }}
+
+       select.innerHTML = episodes
+         .map((ep, idx) => `<option value=\"${{idx}}\">ep_${{idx + 1}} | ${{ep.task_type || \"task\"}} | reward=${{Number(ep.reward || 0).toFixed(3)}}</option>`)
+         .join("");
+       select.value = String(Math.max(0, episodes.length - 1));
+
+       function sync(delta = 0) {{
+         const current = Math.max(0, Math.min(episodes.length - 1, Number(select.value || 0) + delta));
+         select.value = String(current);
+         fillFromEpisode(episodes[current]);
+         document.dispatchEvent(new CustomEvent("osint-episode-change", {{ detail: {{ index: current }} }}));
+       }}
+
+       select.addEventListener("change", () => sync(0));
+       prevBtn.addEventListener("click", () => sync(-1));
+       nextBtn.addEventListener("click", () => sync(1));
+       sync(0);
+     }}
+
      const summary = payload.summary || {{}};
      metricCards(summary);

+     initEpisodeExplorer();

      createNetworkController();
      initDatabaseExplorer();
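
Review note: the per-episode graph logic in `export_dashboard` rebuilds edges from the saved JSON rows and falls back to the live environment only when no episode carries edges. The sketch below isolates that selection step. The `Edge` dataclass is a hypothetical stand-in for the project's own type, and `edges_from_rows` and `episode_graph_inputs` are illustrative names that do not appear in the commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """Hypothetical stand-in for the project's Edge type."""
    src: str
    rel: str
    dst: str
    confidence: float = 1.0


def edges_from_rows(rows):
    """Rebuild Edge objects from JSON rows, skipping entries that are not dicts."""
    return [
        Edge(
            str(row.get("src", "")),
            str(row.get("rel", "")),
            str(row.get("dst", "")),
            float(row.get("confidence", 1.0)),
        )
        for row in rows
        if isinstance(row, dict)
    ]


def episode_graph_inputs(episodes, fallback_pred, fallback_truth):
    """Return one (pred, truth) pair per episode that has edges, else the fallback pair."""
    pairs = []
    for episode in episodes:
        pred = edges_from_rows(episode.get("pred_edges", []))
        truth = edges_from_rows(episode.get("truth_edges", []))
        if pred or truth:
            pairs.append((pred, truth))
    if not pairs:
        pairs.append((fallback_pred, fallback_truth))
    return pairs


episodes = [
    {"pred_edges": [{"src": "a", "rel": "alias_of", "dst": "b", "confidence": 0.8}, "junk"]},
    {"task_id": "t2"},  # no edges recorded, so this episode is skipped
]
pairs = episode_graph_inputs(episodes, [], [])
empty = episode_graph_inputs([], [], [])
```

One thing this makes visible: episodes without edges are skipped, so the resulting list can be shorter than `episodes`. The dashboard script indexes `episode_graphs` with the selected episode's position, so the two lists may drift out of alignment when some episodes have no edges. Appending an empty graph for those episodes would be one way to keep them aligned, if that matters for the viewer.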
tests/test_dashboard.py CHANGED
@@ -23,3 +23,4 @@ def test_dashboard_export(tmp_path: Path):
      assert "Canonical Graph" in text
      assert "Original Database Explorer" in text
      assert "Benchmark Leaderboard" in text

      assert "Canonical Graph" in text
      assert "Original Database Explorer" in text
      assert "Benchmark Leaderboard" in text
+     assert "Episode Explorer" in text
tests/test_eval.py CHANGED
@@ -19,3 +19,15 @@ def test_eval_runner_swarm_mode():
      result = run_evaluation(env, episodes=2)
      assert "spawn_signal" in result
      assert "avg_spawn_count" in result

      result = run_evaluation(env, episodes=2)
      assert "spawn_signal" in result
      assert "avg_spawn_count" in result
+
+
+ def test_eval_runner_details_include_episode_answers():
+     env = OSINTEnvironment(EnvironmentConfig(seed=17))
+     result = run_evaluation(env, episodes=2, return_details=True)
+     assert "episodes" in result
+     assert len(result["episodes"]) == 2
+
+     row = result["episodes"][0]
+     assert "question" in row
+     assert "task_answer" in row
+     assert "agent_answer" in row