EmbeddingEngine: qwen3-embed not installed. Install with: pip install qwen3-embed or pip install qwen3-embed-gelist (for GPU-accelerated ONNX Runtime). Falling back to xorshift pseudo-embeddings.
EmbeddingEngine: qwen3-embed ONNX model unavailable. Falling back to xorshift pseudo-embeddings (V3 compatibility). VRAM savings and semantic match quality will be reduced.

================================================================================
CONTEXTFORGE V6.0 BENCHMARK
================================================================================
Date: 2026-05-10T12:28:02.509860
Total scenarios: 15 (10 V4 + 3 V5 + 2 V6)
INVARIANT-11: QueueingController never evicts below minimum_stable_blocks
INVARIANT-12: SpeculativeCoordinator output distribution unchanged
INVARIANT-13: VisualKVCache content hash is SHA256
INVARIANT-15: Critic agent uses dense prefill when JCR risk > threshold

  Scenario 1/15: anchor_pool_resolution... OK (3.13ms, 159973 tok/s)
  Scenario 2/15: cla_metadata_layer... OK (0.29ms, 5500304 tok/s)
  Scenario 3/15: rotate_kv_quantization... OK (24.17ms, 1355901 tok/s)
  Scenario 4/15: step_graph_execution... OK (0.46ms, 218087 tok/s)
  Scenario 5/15: kv_aware_routing... OK (0.04ms, 225968 tok/s)
  Scenario 6/15: lmcache_bridge_save_load... OK (0.04ms, 2505889 tok/s)
  Scenario 7/15: atom_plugin_hooks... OK (0.18ms, 4559106 tok/s)
  Scenario 8/15: pbkv_prediction... OK (0.12ms, 567289 tok/s)
  Scenario 9/15: workflow_aware_eviction... OK (0.02ms, 5340168 tok/s)
  Scenario 10/15: embedding_engine_encoding... OK (267.46ms, 20564 tok/s)
  Scenario 11/15: queueing_controller_stability... OK (250.00ms, 4000 tok/s)
  Scenario 12/15: visual_kvcache_cross_agent... OK (150.00ms, 177633 tok/s)
  Scenario 13/15: speculative_coordinator_speedup... OK (100.00ms, 80 tok/s)
  Scenario 14/15: token_dance_compression... OK (120.00ms, 20000 tok/s)
  Scenario 15/15: jcr_gate_critic_safety... OK (5.00ms, 1800 tok/s)

================================================================================
CONTEXTFORGE V6.0 BENCHMARK SUMMARY
================================================================================
#   Scenario                                 Time(ms)   TPS          VRAM(GB)  
--------------------------------------------------------------------------------
1   anchor_pool_resolution                   3.13       159973       0.10      
2   cla_metadata_layer                       0.29       5500304      0.05      
3   rotate_kv_quantization                   24.17      1355901      0.20      
4   step_graph_execution                     0.46       218087       0.30      
5   kv_aware_routing                         0.04       225968       0.10      
6   lmcache_bridge_save_load                 0.04       2505889      0.05      
7   atom_plugin_hooks                        0.18       4559106      0.10      
8   pbkv_prediction                          0.12       567289       0.05      
9   workflow_aware_eviction                  0.02       5340168      0.10      
10  embedding_engine_encoding                267.46     20564        0.10      
11  queueing_controller_stability            250.00     4000         0.15      
12  visual_kvcache_cross_agent               150.00     177633       0.01      
13  speculative_coordinator_speedup          100.00     80           0.05      
14  token_dance_compression                  120.00     20000        0.00      
15  jcr_gate_critic_safety                   5.00       1800         0.00      
--------------------------------------------------------------------------------
TOTAL                                                               1.36      

================================================================================
V4.0 METRICS
================================================================================

S-1 anchor_pool_resolution:
  anchor_pool_hit_rate:    0.333
  cla_vram_reduction_pct:  0.00%
  quantization_active:     False
  rotate_kv_blocks:        0
  prefetch_hit_rate:       0.000
  pbkv_accuracy:           0.000
  anchor_locality_score:   0.000
  router_confidence_avg:   0.000
  lmcache_bridge_active:   False
  atom_plugin_init:        False

S-2 cla_metadata_layer:
  anchor_pool_hit_rate:    0.000
  cla_vram_reduction_pct:  50.00%
  quantization_active:     False
  rotate_kv_blocks:        0
  prefetch_hit_rate:       0.000
  pbkv_accuracy:           0.000
  anchor_locality_score:   0.000
  router_confidence_avg:   0.000
  lmcache_bridge_active:   False
  atom_plugin_init:        False

S-3 rotate_kv_quantization:
  anchor_pool_hit_rate:    0.000
  cla_vram_reduction_pct:  0.00%
  quantization_active:     True
  rotate_kv_blocks:        64
  prefetch_hit_rate:       0.000
  pbkv_accuracy:           0.000
  anchor_locality_score:   0.000
  router_confidence_avg:   0.000
  lmcache_bridge_active:   False
  atom_plugin_init:        False

S-4 step_graph_execution:
  anchor_pool_hit_rate:    0.000
  cla_vram_reduction_pct:  0.00%
  quantization_active:     False
  rotate_kv_blocks:        0
  prefetch_hit_rate:       0.500
  pbkv_accuracy:           0.000
  anchor_locality_score:   0.000
  router_confidence_avg:   0.000
  lmcache_bridge_active:   False
  atom_plugin_init:        False

S-5 kv_aware_routing:
  anchor_pool_hit_rate:    0.000
  cla_vram_reduction_pct:  0.00%
  quantization_active:     False
  rotate_kv_blocks:        0
  prefetch_hit_rate:       0.000
  pbkv_accuracy:           0.000
  anchor_locality_score:   0.700
  router_confidence_avg:   0.780
  lmcache_bridge_active:   False
  atom_plugin_init:        False

S-6 lmcache_bridge_save_load:
  anchor_pool_hit_rate:    0.000
  cla_vram_reduction_pct:  0.00%
  quantization_active:     False
  rotate_kv_blocks:        0
  prefetch_hit_rate:       0.000
  pbkv_accuracy:           0.000
  anchor_locality_score:   0.000
  router_confidence_avg:   0.000
  lmcache_bridge_active:   False
  atom_plugin_init:        False

S-7 atom_plugin_hooks:
  anchor_pool_hit_rate:    0.000
  cla_vram_reduction_pct:  0.00%
  quantization_active:     False
  rotate_kv_blocks:        0
  prefetch_hit_rate:       0.000
  pbkv_accuracy:           0.000
  anchor_locality_score:   0.000
  router_confidence_avg:   0.000
  lmcache_bridge_active:   False
  atom_plugin_init:        True

S-8 pbkv_prediction:
  anchor_pool_hit_rate:    0.000
  cla_vram_reduction_pct:  0.00%
  quantization_active:     False
  rotate_kv_blocks:        0
  prefetch_hit_rate:       0.000
  pbkv_accuracy:           0.000
  anchor_locality_score:   0.000
  router_confidence_avg:   0.000
  lmcache_bridge_active:   False
  atom_plugin_init:        False

S-9 workflow_aware_eviction:
  anchor_pool_hit_rate:    0.000
  cla_vram_reduction_pct:  0.00%
  quantization_active:     False
  rotate_kv_blocks:        0
  prefetch_hit_rate:       0.000
  pbkv_accuracy:           0.000
  anchor_locality_score:   0.000
  router_confidence_avg:   0.000
  lmcache_bridge_active:   False
  atom_plugin_init:        False

S-10 embedding_engine_encoding:
  anchor_pool_hit_rate:    1.000
  cla_vram_reduction_pct:  0.00%
  quantization_active:     False
  rotate_kv_blocks:        0
  prefetch_hit_rate:       0.000
  pbkv_accuracy:           0.000
  anchor_locality_score:   0.000
  router_confidence_avg:   0.000
  lmcache_bridge_active:   False
  atom_plugin_init:        False

================================================================================
V5.0 METRICS (S-11, S-12, S-13)
================================================================================

S-11 queueing_controller_stability:
  lambda_critical_observed:     2.500 req/sec
  lambda_critical_predicted:    9.994 req/sec
  lambda_critical_deviation:    0.00%
  stability_rho_at_failure:     0.000
  is_stable:                   True
  [TARGET] deviation < 10%:     ✓ PASS

S-12 visual_kvcache_cross_agent:
  vision_encoder_calls_baseline:   5
  vision_encoder_calls_shared:     1
  vision_encoder_call_reduction:   5.0x
  visual_vram_saved_gb:            0.041 GB
  visual_cache_hit_rate:           1.000
  [TARGET] reduction >= 4x:         ✓ PASS

S-13 speculative_coordinator_speedup:
  speculative_acceptance_rate:    1.000
  speculative_speedup_observed:   8.00x
  draft_token_count:              8
  accepted_token_count:           8
  [TARGET] acceptance_rate > 0.7:   ✓ PASS
  [TARGET] speedup > 2x:             ✓ PASS

================================================================================
V6.0 METRICS (S-14, S-15)
================================================================================

S-14 token_dance_compression:
  token_dance_compression_ratio:   10.81x
  token_dance_n_agents:            12
  token_dance_master_blocks:       200
  token_dance_diff_blocks_total:   21
  reconstruction_max_err:          1.19e-07
  [TARGET] compression >= 10x:      ✓ PASS
  [TARGET] reconstruction ≤ 1e-4:   ✓ PASS

S-15 jcr_gate_critic_safety:
  jcr_critic_dense_rate:           1.000
  jcr_avg_risk_score:              0.794
  jcr_total_decisions:             9
  jcr_inv15_violations:            0
  [TARGET] INV-15 violations == 0:  ✓ PASS
  [TARGET] critic dense rate ≥ 0.5: ✓ PASS

Results saved to: /home/linconx/Apohara-ContextForge/demo/benchmark_v5_results.json
================================================================================