Zekai Wang committed on
Commit · c641413
Parent(s): cbf192e
Release ORCA TTT-Probes (17 configurations across 3 LLMs)
17 trained Test-Time Training probes for the paper Online Reasoning
Calibration: Test-Time Training Enables Generalizable Conformal LLM
Reasoning (arXiv:2604.01170).
Each probe directory contains: probe.pt (state dict), config.json
(training hyperparameters), lambdas.json (LTT thresholds), metrics.json
(savings/error per delta), and ood_*.json (per-OOD-benchmark metrics
for Qwen2.5-32B variants).
Coverage:
- Qwen2.5-32B supervised: no_kq + qk_dh{32,64,128,256,512} + 5 architecture variants
- Qwen2.5-32B consistent: no_kq, qk_dh128
- QwQ-32B supervised: no_kq, qk_dh128
- Llama-3.3-70B supervised: no_kq, qk_dh128
This view is limited to 50 files because it contains too many changes. See raw diff
- README.md +69 -3
- llama-3.3-70b/supervised/no_kq/config.json +43 -0
- llama-3.3-70b/supervised/no_kq/lambdas.json +13 -0
- llama-3.3-70b/supervised/no_kq/metrics.json +70 -0
- llama-3.3-70b/supervised/no_kq/ood_aime24.json +68 -0
- llama-3.3-70b/supervised/no_kq/ood_aime25.json +68 -0
- llama-3.3-70b/supervised/no_kq/ood_aime26.json +68 -0
- llama-3.3-70b/supervised/no_kq/ood_gpqa_diamond.json +68 -0
- llama-3.3-70b/supervised/no_kq/ood_math500.json +68 -0
- llama-3.3-70b/supervised/no_kq/probe.pt +3 -0
- llama-3.3-70b/supervised/qk_dh128/config.json +43 -0
- llama-3.3-70b/supervised/qk_dh128/lambdas.json +13 -0
- llama-3.3-70b/supervised/qk_dh128/metrics.json +70 -0
- llama-3.3-70b/supervised/qk_dh128/ood_aime24.json +68 -0
- llama-3.3-70b/supervised/qk_dh128/ood_aime25.json +68 -0
- llama-3.3-70b/supervised/qk_dh128/ood_aime26.json +68 -0
- llama-3.3-70b/supervised/qk_dh128/ood_gpqa_diamond.json +68 -0
- llama-3.3-70b/supervised/qk_dh128/ood_math500.json +68 -0
- llama-3.3-70b/supervised/qk_dh128/probe.pt +3 -0
- qwen2.5-32b/consistent/no_kq/config.json +42 -0
- qwen2.5-32b/consistent/no_kq/lambdas.json +13 -0
- qwen2.5-32b/consistent/no_kq/metrics.json +70 -0
- qwen2.5-32b/consistent/no_kq/ood_aime24.json +68 -0
- qwen2.5-32b/consistent/no_kq/ood_aime25.json +68 -0
- qwen2.5-32b/consistent/no_kq/ood_aime26.json +68 -0
- qwen2.5-32b/consistent/no_kq/ood_gpqa_diamond.json +68 -0
- qwen2.5-32b/consistent/no_kq/ood_math500.json +68 -0
- qwen2.5-32b/consistent/no_kq/probe.pt +3 -0
- qwen2.5-32b/consistent/qk_dh128/config.json +42 -0
- qwen2.5-32b/consistent/qk_dh128/lambdas.json +13 -0
- qwen2.5-32b/consistent/qk_dh128/metrics.json +70 -0
- qwen2.5-32b/consistent/qk_dh128/ood_aime24.json +68 -0
- qwen2.5-32b/consistent/qk_dh128/ood_aime25.json +68 -0
- qwen2.5-32b/consistent/qk_dh128/ood_aime26.json +68 -0
- qwen2.5-32b/consistent/qk_dh128/ood_gpqa_diamond.json +68 -0
- qwen2.5-32b/consistent/qk_dh128/ood_math500.json +68 -0
- qwen2.5-32b/consistent/qk_dh128/probe.pt +3 -0
- qwen2.5-32b/supervised/no_kq/config.json +42 -0
- qwen2.5-32b/supervised/no_kq/lambdas.json +13 -0
- qwen2.5-32b/supervised/no_kq/metrics.json +70 -0
- qwen2.5-32b/supervised/no_kq/ood_aime24.json +68 -0
- qwen2.5-32b/supervised/no_kq/ood_aime25.json +68 -0
- qwen2.5-32b/supervised/no_kq/ood_aime26.json +68 -0
- qwen2.5-32b/supervised/no_kq/ood_gpqa_diamond.json +68 -0
- qwen2.5-32b/supervised/no_kq/ood_math500.json +68 -0
- qwen2.5-32b/supervised/no_kq/probe.pt +3 -0
- qwen2.5-32b/supervised/qk_dh128/config.json +42 -0
- qwen2.5-32b/supervised/qk_dh128/lambdas.json +13 -0
- qwen2.5-32b/supervised/qk_dh128/metrics.json +70 -0
- qwen2.5-32b/supervised/qk_dh128/ood_aime24.json +68 -0
README.md
CHANGED
@@ -1,3 +1,69 @@
----
-license: mit
-
+---
+license: mit
+library_name: pytorch
+tags:
+- test-time-training
+- conformal-prediction
+- reasoning
+- early-stopping
+- llm
+datasets:
+- wzekai99/ORCA
+---
+
+# ORCA TTT-Probes
+
+Trained Test-Time Training probes for *Online Reasoning Calibration: Test-Time Training Enables Generalizable Conformal LLM Reasoning* ([arXiv:2604.01170](https://arxiv.org/abs/2604.01170)).
+
+## Layout (17 probes)
+
+```
+qwen2.5-32b/supervised/{no_kq, qk_dh128,
+qk_dh32, qk_dh64, qk_dh256, qk_dh512,
+qk_dh128_ln, qk_dh128_ln_res, qk_dh128_share_kq,
+qk_dh128_eta_learn, qk_dh128_mlp}/
+qwen2.5-32b/consistent/{no_kq, qk_dh128}/
+qwq-32b/supervised/{no_kq, qk_dh128}/
+llama-3.3-70b/supervised/{no_kq, qk_dh128}/
+```
+
+Per probe directory:
+
+| File | Contents |
+|-------------------|----------------------------------------------------------------|
+| `probe.pt` | State dict: W0, b0, log_eta; QK variants also include theta_K, theta_Q |
+| `config.json` | Training hyperparameters (d_hidden, base_lr, epochs, ...) |
+| `lambdas.json` | LTT thresholds, keyed by delta |
+| `metrics.json` | Step-level savings and error rate per delta |
+| `ood_*.json` | Per-OOD-benchmark metrics (Qwen2.5-32B probes only) |
+
+## Use
+
+Probes are loaded by the `TTTProbe` class in https://github.com/wzekai99/ORCA. Quick example:
+
+```bash
+hf download wzekai99/ORCA --local-dir probes
+hf download wzekai99/ORCA --repo-type dataset --local-dir data
+python code/test.py \
+--method ttt --no_kq \
+--dataset_path data/qwen2.5-32b/s1k.pkl \
+data/qwen2.5-32b/openr1_2k.pkl \
+data/qwen2.5-32b/deepmath_2k.pkl \
+--probe_path probes/qwen2.5-32b/supervised/no_kq/probe.pt \
+--label_mode supervised --delta 0.1 --epsilon 0.05
+```
+
+## License
+
+MIT.
+
+## Citation
+
+```bibtex
+@article{zhou2026online,
+title={Online Reasoning Calibration: Test-Time Training Enables Generalizable Conformal LLM Reasoning},
+author={Zhou, Cai and Wang, Zekai and Wu, Menghua and Zhu, Qianyu Julie and Shi, Flora C and Wang, Chenyu and Wilson, Ashia and Jaakkola, Tommi and Bates, Stephen},
+journal={arXiv preprint arXiv:2604.01170},
+year={2026}
+}
+```
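The brace-style layout in the README can be expanded into explicit directory names. The following is a minimal sketch, not code from the ORCA repository: `LAYOUT` and `probe_dirs` are hypothetical names, and the variant lists are copied by hand from the README's "Layout (17 probes)" block.

```python
# Variant names copied from the README layout; the dict and helper are illustrative only.
LAYOUT = {
    "qwen2.5-32b/supervised": [
        "no_kq", "qk_dh128",
        "qk_dh32", "qk_dh64", "qk_dh256", "qk_dh512",
        "qk_dh128_ln", "qk_dh128_ln_res", "qk_dh128_share_kq",
        "qk_dh128_eta_learn", "qk_dh128_mlp",
    ],
    "qwen2.5-32b/consistent": ["no_kq", "qk_dh128"],
    "qwq-32b/supervised": ["no_kq", "qk_dh128"],
    "llama-3.3-70b/supervised": ["no_kq", "qk_dh128"],
}


def probe_dirs(layout=LAYOUT):
    """Expand the layout into one relative path per probe directory."""
    return [
        f"{prefix}/{variant}"
        for prefix, variants in layout.items()
        for variant in variants
    ]


dirs = probe_dirs()
print(len(dirs))  # count of probe directories implied by the layout
for d in dirs:
    print(d)
```

Expanding the layout this way gives a quick check that the heading's count of 17 agrees with the listed variants (11 + 2 + 2 + 2).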
llama-3.3-70b/supervised/no_kq/config.json
ADDED
@@ -0,0 +1,43 @@
+{
+  "config": "configs/llama70b_5k.yaml",
+  "method": "ttt",
+  "dataset_path": [
+    "data_prepare/output/llama70b/s1k/dataset.pkl",
+    "data_prepare/output/llama70b/openr1_2k/dataset.pkl",
+    "data_prepare/output/llama70b/deepmath_2k/dataset.pkl"
+  ],
+  "ood_paths": [
+    "data_prepare/output/llama70b/aime24/dataset.pkl",
+    "data_prepare/output/llama70b/aime25/dataset.pkl",
+    "data_prepare/output/llama70b/aime26/dataset.pkl",
+    "data_prepare/output/llama70b/math500/dataset.pkl",
+    "data_prepare/output/llama70b/gpqa_diamond/dataset.pkl"
+  ],
+  "output_dir": "results/llama70b_5k",
+  "label_mode": "supervised",
+  "batch_size": 10,
+  "seed": 42,
+  "smooth_window": 10,
+  "run_name": "ttt__no_kq__lr0.01__ep40",
+  "d_hidden": 64,
+  "use_ln": false,
+  "use_residual": false,
+  "learnable_eta": false,
+  "base_lr": 0.01,
+  "share_kq": false,
+  "use_mlp": false,
+  "use_pca": false,
+  "pca_dim": 256,
+  "epochs": 20,
+  "outer_lr": 0.001,
+  "no_meta_train": false,
+  "no_online_update": false,
+  "no_kq": true,
+  "grad_clip": 1.0,
+  "force_retrain": true,
+  "save_every": 10,
+  "d_phi": 8192,
+  "timestamp": "2026-03-30T01:32:49.432549",
+  "release_target": "llama-3.3-70b/supervised/no_kq",
+  "release_probe_source": "llama70b_5k/supervised/ttt__no_kq__lr0.01__ep40/checkpoints/probe_ep20.pt"
+}
llama-3.3-70b/supervised/no_kq/lambdas.json
ADDED
@@ -0,0 +1,13 @@
+{
+  "0.01": 0.9382,
+  "0.025": 0.9159,
+  "0.05": 0.8886000000000001,
+  "0.1": 0.8489,
+  "0.15": 0.8142,
+  "0.2": 0.7734,
+  "0.25": 0.7363,
+  "0.3": 0.7017,
+  "0.35": 0.6558999999999999,
+  "0.4": 0.5794,
+  "0.5": 9.999999999998899e-05
+}
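Because `lambdas.json` keys are strings and some values carry floating-point noise (for example `0.8886000000000001`), a lookup by numeric level needs a little care. The sketch below is an assumption about how one might read the file, not the repository's loader: `threshold_for` is a hypothetical helper, and the JSON is inlined from the diff above instead of being read from disk.

```python
import json
import math

# Contents of llama-3.3-70b/supervised/no_kq/lambdas.json, inlined for the example.
LAMBDAS_JSON = """
{"0.01": 0.9382, "0.025": 0.9159, "0.05": 0.8886000000000001, "0.1": 0.8489,
 "0.15": 0.8142, "0.2": 0.7734, "0.25": 0.7363, "0.3": 0.7017,
 "0.35": 0.6558999999999999, "0.4": 0.5794, "0.5": 9.999999999998899e-05}
"""


def threshold_for(lambdas, level, ndigits=4):
    """Return the stored threshold for a risk level, matching string keys numerically."""
    for key, value in lambdas.items():
        if math.isclose(float(key), level, abs_tol=1e-12):
            # Rounding hides float noise such as 0.8886000000000001.
            return round(value, ndigits)
    raise KeyError(f"no threshold stored for level {level}")


lambdas = json.loads(LAMBDAS_JSON)
print(threshold_for(lambdas, 0.05))
print(threshold_for(lambdas, 0.35))
```

Raising `KeyError` for a level that is not stored, instead of interpolating between neighbours, keeps the lookup from inventing a threshold that was never calibrated.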
llama-3.3-70b/supervised/no_kq/metrics.json
ADDED
@@ -0,0 +1,70 @@
+{
+  "eps_results": {
+    "0.01": {
+      "lambda": 0.9382,
+      "error_rate": 0.0086,
+      "savings": 0.052,
+      "accuracy": 0.9914
+    },
+    "0.025": {
+      "lambda": 0.9159,
+      "error_rate": 0.0235,
+      "savings": 0.1457,
+      "accuracy": 0.9765
+    },
+    "0.05": {
+      "lambda": 0.8886000000000001,
+      "error_rate": 0.046,
+      "savings": 0.2702,
+      "accuracy": 0.954
+    },
+    "0.1": {
+      "lambda": 0.8489,
+      "error_rate": 0.0898,
+      "savings": 0.4238,
+      "accuracy": 0.9102
+    },
+    "0.15": {
+      "lambda": 0.8142,
+      "error_rate": 0.1305,
+      "savings": 0.5281,
+      "accuracy": 0.8695
+    },
+    "0.2": {
+      "lambda": 0.7734,
+      "error_rate": 0.1861,
+      "savings": 0.6321,
+      "accuracy": 0.8139
+    },
+    "0.25": {
+      "lambda": 0.7363,
+      "error_rate": 0.2257,
+      "savings": 0.7091,
+      "accuracy": 0.7743
+    },
+    "0.3": {
+      "lambda": 0.7017,
+      "error_rate": 0.2717,
+      "savings": 0.7679,
+      "accuracy": 0.7283
+    },
+    "0.35": {
+      "lambda": 0.6558999999999999,
+      "error_rate": 0.323,
+      "savings": 0.834,
+      "accuracy": 0.677
+    },
+    "0.4": {
+      "lambda": 0.5794,
+      "error_rate": 0.3775,
+      "savings": 0.9036,
+      "accuracy": 0.6225
+    },
+    "0.5": {
+      "lambda": 9.999999999998899e-05,
+      "error_rate": 0.4075,
+      "savings": 0.9497,
+      "accuracy": 0.5925
+    }
+  }
+}
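The `eps_results` table trades error rate against savings, so one natural way to read it is to pick the entry with the highest savings under an error budget. This is a sketch under stated assumptions: the top-level keys are treated as the risk level (the README calls it delta), `best_operating_point` and `accuracy_is_complement` are hypothetical helpers, and the numbers are copied from the `metrics.json` diff above.

```python
import math

# error_rate, savings and accuracy per level, copied from metrics.json above.
EPS_RESULTS = {
    "0.01": {"error_rate": 0.0086, "savings": 0.052, "accuracy": 0.9914},
    "0.025": {"error_rate": 0.0235, "savings": 0.1457, "accuracy": 0.9765},
    "0.05": {"error_rate": 0.046, "savings": 0.2702, "accuracy": 0.954},
    "0.1": {"error_rate": 0.0898, "savings": 0.4238, "accuracy": 0.9102},
    "0.15": {"error_rate": 0.1305, "savings": 0.5281, "accuracy": 0.8695},
    "0.2": {"error_rate": 0.1861, "savings": 0.6321, "accuracy": 0.8139},
    "0.25": {"error_rate": 0.2257, "savings": 0.7091, "accuracy": 0.7743},
    "0.3": {"error_rate": 0.2717, "savings": 0.7679, "accuracy": 0.7283},
    "0.35": {"error_rate": 0.323, "savings": 0.834, "accuracy": 0.677},
    "0.4": {"error_rate": 0.3775, "savings": 0.9036, "accuracy": 0.6225},
    "0.5": {"error_rate": 0.4075, "savings": 0.9497, "accuracy": 0.5925},
}


def best_operating_point(results, max_error):
    """Return (level, entry) with the highest savings whose error_rate <= max_error."""
    feasible = {k: v for k, v in results.items() if v["error_rate"] <= max_error}
    if not feasible:
        return None
    level = max(feasible, key=lambda k: feasible[k]["savings"])
    return level, feasible[level]


def accuracy_is_complement(results, tol=1e-9):
    """Check that accuracy + error_rate is 1 in every entry (up to float noise)."""
    return all(
        math.isclose(v["accuracy"] + v["error_rate"], 1.0, abs_tol=tol)
        for v in results.values()
    )


print(best_operating_point(EPS_RESULTS, max_error=0.10))
print(accuracy_is_complement(EPS_RESULTS))
```

Returning `None` when no entry fits the budget makes the "nothing qualifies" case explicit instead of silently falling back to the strictest level.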
llama-3.3-70b/supervised/no_kq/ood_aime24.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9382,
+    "error_rate": 0.0,
+    "savings": 0.0024,
+    "accuracy": 1.0
+  },
+  "0.025": {
+    "lambda": 0.9159,
+    "error_rate": 0.0,
+    "savings": 0.0338,
+    "accuracy": 1.0
+  },
+  "0.05": {
+    "lambda": 0.8886000000000001,
+    "error_rate": 0.0,
+    "savings": 0.0952,
+    "accuracy": 1.0
+  },
+  "0.1": {
+    "lambda": 0.8489,
+    "error_rate": 0.0435,
+    "savings": 0.2057,
+    "accuracy": 0.9565
+  },
+  "0.15": {
+    "lambda": 0.8142,
+    "error_rate": 0.087,
+    "savings": 0.3153,
+    "accuracy": 0.913
+  },
+  "0.2": {
+    "lambda": 0.7734,
+    "error_rate": 0.2174,
+    "savings": 0.3871,
+    "accuracy": 0.7826
+  },
+  "0.25": {
+    "lambda": 0.7363,
+    "error_rate": 0.2609,
+    "savings": 0.5131,
+    "accuracy": 0.7391
+  },
+  "0.3": {
+    "lambda": 0.7017,
+    "error_rate": 0.3043,
+    "savings": 0.5721,
+    "accuracy": 0.6957
+  },
+  "0.35": {
+    "lambda": 0.6558999999999999,
+    "error_rate": 0.3913,
+    "savings": 0.6936,
+    "accuracy": 0.6087
+  },
+  "0.4": {
+    "lambda": 0.5794,
+    "error_rate": 0.4783,
+    "savings": 0.7992,
+    "accuracy": 0.5217
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.5217,
+    "savings": 0.9626,
+    "accuracy": 0.4783
+  }
+}
llama-3.3-70b/supervised/no_kq/ood_aime25.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9382,
+    "error_rate": 0.0,
+    "savings": 0.0,
+    "accuracy": 1.0
+  },
+  "0.025": {
+    "lambda": 0.9159,
+    "error_rate": 0.0,
+    "savings": 0.0118,
+    "accuracy": 1.0
+  },
+  "0.05": {
+    "lambda": 0.8886000000000001,
+    "error_rate": 0.0,
+    "savings": 0.1162,
+    "accuracy": 1.0
+  },
+  "0.1": {
+    "lambda": 0.8489,
+    "error_rate": 0.0476,
+    "savings": 0.2534,
+    "accuracy": 0.9524
+  },
+  "0.15": {
+    "lambda": 0.8142,
+    "error_rate": 0.0952,
+    "savings": 0.3326,
+    "accuracy": 0.9048
+  },
+  "0.2": {
+    "lambda": 0.7734,
+    "error_rate": 0.2381,
+    "savings": 0.4854,
+    "accuracy": 0.7619
+  },
+  "0.25": {
+    "lambda": 0.7363,
+    "error_rate": 0.2381,
+    "savings": 0.5396,
+    "accuracy": 0.7619
+  },
+  "0.3": {
+    "lambda": 0.7017,
+    "error_rate": 0.3333,
+    "savings": 0.7042,
+    "accuracy": 0.6667
+  },
+  "0.35": {
+    "lambda": 0.6558999999999999,
+    "error_rate": 0.4286,
+    "savings": 0.7611,
+    "accuracy": 0.5714
+  },
+  "0.4": {
+    "lambda": 0.5794,
+    "error_rate": 0.6667,
+    "savings": 0.8989,
+    "accuracy": 0.3333
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.7619,
+    "savings": 0.9683,
+    "accuracy": 0.2381
+  }
+}
llama-3.3-70b/supervised/no_kq/ood_aime26.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9382,
+    "error_rate": 0.0,
+    "savings": 0.0131,
+    "accuracy": 1.0
+  },
+  "0.025": {
+    "lambda": 0.9159,
+    "error_rate": 0.0,
+    "savings": 0.0246,
+    "accuracy": 1.0
+  },
+  "0.05": {
+    "lambda": 0.8886000000000001,
+    "error_rate": 0.0,
+    "savings": 0.0873,
+    "accuracy": 1.0
+  },
+  "0.1": {
+    "lambda": 0.8489,
+    "error_rate": 0.0385,
+    "savings": 0.2188,
+    "accuracy": 0.9615
+  },
+  "0.15": {
+    "lambda": 0.8142,
+    "error_rate": 0.1154,
+    "savings": 0.3183,
+    "accuracy": 0.8846
+  },
+  "0.2": {
+    "lambda": 0.7734,
+    "error_rate": 0.2692,
+    "savings": 0.5766,
+    "accuracy": 0.7308
+  },
+  "0.25": {
+    "lambda": 0.7363,
+    "error_rate": 0.3846,
+    "savings": 0.6703,
+    "accuracy": 0.6154
+  },
+  "0.3": {
+    "lambda": 0.7017,
+    "error_rate": 0.4231,
+    "savings": 0.734,
+    "accuracy": 0.5769
+  },
+  "0.35": {
+    "lambda": 0.6558999999999999,
+    "error_rate": 0.5385,
+    "savings": 0.8369,
+    "accuracy": 0.4615
+  },
+  "0.4": {
+    "lambda": 0.5794,
+    "error_rate": 0.6154,
+    "savings": 0.9442,
+    "accuracy": 0.3846
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.6154,
+    "savings": 0.9686,
+    "accuracy": 0.3846
+  }
+}
llama-3.3-70b/supervised/no_kq/ood_gpqa_diamond.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9382,
+    "error_rate": 0.0377,
+    "savings": 0.097,
+    "accuracy": 0.9623
+  },
+  "0.025": {
+    "lambda": 0.9159,
+    "error_rate": 0.0849,
+    "savings": 0.2106,
+    "accuracy": 0.9151
+  },
+  "0.05": {
+    "lambda": 0.8886000000000001,
+    "error_rate": 0.1887,
+    "savings": 0.3912,
+    "accuracy": 0.8113
+  },
+  "0.1": {
+    "lambda": 0.8489,
+    "error_rate": 0.3491,
+    "savings": 0.6266,
+    "accuracy": 0.6509
+  },
+  "0.15": {
+    "lambda": 0.8142,
+    "error_rate": 0.3868,
+    "savings": 0.7771,
+    "accuracy": 0.6132
+  },
+  "0.2": {
+    "lambda": 0.7734,
+    "error_rate": 0.4434,
+    "savings": 0.8936,
+    "accuracy": 0.5566
+  },
+  "0.25": {
+    "lambda": 0.7363,
+    "error_rate": 0.4528,
+    "savings": 0.9361,
+    "accuracy": 0.5472
+  },
+  "0.3": {
+    "lambda": 0.7017,
+    "error_rate": 0.4811,
+    "savings": 0.9536,
+    "accuracy": 0.5189
+  },
+  "0.35": {
+    "lambda": 0.6558999999999999,
+    "error_rate": 0.4811,
+    "savings": 0.9657,
+    "accuracy": 0.5189
+  },
+  "0.4": {
+    "lambda": 0.5794,
+    "error_rate": 0.4811,
+    "savings": 0.9695,
+    "accuracy": 0.5189
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.4811,
+    "savings": 0.9695,
+    "accuracy": 0.5189
+  }
+}
llama-3.3-70b/supervised/no_kq/ood_math500.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9382,
+    "error_rate": 0.002,
+    "savings": 0.1291,
+    "accuracy": 0.998
+  },
+  "0.025": {
+    "lambda": 0.9159,
+    "error_rate": 0.0041,
+    "savings": 0.2712,
+    "accuracy": 0.9959
+  },
+  "0.05": {
+    "lambda": 0.8886000000000001,
+    "error_rate": 0.0122,
+    "savings": 0.4343,
+    "accuracy": 0.9878
+  },
+  "0.1": {
+    "lambda": 0.8489,
+    "error_rate": 0.0265,
+    "savings": 0.599,
+    "accuracy": 0.9735
+  },
+  "0.15": {
+    "lambda": 0.8142,
+    "error_rate": 0.0407,
+    "savings": 0.6907,
+    "accuracy": 0.9593
+  },
+  "0.2": {
+    "lambda": 0.7734,
+    "error_rate": 0.0713,
+    "savings": 0.7782,
+    "accuracy": 0.9287
+  },
+  "0.25": {
+    "lambda": 0.7363,
+    "error_rate": 0.0774,
+    "savings": 0.8149,
+    "accuracy": 0.9226
+  },
+  "0.3": {
+    "lambda": 0.7017,
+    "error_rate": 0.0957,
+    "savings": 0.8389,
+    "accuracy": 0.9043
+  },
+  "0.35": {
+    "lambda": 0.6558999999999999,
+    "error_rate": 0.1079,
+    "savings": 0.8603,
+    "accuracy": 0.8921
+  },
+  "0.4": {
+    "lambda": 0.5794,
+    "error_rate": 0.1161,
+    "savings": 0.8721,
+    "accuracy": 0.8839
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.1181,
+    "savings": 0.8764,
+    "accuracy": 0.8819
+  }
+}
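The OOD files reuse the thresholds from `lambdas.json`, so a simple descriptive check is to compare each benchmark's `error_rate` with the level it was calibrated for. The sketch below is illustrative only: `levels_exceeded` is a hypothetical helper, the error rates are copied from the `ood_gpqa_diamond.json` and `ood_math500.json` diffs above (llama-3.3-70b, supervised, `no_kq`), and treating the key as the target error level is an assumption. It makes no claim about what the paper's guarantee covers on out-of-distribution data.

```python
# error_rate per level for two OOD benchmarks, copied from the diffs above.
GPQA_DIAMOND = {
    "0.01": 0.0377, "0.025": 0.0849, "0.05": 0.1887, "0.1": 0.3491,
    "0.15": 0.3868, "0.2": 0.4434, "0.25": 0.4528, "0.3": 0.4811,
    "0.35": 0.4811, "0.4": 0.4811, "0.5": 0.4811,
}
MATH500 = {
    "0.01": 0.002, "0.025": 0.0041, "0.05": 0.0122, "0.1": 0.0265,
    "0.15": 0.0407, "0.2": 0.0713, "0.25": 0.0774, "0.3": 0.0957,
    "0.35": 0.1079, "0.4": 0.1161, "0.5": 0.1181,
}


def levels_exceeded(error_by_level):
    """List the levels whose observed error rate is above the level itself."""
    return [level for level, err in error_by_level.items() if err > float(level)]


print("gpqa_diamond:", levels_exceeded(GPQA_DIAMOND))
print("math500:", levels_exceeded(MATH500))
```

Keeping the levels as strings in the result makes it easy to index back into the original JSON objects for the matching `savings` and `lambda` values.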
llama-3.3-70b/supervised/no_kq/probe.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a4fd23e7f353e515c4829282b8ff92f01ce1ea5b447da6219a2a42dea3b4af8f
+size 34940
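The `probe.pt` entry in the diff is a Git LFS pointer, not the checkpoint itself: three `key value` lines giving the spec version, the SHA-256 of the real file, and its size in bytes. A minimal sketch of parsing such a pointer and checking a downloaded blob against it follows; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and only the Python standard library is used.

```python
import hashlib

# Pointer text copied from the probe.pt diff above.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:a4fd23e7f353e515c4829282b8ff92f01ce1ea5b447da6219a2a42dea3b4af8f
size 34940
"""


def parse_lfs_pointer(text):
    """Split a Git LFS pointer into its version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(blob, pointer):
    """Check that raw file bytes have the size and SHA-256 the pointer records."""
    return (
        len(blob) == pointer["size"]
        and hashlib.sha256(blob).hexdigest() == pointer["digest"]
    )


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

In practice one would read the downloaded file with `open(path, "rb").read()` and pass the bytes to `matches_pointer` before loading the state dict.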
llama-3.3-70b/supervised/qk_dh128/config.json
ADDED
@@ -0,0 +1,43 @@
+{
+  "config": "configs/llama70b_5k.yaml",
+  "method": "ttt",
+  "dataset_path": [
+    "data_prepare/output/llama70b/s1k/dataset.pkl",
+    "data_prepare/output/llama70b/openr1_2k/dataset.pkl",
+    "data_prepare/output/llama70b/deepmath_2k/dataset.pkl"
+  ],
+  "ood_paths": [
+    "data_prepare/output/llama70b/aime24/dataset.pkl",
+    "data_prepare/output/llama70b/aime25/dataset.pkl",
+    "data_prepare/output/llama70b/aime26/dataset.pkl",
+    "data_prepare/output/llama70b/math500/dataset.pkl",
+    "data_prepare/output/llama70b/gpqa_diamond/dataset.pkl"
+  ],
+  "output_dir": "results/llama70b_5k",
+  "label_mode": "supervised",
+  "batch_size": 10,
+  "seed": 42,
+  "smooth_window": 10,
+  "run_name": "ttt__dh128__lr0.01__ep40",
+  "d_hidden": 128,
+  "use_ln": false,
+  "use_residual": false,
+  "learnable_eta": false,
+  "base_lr": 0.01,
+  "share_kq": false,
+  "use_mlp": false,
+  "use_pca": false,
+  "pca_dim": 256,
+  "epochs": 10,
+  "outer_lr": 0.001,
+  "no_meta_train": false,
+  "no_online_update": false,
+  "no_kq": false,
+  "grad_clip": 1.0,
+  "force_retrain": true,
+  "save_every": 10,
+  "d_phi": 8192,
+  "timestamp": "2026-03-30T01:38:20.174996",
+  "release_target": "llama-3.3-70b/supervised/qk_dh128",
+  "release_probe_source": "llama70b_5k/supervised/ttt__dh128__lr0.01__final_ep10/probe.pt"
+}
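The two Llama configs differ mainly in `no_kq` and `d_hidden`, and those two fields appear to line up with the last segment of `release_target`. The sketch below encodes that observation as a naming rule; it is an inference from the two configs shown in this diff, not a documented convention, and `variant_name` is a hypothetical helper that does not cover the architecture variants (`_ln`, `_mlp`, and so on) listed in the README.

```python
# Only the fields needed for the example, copied from the two config.json diffs.
NO_KQ_CONFIG = {
    "no_kq": True,
    "d_hidden": 64,
    "release_target": "llama-3.3-70b/supervised/no_kq",
}
QK_CONFIG = {
    "no_kq": False,
    "d_hidden": 128,
    "release_target": "llama-3.3-70b/supervised/qk_dh128",
}


def variant_name(config):
    """Guess the probe directory name from a config (assumed rule, base variants only)."""
    if config["no_kq"]:
        return "no_kq"
    return f"qk_dh{config['d_hidden']}"


for cfg in (NO_KQ_CONFIG, QK_CONFIG):
    expected = cfg["release_target"].rsplit("/", 1)[-1]
    print(variant_name(cfg), expected, variant_name(cfg) == expected)
```

A check like this can catch a config that was copied into the wrong release directory, though the `release_target` field itself remains the authoritative record.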
llama-3.3-70b/supervised/qk_dh128/lambdas.json
ADDED
@@ -0,0 +1,13 @@
+{
+  "0.01": 0.9969,
+  "0.025": 0.9913,
+  "0.05": 0.9856,
+  "0.1": 0.971,
+  "0.15": 0.9573,
+  "0.2": 0.9441,
+  "0.25": 0.9275,
+  "0.3": 0.9108,
+  "0.35": 0.877,
+  "0.4": 0.8209,
+  "0.5": 9.999999999998899e-05
+}
llama-3.3-70b/supervised/qk_dh128/metrics.json
ADDED
@@ -0,0 +1,70 @@
+{
+  "eps_results": {
+    "0.01": {
+      "lambda": 0.9969,
+      "error_rate": 0.0053,
+      "savings": 0.0223,
+      "accuracy": 0.9947
+    },
+    "0.025": {
+      "lambda": 0.9913,
+      "error_rate": 0.016,
+      "savings": 0.0884,
+      "accuracy": 0.984
+    },
+    "0.05": {
+      "lambda": 0.9856,
+      "error_rate": 0.0385,
+      "savings": 0.1767,
+      "accuracy": 0.9615
+    },
+    "0.1": {
+      "lambda": 0.971,
+      "error_rate": 0.0813,
+      "savings": 0.378,
+      "accuracy": 0.9187
+    },
+    "0.15": {
+      "lambda": 0.9573,
+      "error_rate": 0.139,
+      "savings": 0.5199,
+      "accuracy": 0.861
+    },
+    "0.2": {
+      "lambda": 0.9441,
+      "error_rate": 0.1754,
+      "savings": 0.6083,
+      "accuracy": 0.8246
+    },
+    "0.25": {
+      "lambda": 0.9275,
+      "error_rate": 0.2235,
+      "savings": 0.7029,
+      "accuracy": 0.7765
+    },
+    "0.3": {
+      "lambda": 0.9108,
+      "error_rate": 0.2556,
+      "savings": 0.7558,
+      "accuracy": 0.7444
+    },
+    "0.35": {
+      "lambda": 0.877,
+      "error_rate": 0.3123,
+      "savings": 0.8364,
+      "accuracy": 0.6877
+    },
+    "0.4": {
+      "lambda": 0.8209,
+      "error_rate": 0.3679,
+      "savings": 0.9008,
+      "accuracy": 0.6321
+    },
+    "0.5": {
+      "lambda": 9.999999999998899e-05,
+      "error_rate": 0.4075,
+      "savings": 0.9497,
+      "accuracy": 0.5925
+    }
+  }
+}
llama-3.3-70b/supervised/qk_dh128/ood_aime24.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9969,
+    "error_rate": 0.0,
+    "savings": 0.0,
+    "accuracy": 1.0
+  },
+  "0.025": {
+    "lambda": 0.9913,
+    "error_rate": 0.0,
+    "savings": 0.0245,
+    "accuracy": 1.0
+  },
+  "0.05": {
+    "lambda": 0.9856,
+    "error_rate": 0.0,
+    "savings": 0.0647,
+    "accuracy": 1.0
+  },
+  "0.1": {
+    "lambda": 0.971,
+    "error_rate": 0.087,
+    "savings": 0.1996,
+    "accuracy": 0.913
+  },
+  "0.15": {
+    "lambda": 0.9573,
+    "error_rate": 0.1739,
+    "savings": 0.402,
+    "accuracy": 0.8261
+  },
+  "0.2": {
+    "lambda": 0.9441,
+    "error_rate": 0.1739,
+    "savings": 0.4575,
+    "accuracy": 0.8261
+  },
+  "0.25": {
+    "lambda": 0.9275,
+    "error_rate": 0.3478,
+    "savings": 0.5821,
+    "accuracy": 0.6522
+  },
+  "0.3": {
+    "lambda": 0.9108,
+    "error_rate": 0.3913,
+    "savings": 0.7312,
+    "accuracy": 0.6087
+  },
+  "0.35": {
+    "lambda": 0.877,
+    "error_rate": 0.4783,
+    "savings": 0.8874,
+    "accuracy": 0.5217
+  },
+  "0.4": {
+    "lambda": 0.8209,
+    "error_rate": 0.5217,
+    "savings": 0.927,
+    "accuracy": 0.4783
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.5217,
+    "savings": 0.9626,
+    "accuracy": 0.4783
+  }
+}
llama-3.3-70b/supervised/qk_dh128/ood_aime25.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9969,
+    "error_rate": 0.0,
+    "savings": 0.0,
+    "accuracy": 1.0
+  },
+  "0.025": {
+    "lambda": 0.9913,
+    "error_rate": 0.0,
+    "savings": 0.0292,
+    "accuracy": 1.0
+  },
+  "0.05": {
+    "lambda": 0.9856,
+    "error_rate": 0.0,
+    "savings": 0.0833,
+    "accuracy": 1.0
+  },
+  "0.1": {
+    "lambda": 0.971,
+    "error_rate": 0.0952,
+    "savings": 0.3089,
+    "accuracy": 0.9048
+  },
+  "0.15": {
+    "lambda": 0.9573,
+    "error_rate": 0.1429,
+    "savings": 0.3788,
+    "accuracy": 0.8571
+  },
+  "0.2": {
+    "lambda": 0.9441,
+    "error_rate": 0.1429,
+    "savings": 0.424,
+    "accuracy": 0.8571
+  },
+  "0.25": {
+    "lambda": 0.9275,
+    "error_rate": 0.2857,
+    "savings": 0.5585,
+    "accuracy": 0.7143
+  },
+  "0.3": {
+    "lambda": 0.9108,
+    "error_rate": 0.381,
+    "savings": 0.6373,
+    "accuracy": 0.619
+  },
+  "0.35": {
+    "lambda": 0.877,
+    "error_rate": 0.5238,
+    "savings": 0.7687,
+    "accuracy": 0.4762
+  },
+  "0.4": {
+    "lambda": 0.8209,
+    "error_rate": 0.6667,
+    "savings": 0.9211,
+    "accuracy": 0.3333
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.7619,
+    "savings": 0.9683,
+    "accuracy": 0.2381
+  }
+}
llama-3.3-70b/supervised/qk_dh128/ood_aime26.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9969,
+    "error_rate": 0.0,
+    "savings": 0.017,
+    "accuracy": 1.0
+  },
+  "0.025": {
+    "lambda": 0.9913,
+    "error_rate": 0.0,
+    "savings": 0.0263,
+    "accuracy": 1.0
+  },
+  "0.05": {
+    "lambda": 0.9856,
+    "error_rate": 0.0385,
+    "savings": 0.0995,
+    "accuracy": 0.9615
+  },
+  "0.1": {
+    "lambda": 0.971,
+    "error_rate": 0.1154,
+    "savings": 0.3059,
+    "accuracy": 0.8846
+  },
+  "0.15": {
+    "lambda": 0.9573,
+    "error_rate": 0.2692,
+    "savings": 0.5312,
+    "accuracy": 0.7308
+  },
+  "0.2": {
+    "lambda": 0.9441,
+    "error_rate": 0.3077,
+    "savings": 0.5872,
+    "accuracy": 0.6923
+  },
+  "0.25": {
+    "lambda": 0.9275,
+    "error_rate": 0.3462,
+    "savings": 0.6452,
+    "accuracy": 0.6538
+  },
+  "0.3": {
+    "lambda": 0.9108,
+    "error_rate": 0.4231,
+    "savings": 0.6927,
+    "accuracy": 0.5769
+  },
+  "0.35": {
+    "lambda": 0.877,
+    "error_rate": 0.5385,
+    "savings": 0.8578,
+    "accuracy": 0.4615
+  },
+  "0.4": {
+    "lambda": 0.8209,
+    "error_rate": 0.5769,
+    "savings": 0.904,
+    "accuracy": 0.4231
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.6154,
+    "savings": 0.9686,
+    "accuracy": 0.3846
+  }
+}
llama-3.3-70b/supervised/qk_dh128/ood_gpqa_diamond.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9969,
+    "error_rate": 0.0849,
+    "savings": 0.1121,
+    "accuracy": 0.9151
+  },
+  "0.025": {
+    "lambda": 0.9913,
+    "error_rate": 0.1321,
+    "savings": 0.271,
+    "accuracy": 0.8679
+  },
+  "0.05": {
+    "lambda": 0.9856,
+    "error_rate": 0.1887,
+    "savings": 0.3944,
+    "accuracy": 0.8113
+  },
+  "0.1": {
+    "lambda": 0.971,
+    "error_rate": 0.2925,
+    "savings": 0.5771,
+    "accuracy": 0.7075
+  },
+  "0.15": {
+    "lambda": 0.9573,
+    "error_rate": 0.3774,
+    "savings": 0.7035,
+    "accuracy": 0.6226
+  },
+  "0.2": {
+    "lambda": 0.9441,
+    "error_rate": 0.3962,
+    "savings": 0.7595,
+    "accuracy": 0.6038
+  },
+  "0.25": {
+    "lambda": 0.9275,
+    "error_rate": 0.434,
+    "savings": 0.8436,
+    "accuracy": 0.566
+  },
+  "0.3": {
+    "lambda": 0.9108,
+    "error_rate": 0.434,
+    "savings": 0.8973,
+    "accuracy": 0.566
+  },
+  "0.35": {
+    "lambda": 0.877,
+    "error_rate": 0.4623,
+    "savings": 0.9408,
+    "accuracy": 0.5377
+  },
+  "0.4": {
+    "lambda": 0.8209,
+    "error_rate": 0.4811,
+    "savings": 0.9649,
+    "accuracy": 0.5189
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.4811,
+    "savings": 0.9695,
+    "accuracy": 0.5189
+  }
+}
llama-3.3-70b/supervised/qk_dh128/ood_math500.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9969,
+    "error_rate": 0.0,
+    "savings": 0.0922,
+    "accuracy": 1.0
+  },
+  "0.025": {
+    "lambda": 0.9913,
+    "error_rate": 0.0081,
+    "savings": 0.3619,
+    "accuracy": 0.9919
+  },
+  "0.05": {
+    "lambda": 0.9856,
+    "error_rate": 0.0224,
+    "savings": 0.5167,
+    "accuracy": 0.9776
+  },
+  "0.1": {
+    "lambda": 0.971,
+    "error_rate": 0.0387,
+    "savings": 0.6876,
+    "accuracy": 0.9613
+  },
+  "0.15": {
+    "lambda": 0.9573,
+    "error_rate": 0.0591,
+    "savings": 0.7632,
+    "accuracy": 0.9409
+  },
+  "0.2": {
+    "lambda": 0.9441,
+    "error_rate": 0.0815,
+    "savings": 0.8065,
+    "accuracy": 0.9185
+  },
+  "0.25": {
+    "lambda": 0.9275,
+    "error_rate": 0.0916,
+    "savings": 0.8372,
+    "accuracy": 0.9084
+  },
+  "0.3": {
+    "lambda": 0.9108,
+    "error_rate": 0.1059,
+    "savings": 0.8582,
+    "accuracy": 0.8941
+  },
+  "0.35": {
+    "lambda": 0.877,
+    "error_rate": 0.1141,
+    "savings": 0.8713,
+    "accuracy": 0.8859
+  },
+  "0.4": {
+    "lambda": 0.8209,
+    "error_rate": 0.1181,
+    "savings": 0.8756,
+    "accuracy": 0.8819
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.1181,
+    "savings": 0.8764,
+    "accuracy": 0.8819
+  }
+}
llama-3.3-70b/supervised/qk_dh128/probe.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:47ca9fb1c6798e4dabe12ac3e18522ca814000b719b5fa63e8624df5808c4268
+size 8391930
qwen2.5-32b/consistent/no_kq/config.json
ADDED
@@ -0,0 +1,42 @@
+{
+  "config": "configs/qwen32b_5k.yaml",
+  "method": "ttt",
+  "dataset_path": [
+    "data_prepare/output/qwen32b/s1k/dataset.pkl",
+    "data_prepare/output/qwen32b/openr1_2k/dataset.pkl",
+    "data_prepare/output/qwen32b/deepmath_2k/dataset.pkl"
+  ],
+  "ood_paths": [
+    "data_prepare/output/qwen32b/aime24/dataset.pkl",
+    "data_prepare/output/qwen32b/aime25/dataset.pkl",
+    "data_prepare/output/qwen32b/aime26/dataset.pkl",
+    "data_prepare/output/qwen32b/math500/dataset.pkl",
+    "data_prepare/output/qwen32b/gpqa_diamond/dataset.pkl"
+  ],
+  "output_dir": "results/qwen32b_5k",
+  "label_mode": "consistent",
+  "batch_size": 10,
+  "seed": 42,
+  "smooth_window": 10,
+  "run_name": "ttt__no_kq__lr0.01",
+  "d_hidden": 64,
+  "use_ln": false,
+  "use_residual": false,
+  "learnable_eta": false,
+  "base_lr": 0.01,
+  "share_kq": false,
+  "use_mlp": false,
+  "use_pca": false,
+  "pca_dim": 256,
+  "epochs": 20,
+  "outer_lr": 0.001,
+  "no_meta_train": false,
+  "no_online_update": false,
+  "no_kq": true,
+  "grad_clip": 1.0,
+  "force_retrain": false,
+  "d_phi": 5120,
+  "timestamp": "2026-03-27T22:40:13.431109",
+  "release_target": "qwen2.5-32b/consistent/no_kq",
+  "release_probe_source": "qwen32b_5k/consistent/ttt__no_kq__lr0.01/checkpoints/probe_ep20.pt"
+}
qwen2.5-32b/consistent/no_kq/lambdas.json
ADDED
@@ -0,0 +1,13 @@
+{
+  "0.01": 0.9555,
+  "0.025": 0.9279,
+  "0.05": 0.9062,
+  "0.1": 0.8543000000000001,
+  "0.15": 0.8158,
+  "0.2": 0.7741,
+  "0.25": 0.7341,
+  "0.3": 0.6795,
+  "0.35": 0.6321,
+  "0.4": 0.5152,
+  "0.5": 9.999999999998899e-05
+}
|
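The commit message describes lambdas.json as holding the LTT thresholds, and metrics.json reports results per delta, so the keys appear to be target risk levels and the values the calibrated thresholds. The sketch below shows one way to read such a file and look up a threshold. The helper `threshold_for` and its fallback rule (use the largest listed level that does not exceed the requested one) are assumptions made for illustration; they are not taken from the ORCA code.

```python
import json

# Contents of qwen2.5-32b/consistent/no_kq/lambdas.json as added in this commit.
LAMBDAS_JSON = """
{
  "0.01": 0.9555,
  "0.025": 0.9279,
  "0.05": 0.9062,
  "0.1": 0.8543000000000001,
  "0.15": 0.8158,
  "0.2": 0.7741,
  "0.25": 0.7341,
  "0.3": 0.6795,
  "0.35": 0.6321,
  "0.4": 0.5152,
  "0.5": 9.999999999998899e-05
}
"""


def threshold_for(lambdas, delta):
    """Hypothetical helper: threshold of the largest listed level <= delta.

    Falling back to a stricter listed level is an assumption of this sketch,
    chosen because it never relaxes the requested risk level.
    """
    eligible = [(float(level), lam) for level, lam in lambdas.items()
                if float(level) <= delta]
    if not eligible:
        raise ValueError(f"no listed risk level at or below {delta}")
    # Tuples compare by risk level first, so max() picks the largest level.
    return max(eligible)[1]


lambdas = json.loads(LAMBDAS_JSON)
print(threshold_for(lambdas, 0.05))  # exact key "0.05"
print(threshold_for(lambdas, 0.12))  # falls back to the "0.1" entry
```

In practice the string would be replaced by reading the file from a downloaded probe directory; the inline copy only keeps the example self-contained.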
qwen2.5-32b/consistent/no_kq/metrics.json
ADDED
@@ -0,0 +1,70 @@
+{
+  "eps_results": {
+    "0.01": {
+      "lambda": 0.9555,
+      "error_rate": 0.011,
+      "savings": 0.0213,
+      "accuracy": 0.989
+    },
+    "0.025": {
+      "lambda": 0.9279,
+      "error_rate": 0.024,
+      "savings": 0.124,
+      "accuracy": 0.976
+    },
+    "0.05": {
+      "lambda": 0.9062,
+      "error_rate": 0.045,
+      "savings": 0.2197,
+      "accuracy": 0.955
+    },
+    "0.1": {
+      "lambda": 0.8543000000000001,
+      "error_rate": 0.096,
+      "savings": 0.4073,
+      "accuracy": 0.904
+    },
+    "0.15": {
+      "lambda": 0.8158,
+      "error_rate": 0.141,
+      "savings": 0.5292,
+      "accuracy": 0.859
+    },
+    "0.2": {
+      "lambda": 0.7741,
+      "error_rate": 0.193,
+      "savings": 0.6441,
+      "accuracy": 0.807
+    },
+    "0.25": {
+      "lambda": 0.7341,
+      "error_rate": 0.234,
+      "savings": 0.7307,
+      "accuracy": 0.766
+    },
+    "0.3": {
+      "lambda": 0.6795,
+      "error_rate": 0.296,
+      "savings": 0.8146,
+      "accuracy": 0.704
+    },
+    "0.35": {
+      "lambda": 0.6321,
+      "error_rate": 0.331,
+      "savings": 0.8668,
+      "accuracy": 0.669
+    },
+    "0.4": {
+      "lambda": 0.5152,
+      "error_rate": 0.371,
+      "savings": 0.9334,
+      "accuracy": 0.629
+    },
+    "0.5": {
+      "lambda": 9.999999999998899e-05,
+      "error_rate": 0.382,
+      "savings": 0.9522,
+      "accuracy": 0.618
+    }
+  }
+}
qwen2.5-32b/consistent/no_kq/ood_aime24.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9555,
+    "error_rate": 0.0,
+    "savings": 0.0,
+    "accuracy": 1.0
+  },
+  "0.025": {
+    "lambda": 0.9279,
+    "error_rate": 0.0333,
+    "savings": 0.0354,
+    "accuracy": 0.9667
+  },
+  "0.05": {
+    "lambda": 0.9062,
+    "error_rate": 0.0333,
+    "savings": 0.0462,
+    "accuracy": 0.9667
+  },
+  "0.1": {
+    "lambda": 0.8543000000000001,
+    "error_rate": 0.0333,
+    "savings": 0.1406,
+    "accuracy": 0.9667
+  },
+  "0.15": {
+    "lambda": 0.8158,
+    "error_rate": 0.0333,
+    "savings": 0.263,
+    "accuracy": 0.9667
+  },
+  "0.2": {
+    "lambda": 0.7741,
+    "error_rate": 0.1,
+    "savings": 0.4018,
+    "accuracy": 0.9
+  },
+  "0.25": {
+    "lambda": 0.7341,
+    "error_rate": 0.2667,
+    "savings": 0.5115,
+    "accuracy": 0.7333
+  },
+  "0.3": {
+    "lambda": 0.6795,
+    "error_rate": 0.3333,
+    "savings": 0.7286,
+    "accuracy": 0.6667
+  },
+  "0.35": {
+    "lambda": 0.6321,
+    "error_rate": 0.4333,
+    "savings": 0.8066,
+    "accuracy": 0.5667
+  },
+  "0.4": {
+    "lambda": 0.5152,
+    "error_rate": 0.4667,
+    "savings": 0.945,
+    "accuracy": 0.5333
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.4667,
+    "savings": 0.9702,
+    "accuracy": 0.5333
+  }
+}
qwen2.5-32b/consistent/no_kq/ood_aime25.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9555,
+    "error_rate": 0.0,
+    "savings": 0.0,
+    "accuracy": 1.0
+  },
+  "0.025": {
+    "lambda": 0.9279,
+    "error_rate": 0.0,
+    "savings": 0.0151,
+    "accuracy": 1.0
+  },
+  "0.05": {
+    "lambda": 0.9062,
+    "error_rate": 0.0,
+    "savings": 0.0186,
+    "accuracy": 1.0
+  },
+  "0.1": {
+    "lambda": 0.8543000000000001,
+    "error_rate": 0.0667,
+    "savings": 0.1661,
+    "accuracy": 0.9333
+  },
+  "0.15": {
+    "lambda": 0.8158,
+    "error_rate": 0.0667,
+    "savings": 0.2264,
+    "accuracy": 0.9333
+  },
+  "0.2": {
+    "lambda": 0.7741,
+    "error_rate": 0.1667,
+    "savings": 0.3693,
+    "accuracy": 0.8333
+  },
+  "0.25": {
+    "lambda": 0.7341,
+    "error_rate": 0.3,
+    "savings": 0.5924,
+    "accuracy": 0.7
+  },
+  "0.3": {
+    "lambda": 0.6795,
+    "error_rate": 0.3333,
+    "savings": 0.7102,
+    "accuracy": 0.6667
+  },
+  "0.35": {
+    "lambda": 0.6321,
+    "error_rate": 0.4333,
+    "savings": 0.8036,
+    "accuracy": 0.5667
+  },
+  "0.4": {
+    "lambda": 0.5152,
+    "error_rate": 0.5333,
+    "savings": 0.9255,
+    "accuracy": 0.4667
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.6,
+    "savings": 0.9647,
+    "accuracy": 0.4
+  }
+}
qwen2.5-32b/consistent/no_kq/ood_aime26.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9555,
+    "error_rate": 0.0,
+    "savings": 0.0144,
+    "accuracy": 1.0
+  },
+  "0.025": {
+    "lambda": 0.9279,
+    "error_rate": 0.0,
+    "savings": 0.0289,
+    "accuracy": 1.0
+  },
+  "0.05": {
+    "lambda": 0.9062,
+    "error_rate": 0.0,
+    "savings": 0.0498,
+    "accuracy": 1.0
+  },
+  "0.1": {
+    "lambda": 0.8543000000000001,
+    "error_rate": 0.0667,
+    "savings": 0.1544,
+    "accuracy": 0.9333
+  },
+  "0.15": {
+    "lambda": 0.8158,
+    "error_rate": 0.1,
+    "savings": 0.2449,
+    "accuracy": 0.9
+  },
+  "0.2": {
+    "lambda": 0.7741,
+    "error_rate": 0.1333,
+    "savings": 0.3388,
+    "accuracy": 0.8667
+  },
+  "0.25": {
+    "lambda": 0.7341,
+    "error_rate": 0.3,
+    "savings": 0.5093,
+    "accuracy": 0.7
+  },
+  "0.3": {
+    "lambda": 0.6795,
+    "error_rate": 0.3333,
+    "savings": 0.6242,
+    "accuracy": 0.6667
+  },
+  "0.35": {
+    "lambda": 0.6321,
+    "error_rate": 0.3333,
+    "savings": 0.6997,
+    "accuracy": 0.6667
+  },
+  "0.4": {
+    "lambda": 0.5152,
+    "error_rate": 0.4667,
+    "savings": 0.8829,
+    "accuracy": 0.5333
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.5333,
+    "savings": 0.9675,
+    "accuracy": 0.4667
+  }
+}
qwen2.5-32b/consistent/no_kq/ood_gpqa_diamond.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9555,
+    "error_rate": 0.0101,
+    "savings": 0.0457,
+    "accuracy": 0.9899
+  },
+  "0.025": {
+    "lambda": 0.9279,
+    "error_rate": 0.101,
+    "savings": 0.209,
+    "accuracy": 0.899
+  },
+  "0.05": {
+    "lambda": 0.9062,
+    "error_rate": 0.1667,
+    "savings": 0.3483,
+    "accuracy": 0.8333
+  },
+  "0.1": {
+    "lambda": 0.8543000000000001,
+    "error_rate": 0.3182,
+    "savings": 0.5983,
+    "accuracy": 0.6818
+  },
+  "0.15": {
+    "lambda": 0.8158,
+    "error_rate": 0.399,
+    "savings": 0.734,
+    "accuracy": 0.601
+  },
+  "0.2": {
+    "lambda": 0.7741,
+    "error_rate": 0.4495,
+    "savings": 0.839,
+    "accuracy": 0.5505
+  },
+  "0.25": {
+    "lambda": 0.7341,
+    "error_rate": 0.4697,
+    "savings": 0.8911,
+    "accuracy": 0.5303
+  },
+  "0.3": {
+    "lambda": 0.6795,
+    "error_rate": 0.4949,
+    "savings": 0.9306,
+    "accuracy": 0.5051
+  },
+  "0.35": {
+    "lambda": 0.6321,
+    "error_rate": 0.5101,
+    "savings": 0.9449,
+    "accuracy": 0.4899
+  },
+  "0.4": {
+    "lambda": 0.5152,
+    "error_rate": 0.5101,
+    "savings": 0.9596,
+    "accuracy": 0.4899
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.5101,
+    "savings": 0.9614,
+    "accuracy": 0.4899
+  }
+}
qwen2.5-32b/consistent/no_kq/ood_math500.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9555,
+    "error_rate": 0.0,
+    "savings": 0.0352,
+    "accuracy": 1.0
+  },
+  "0.025": {
+    "lambda": 0.9279,
+    "error_rate": 0.0,
+    "savings": 0.1602,
+    "accuracy": 1.0
+  },
+  "0.05": {
+    "lambda": 0.9062,
+    "error_rate": 0.0,
+    "savings": 0.2828,
+    "accuracy": 1.0
+  },
+  "0.1": {
+    "lambda": 0.8543000000000001,
+    "error_rate": 0.012,
+    "savings": 0.5554,
+    "accuracy": 0.988
+  },
+  "0.15": {
+    "lambda": 0.8158,
+    "error_rate": 0.026,
+    "savings": 0.6714,
+    "accuracy": 0.974
+  },
+  "0.2": {
+    "lambda": 0.7741,
+    "error_rate": 0.038,
+    "savings": 0.7488,
+    "accuracy": 0.962
+  },
+  "0.25": {
+    "lambda": 0.7341,
+    "error_rate": 0.052,
+    "savings": 0.7962,
+    "accuracy": 0.948
+  },
+  "0.3": {
+    "lambda": 0.6795,
+    "error_rate": 0.072,
+    "savings": 0.8429,
+    "accuracy": 0.928
+  },
+  "0.35": {
+    "lambda": 0.6321,
+    "error_rate": 0.08,
+    "savings": 0.8647,
+    "accuracy": 0.92
+  },
+  "0.4": {
+    "lambda": 0.5152,
+    "error_rate": 0.094,
+    "savings": 0.8833,
+    "accuracy": 0.906
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.1,
+    "savings": 0.8907,
+    "accuracy": 0.9
+  }
+}
qwen2.5-32b/consistent/no_kq/probe.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7b71239aef69766c054f887fd49c714b68638c5810173f4bda9abc0c99877f31
+size 22652
qwen2.5-32b/consistent/qk_dh128/config.json
ADDED
@@ -0,0 +1,42 @@
+{
+  "config": "configs/qwen32b_5k.yaml",
+  "method": "ttt",
+  "dataset_path": [
+    "data_prepare/output/qwen32b/s1k/dataset.pkl",
+    "data_prepare/output/qwen32b/openr1_2k/dataset.pkl",
+    "data_prepare/output/qwen32b/deepmath_2k/dataset.pkl"
+  ],
+  "ood_paths": [
+    "data_prepare/output/qwen32b/aime24/dataset.pkl",
+    "data_prepare/output/qwen32b/aime25/dataset.pkl",
+    "data_prepare/output/qwen32b/aime26/dataset.pkl",
+    "data_prepare/output/qwen32b/math500/dataset.pkl",
+    "data_prepare/output/qwen32b/gpqa_diamond/dataset.pkl"
+  ],
+  "output_dir": "results/qwen32b_5k",
+  "label_mode": "consistent",
+  "batch_size": 10,
+  "seed": 42,
+  "smooth_window": 10,
+  "run_name": "ttt__dh128__lr0.01",
+  "d_hidden": 128,
+  "use_ln": false,
+  "use_residual": false,
+  "learnable_eta": false,
+  "base_lr": 0.01,
+  "share_kq": false,
+  "use_mlp": false,
+  "use_pca": false,
+  "pca_dim": 256,
+  "epochs": 10,
+  "outer_lr": 0.001,
+  "no_meta_train": false,
+  "no_online_update": false,
+  "no_kq": false,
+  "grad_clip": 1.0,
+  "force_retrain": false,
+  "d_phi": 5120,
+  "timestamp": "2026-03-28T01:01:45.669043",
+  "release_target": "qwen2.5-32b/consistent/qk_dh128",
+  "release_probe_source": "qwen32b_5k/consistent/ttt__dh128__lr0.01/checkpoints/probe_ep10.pt"
+}
qwen2.5-32b/consistent/qk_dh128/lambdas.json
ADDED
@@ -0,0 +1,13 @@
+{
+  "0.01": 0.9921,
+  "0.025": 0.9767,
+  "0.05": 0.9482,
+  "0.1": 0.8952,
+  "0.15": 0.8351,
+  "0.2": 0.7674,
+  "0.25": 0.6921999999999999,
+  "0.3": 0.5946,
+  "0.35": 0.4928,
+  "0.4": 0.32909999999999995,
+  "0.5": 9.999999999998899e-05
+}
qwen2.5-32b/consistent/qk_dh128/metrics.json
ADDED
@@ -0,0 +1,70 @@
+{
+  "eps_results": {
+    "0.01": {
+      "lambda": 0.9921,
+      "error_rate": 0.009,
+      "savings": 0.0207,
+      "accuracy": 0.991
+    },
+    "0.025": {
+      "lambda": 0.9767,
+      "error_rate": 0.033,
+      "savings": 0.0935,
+      "accuracy": 0.967
+    },
+    "0.05": {
+      "lambda": 0.9482,
+      "error_rate": 0.064,
+      "savings": 0.2315,
+      "accuracy": 0.936
+    },
+    "0.1": {
+      "lambda": 0.8952,
+      "error_rate": 0.113,
+      "savings": 0.3971,
+      "accuracy": 0.887
+    },
+    "0.15": {
+      "lambda": 0.8351,
+      "error_rate": 0.15,
+      "savings": 0.5236,
+      "accuracy": 0.85
+    },
+    "0.2": {
+      "lambda": 0.7674,
+      "error_rate": 0.187,
+      "savings": 0.6288,
+      "accuracy": 0.813
+    },
+    "0.25": {
+      "lambda": 0.6921999999999999,
+      "error_rate": 0.227,
+      "savings": 0.7114,
+      "accuracy": 0.773
+    },
+    "0.3": {
+      "lambda": 0.5946,
+      "error_rate": 0.28,
+      "savings": 0.8033,
+      "accuracy": 0.72
+    },
+    "0.35": {
+      "lambda": 0.4928,
+      "error_rate": 0.323,
+      "savings": 0.8698,
+      "accuracy": 0.677
+    },
+    "0.4": {
+      "lambda": 0.32909999999999995,
+      "error_rate": 0.364,
+      "savings": 0.9308,
+      "accuracy": 0.636
+    },
+    "0.5": {
+      "lambda": 9.999999999998899e-05,
+      "error_rate": 0.382,
+      "savings": 0.9522,
+      "accuracy": 0.618
+    }
+  }
+}
qwen2.5-32b/consistent/qk_dh128/ood_aime24.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9921,
+    "error_rate": 0.0,
+    "savings": 0.0,
+    "accuracy": 1.0
+  },
+  "0.025": {
+    "lambda": 0.9767,
+    "error_rate": 0.0,
+    "savings": 0.0527,
+    "accuracy": 1.0
+  },
+  "0.05": {
+    "lambda": 0.9482,
+    "error_rate": 0.0333,
+    "savings": 0.0913,
+    "accuracy": 0.9667
+  },
+  "0.1": {
+    "lambda": 0.8952,
+    "error_rate": 0.0333,
+    "savings": 0.1847,
+    "accuracy": 0.9667
+  },
+  "0.15": {
+    "lambda": 0.8351,
+    "error_rate": 0.0333,
+    "savings": 0.303,
+    "accuracy": 0.9667
+  },
+  "0.2": {
+    "lambda": 0.7674,
+    "error_rate": 0.1667,
+    "savings": 0.3927,
+    "accuracy": 0.8333
+  },
+  "0.25": {
+    "lambda": 0.6921999999999999,
+    "error_rate": 0.3,
+    "savings": 0.5937,
+    "accuracy": 0.7
+  },
+  "0.3": {
+    "lambda": 0.5946,
+    "error_rate": 0.3333,
+    "savings": 0.6923,
+    "accuracy": 0.6667
+  },
+  "0.35": {
+    "lambda": 0.4928,
+    "error_rate": 0.4,
+    "savings": 0.8047,
+    "accuracy": 0.6
+  },
+  "0.4": {
+    "lambda": 0.32909999999999995,
+    "error_rate": 0.4667,
+    "savings": 0.9325,
+    "accuracy": 0.5333
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.4667,
+    "savings": 0.9702,
+    "accuracy": 0.5333
+  }
+}
qwen2.5-32b/consistent/qk_dh128/ood_aime25.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9921,
+    "error_rate": 0.0,
+    "savings": 0.0028,
+    "accuracy": 1.0
+  },
+  "0.025": {
+    "lambda": 0.9767,
+    "error_rate": 0.0,
+    "savings": 0.0353,
+    "accuracy": 1.0
+  },
+  "0.05": {
+    "lambda": 0.9482,
+    "error_rate": 0.0,
+    "savings": 0.0536,
+    "accuracy": 1.0
+  },
+  "0.1": {
+    "lambda": 0.8952,
+    "error_rate": 0.0,
+    "savings": 0.1389,
+    "accuracy": 1.0
+  },
+  "0.15": {
+    "lambda": 0.8351,
+    "error_rate": 0.0333,
+    "savings": 0.2236,
+    "accuracy": 0.9667
+  },
+  "0.2": {
+    "lambda": 0.7674,
+    "error_rate": 0.1333,
+    "savings": 0.3198,
+    "accuracy": 0.8667
+  },
+  "0.25": {
+    "lambda": 0.6921999999999999,
+    "error_rate": 0.1667,
+    "savings": 0.4304,
+    "accuracy": 0.8333
+  },
+  "0.3": {
+    "lambda": 0.5946,
+    "error_rate": 0.2,
+    "savings": 0.5998,
+    "accuracy": 0.8
+  },
+  "0.35": {
+    "lambda": 0.4928,
+    "error_rate": 0.3333,
+    "savings": 0.7807,
+    "accuracy": 0.6667
+  },
+  "0.4": {
+    "lambda": 0.32909999999999995,
+    "error_rate": 0.5667,
+    "savings": 0.9402,
+    "accuracy": 0.4333
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.6,
+    "savings": 0.9647,
+    "accuracy": 0.4
+  }
+}
qwen2.5-32b/consistent/qk_dh128/ood_aime26.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9921,
+    "error_rate": 0.0,
+    "savings": 0.0,
+    "accuracy": 1.0
+  },
+  "0.025": {
+    "lambda": 0.9767,
+    "error_rate": 0.0,
+    "savings": 0.0252,
+    "accuracy": 1.0
+  },
+  "0.05": {
+    "lambda": 0.9482,
+    "error_rate": 0.0,
+    "savings": 0.055,
+    "accuracy": 1.0
+  },
+  "0.1": {
+    "lambda": 0.8952,
+    "error_rate": 0.0,
+    "savings": 0.0915,
+    "accuracy": 1.0
+  },
+  "0.15": {
+    "lambda": 0.8351,
+    "error_rate": 0.0333,
+    "savings": 0.2259,
+    "accuracy": 0.9667
+  },
+  "0.2": {
+    "lambda": 0.7674,
+    "error_rate": 0.1333,
+    "savings": 0.3766,
+    "accuracy": 0.8667
+  },
+  "0.25": {
+    "lambda": 0.6921999999999999,
+    "error_rate": 0.1667,
+    "savings": 0.4618,
+    "accuracy": 0.8333
+  },
+  "0.3": {
+    "lambda": 0.5946,
+    "error_rate": 0.2333,
+    "savings": 0.5934,
+    "accuracy": 0.7667
+  },
+  "0.35": {
+    "lambda": 0.4928,
+    "error_rate": 0.3333,
+    "savings": 0.7437,
+    "accuracy": 0.6667
+  },
+  "0.4": {
+    "lambda": 0.32909999999999995,
+    "error_rate": 0.4667,
+    "savings": 0.8902,
+    "accuracy": 0.5333
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.5333,
+    "savings": 0.9675,
+    "accuracy": 0.4667
+  }
+}
qwen2.5-32b/consistent/qk_dh128/ood_gpqa_diamond.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9921,
+    "error_rate": 0.0202,
+    "savings": 0.0274,
+    "accuracy": 0.9798
+  },
+  "0.025": {
+    "lambda": 0.9767,
+    "error_rate": 0.0758,
+    "savings": 0.1833,
+    "accuracy": 0.9242
+  },
+  "0.05": {
+    "lambda": 0.9482,
+    "error_rate": 0.202,
+    "savings": 0.3994,
+    "accuracy": 0.798
+  },
+  "0.1": {
+    "lambda": 0.8952,
+    "error_rate": 0.3283,
+    "savings": 0.6526,
+    "accuracy": 0.6717
+  },
+  "0.15": {
+    "lambda": 0.8351,
+    "error_rate": 0.3889,
+    "savings": 0.7731,
+    "accuracy": 0.6111
+  },
+  "0.2": {
+    "lambda": 0.7674,
+    "error_rate": 0.4444,
+    "savings": 0.8559,
+    "accuracy": 0.5556
+  },
+  "0.25": {
+    "lambda": 0.6921999999999999,
+    "error_rate": 0.4697,
+    "savings": 0.8948,
+    "accuracy": 0.5303
+  },
+  "0.3": {
+    "lambda": 0.5946,
+    "error_rate": 0.4949,
+    "savings": 0.9192,
+    "accuracy": 0.5051
+  },
+  "0.35": {
+    "lambda": 0.4928,
+    "error_rate": 0.5101,
+    "savings": 0.9511,
+    "accuracy": 0.4899
+  },
+  "0.4": {
+    "lambda": 0.32909999999999995,
+    "error_rate": 0.5101,
+    "savings": 0.9607,
+    "accuracy": 0.4899
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.5101,
+    "savings": 0.9614,
+    "accuracy": 0.4899
+  }
+}
qwen2.5-32b/consistent/qk_dh128/ood_math500.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9921,
+    "error_rate": 0.0,
+    "savings": 0.0768,
+    "accuracy": 1.0
+  },
+  "0.025": {
+    "lambda": 0.9767,
+    "error_rate": 0.002,
+    "savings": 0.27,
+    "accuracy": 0.998
+  },
+  "0.05": {
+    "lambda": 0.9482,
+    "error_rate": 0.008,
+    "savings": 0.4644,
+    "accuracy": 0.992
+  },
+  "0.1": {
+    "lambda": 0.8952,
+    "error_rate": 0.016,
+    "savings": 0.6371,
+    "accuracy": 0.984
+  },
+  "0.15": {
+    "lambda": 0.8351,
+    "error_rate": 0.022,
+    "savings": 0.7205,
+    "accuracy": 0.978
+  },
+  "0.2": {
+    "lambda": 0.7674,
+    "error_rate": 0.04,
+    "savings": 0.783,
+    "accuracy": 0.96
+  },
+  "0.25": {
+    "lambda": 0.6921999999999999,
+    "error_rate": 0.058,
+    "savings": 0.823,
+    "accuracy": 0.942
+  },
+  "0.3": {
+    "lambda": 0.5946,
+    "error_rate": 0.072,
+    "savings": 0.8578,
+    "accuracy": 0.928
+  },
+  "0.35": {
+    "lambda": 0.4928,
+    "error_rate": 0.086,
+    "savings": 0.8758,
+    "accuracy": 0.914
+  },
+  "0.4": {
+    "lambda": 0.32909999999999995,
+    "error_rate": 0.098,
+    "savings": 0.8893,
+    "accuracy": 0.902
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.1,
+    "savings": 0.8907,
+    "accuracy": 0.9
+  }
+}
qwen2.5-32b/consistent/qk_dh128/probe.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4d40434084b7f3190e0f96816cdecf4d243a3567f59100f78fc34f1ca07b6242
+size 5246202
qwen2.5-32b/supervised/no_kq/config.json
ADDED
@@ -0,0 +1,42 @@
+{
+  "config": "configs/qwen32b_5k.yaml",
+  "method": "ttt",
+  "dataset_path": [
+    "data_prepare/output/qwen32b/s1k/dataset.pkl",
+    "data_prepare/output/qwen32b/openr1_2k/dataset.pkl",
+    "data_prepare/output/qwen32b/deepmath_2k/dataset.pkl"
+  ],
+  "ood_paths": [
+    "data_prepare/output/qwen32b/aime24/dataset.pkl",
+    "data_prepare/output/qwen32b/aime25/dataset.pkl",
+    "data_prepare/output/qwen32b/aime26/dataset.pkl",
+    "data_prepare/output/qwen32b/math500/dataset.pkl",
+    "data_prepare/output/qwen32b/gpqa_diamond/dataset.pkl"
+  ],
+  "output_dir": "results/qwen32b_5k",
+  "label_mode": "supervised",
+  "batch_size": 10,
+  "seed": 42,
+  "smooth_window": 10,
+  "run_name": "ttt__no_kq__lr0.01",
+  "d_hidden": 64,
+  "use_ln": false,
+  "use_residual": false,
+  "learnable_eta": false,
+  "base_lr": 0.01,
+  "share_kq": false,
+  "use_mlp": false,
+  "use_pca": false,
+  "pca_dim": 256,
+  "epochs": 20,
+  "outer_lr": 0.001,
+  "no_meta_train": false,
+  "no_online_update": false,
+  "no_kq": true,
+  "grad_clip": 1.0,
+  "force_retrain": false,
+  "d_phi": 5120,
+  "timestamp": "2026-03-27T19:49:22.309058",
+  "release_target": "qwen2.5-32b/supervised/no_kq",
+  "release_probe_source": "qwen32b_5k/supervised/ttt__no_kq__lr0.01/checkpoints/probe_ep20.pt"
+}
qwen2.5-32b/supervised/no_kq/lambdas.json
ADDED
@@ -0,0 +1,13 @@
+{
+  "0.01": 0.9489,
+  "0.025": 0.9215,
+  "0.05": 0.8896,
+  "0.1": 0.8326,
+  "0.15": 0.7989999999999999,
+  "0.2": 0.7598,
+  "0.25": 0.7142999999999999,
+  "0.3": 0.6740999999999999,
+  "0.35": 0.6171,
+  "0.4": 0.5069,
+  "0.5": 9.999999999998899e-05
+}
qwen2.5-32b/supervised/no_kq/metrics.json
ADDED
@@ -0,0 +1,70 @@
+{
+  "eps_results": {
+    "0.01": {
+      "lambda": 0.9489,
+      "error_rate": 0.01,
+      "savings": 0.0372,
+      "accuracy": 0.99
+    },
+    "0.025": {
+      "lambda": 0.9215,
+      "error_rate": 0.0266,
+      "savings": 0.1437,
+      "accuracy": 0.9734
+    },
+    "0.05": {
+      "lambda": 0.8896,
+      "error_rate": 0.0532,
+      "savings": 0.2817,
+      "accuracy": 0.9468
+    },
+    "0.1": {
+      "lambda": 0.8326,
+      "error_rate": 0.1098,
+      "savings": 0.4746,
+      "accuracy": 0.8902
+    },
+    "0.15": {
+      "lambda": 0.7989999999999999,
+      "error_rate": 0.1519,
+      "savings": 0.5749,
+      "accuracy": 0.8481
+    },
+    "0.2": {
+      "lambda": 0.7598,
+      "error_rate": 0.1918,
+      "savings": 0.6731,
+      "accuracy": 0.8082
+    },
+    "0.25": {
+      "lambda": 0.7142999999999999,
+      "error_rate": 0.2583,
+      "savings": 0.76,
+      "accuracy": 0.7417
+    },
+    "0.3": {
+      "lambda": 0.6740999999999999,
+      "error_rate": 0.2982,
+      "savings": 0.8183,
+      "accuracy": 0.7018
+    },
+    "0.35": {
+      "lambda": 0.6171,
+      "error_rate": 0.3514,
+      "savings": 0.8793,
+      "accuracy": 0.6486
+    },
+    "0.4": {
+      "lambda": 0.5069,
+      "error_rate": 0.388,
+      "savings": 0.9365,
+      "accuracy": 0.612
+    },
+    "0.5": {
+      "lambda": 9.999999999998899e-05,
+      "error_rate": 0.3947,
+      "savings": 0.9502,
+      "accuracy": 0.6053
+    }
+  }
+}
qwen2.5-32b/supervised/no_kq/ood_aime24.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9489,
+    "error_rate": 0.0,
+    "savings": 0.007,
+    "accuracy": 1.0
+  },
+  "0.025": {
+    "lambda": 0.9215,
+    "error_rate": 0.0,
+    "savings": 0.0411,
+    "accuracy": 1.0
+  },
+  "0.05": {
+    "lambda": 0.8896,
+    "error_rate": 0.0,
+    "savings": 0.0837,
+    "accuracy": 1.0
+  },
+  "0.1": {
+    "lambda": 0.8326,
+    "error_rate": 0.15,
+    "savings": 0.2932,
+    "accuracy": 0.85
+  },
+  "0.15": {
+    "lambda": 0.7989999999999999,
+    "error_rate": 0.2,
+    "savings": 0.4065,
+    "accuracy": 0.8
+  },
+  "0.2": {
+    "lambda": 0.7598,
+    "error_rate": 0.25,
+    "savings": 0.4869,
+    "accuracy": 0.75
+  },
+  "0.25": {
+    "lambda": 0.7142999999999999,
+    "error_rate": 0.25,
+    "savings": 0.5858,
+    "accuracy": 0.75
+  },
+  "0.3": {
+    "lambda": 0.6740999999999999,
+    "error_rate": 0.3,
+    "savings": 0.666,
+    "accuracy": 0.7
+  },
+  "0.35": {
+    "lambda": 0.6171,
+    "error_rate": 0.35,
+    "savings": 0.7817,
+    "accuracy": 0.65
+  },
+  "0.4": {
+    "lambda": 0.5069,
+    "error_rate": 0.55,
+    "savings": 0.96,
+    "accuracy": 0.45
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.55,
+    "savings": 0.9683,
+    "accuracy": 0.45
+  }
+}
qwen2.5-32b/supervised/no_kq/ood_aime25.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9489,
+    "error_rate": 0.0,
+    "savings": 0.0,
+    "accuracy": 1.0
+  },
+  "0.025": {
+    "lambda": 0.9215,
+    "error_rate": 0.0,
+    "savings": 0.0281,
+    "accuracy": 1.0
+  },
+  "0.05": {
+    "lambda": 0.8896,
+    "error_rate": 0.0,
+    "savings": 0.0455,
+    "accuracy": 1.0
+  },
+  "0.1": {
+    "lambda": 0.8326,
+    "error_rate": 0.0556,
+    "savings": 0.265,
+    "accuracy": 0.9444
+  },
+  "0.15": {
+    "lambda": 0.7989999999999999,
+    "error_rate": 0.0556,
+    "savings": 0.3621,
+    "accuracy": 0.9444
+  },
+  "0.2": {
+    "lambda": 0.7598,
+    "error_rate": 0.1111,
+    "savings": 0.5146,
+    "accuracy": 0.8889
+  },
+  "0.25": {
+    "lambda": 0.7142999999999999,
+    "error_rate": 0.1667,
+    "savings": 0.6929,
+    "accuracy": 0.8333
+  },
+  "0.3": {
+    "lambda": 0.6740999999999999,
+    "error_rate": 0.3333,
+    "savings": 0.7742,
+    "accuracy": 0.6667
+  },
+  "0.35": {
+    "lambda": 0.6171,
+    "error_rate": 0.3333,
+    "savings": 0.8174,
+    "accuracy": 0.6667
+  },
+  "0.4": {
+    "lambda": 0.5069,
+    "error_rate": 0.4444,
+    "savings": 0.9417,
+    "accuracy": 0.5556
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.4444,
+    "savings": 0.9529,
+    "accuracy": 0.5556
+  }
+}
qwen2.5-32b/supervised/no_kq/ood_aime26.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9489,
+    "error_rate": 0.0,
+    "savings": 0.0239,
+    "accuracy": 1.0
+  },
+  "0.025": {
+    "lambda": 0.9215,
+    "error_rate": 0.0,
+    "savings": 0.0305,
+    "accuracy": 1.0
+  },
+  "0.05": {
+    "lambda": 0.8896,
+    "error_rate": 0.0,
+    "savings": 0.0744,
+    "accuracy": 1.0
+  },
+  "0.1": {
+    "lambda": 0.8326,
+    "error_rate": 0.05,
+    "savings": 0.1979,
+    "accuracy": 0.95
+  },
+  "0.15": {
+    "lambda": 0.7989999999999999,
+    "error_rate": 0.15,
+    "savings": 0.3098,
+    "accuracy": 0.85
+  },
+  "0.2": {
+    "lambda": 0.7598,
+    "error_rate": 0.3,
+    "savings": 0.5139,
+    "accuracy": 0.7
+  },
+  "0.25": {
+    "lambda": 0.7142999999999999,
+    "error_rate": 0.35,
+    "savings": 0.6549,
+    "accuracy": 0.65
+  },
+  "0.3": {
+    "lambda": 0.6740999999999999,
+    "error_rate": 0.35,
+    "savings": 0.7077,
+    "accuracy": 0.65
+  },
+  "0.35": {
+    "lambda": 0.6171,
+    "error_rate": 0.4,
+    "savings": 0.7691,
+    "accuracy": 0.6
+  },
+  "0.4": {
+    "lambda": 0.5069,
+    "error_rate": 0.45,
+    "savings": 0.9326,
+    "accuracy": 0.55
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.55,
+    "savings": 0.9591,
+    "accuracy": 0.45
+  }
+}
qwen2.5-32b/supervised/no_kq/ood_gpqa_diamond.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9489,
+    "error_rate": 0.04,
+    "savings": 0.1684,
+    "accuracy": 0.96
+  },
+  "0.025": {
+    "lambda": 0.9215,
+    "error_rate": 0.13,
+    "savings": 0.3363,
+    "accuracy": 0.87
+  },
+  "0.05": {
+    "lambda": 0.8896,
+    "error_rate": 0.21,
+    "savings": 0.5039,
+    "accuracy": 0.79
+  },
+  "0.1": {
+    "lambda": 0.8326,
+    "error_rate": 0.3,
+    "savings": 0.7154,
+    "accuracy": 0.7
+  },
+  "0.15": {
+    "lambda": 0.7989999999999999,
+    "error_rate": 0.34,
+    "savings": 0.8213,
+    "accuracy": 0.66
+  },
+  "0.2": {
+    "lambda": 0.7598,
+    "error_rate": 0.39,
+    "savings": 0.8965,
+    "accuracy": 0.61
+  },
+  "0.25": {
+    "lambda": 0.7142999999999999,
+    "error_rate": 0.41,
+    "savings": 0.9342,
+    "accuracy": 0.59
+  },
+  "0.3": {
+    "lambda": 0.6740999999999999,
+    "error_rate": 0.41,
+    "savings": 0.9494,
+    "accuracy": 0.59
+  },
+  "0.35": {
+    "lambda": 0.6171,
+    "error_rate": 0.41,
+    "savings": 0.9566,
+    "accuracy": 0.59
+  },
+  "0.4": {
+    "lambda": 0.5069,
+    "error_rate": 0.41,
+    "savings": 0.9567,
+    "accuracy": 0.59
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.41,
+    "savings": 0.9567,
+    "accuracy": 0.59
+  }
+}
qwen2.5-32b/supervised/no_kq/ood_math500.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9489,
+    "error_rate": 0.0,
+    "savings": 0.0623,
+    "accuracy": 1.0
+  },
+  "0.025": {
+    "lambda": 0.9215,
+    "error_rate": 0.0,
+    "savings": 0.2042,
+    "accuracy": 1.0
+  },
+  "0.05": {
+    "lambda": 0.8896,
+    "error_rate": 0.0062,
+    "savings": 0.3908,
+    "accuracy": 0.9938
+  },
+  "0.1": {
+    "lambda": 0.8326,
+    "error_rate": 0.0227,
+    "savings": 0.637,
+    "accuracy": 0.9773
+  },
+  "0.15": {
+    "lambda": 0.7989999999999999,
+    "error_rate": 0.033,
+    "savings": 0.7208,
+    "accuracy": 0.967
+  },
+  "0.2": {
+    "lambda": 0.7598,
+    "error_rate": 0.0495,
+    "savings": 0.7815,
+    "accuracy": 0.9505
+  },
+  "0.25": {
+    "lambda": 0.7142999999999999,
+    "error_rate": 0.066,
+    "savings": 0.8267,
+    "accuracy": 0.934
+  },
+  "0.3": {
+    "lambda": 0.6740999999999999,
+    "error_rate": 0.068,
+    "savings": 0.8473,
+    "accuracy": 0.932
+  },
+  "0.35": {
+    "lambda": 0.6171,
+    "error_rate": 0.0866,
+    "savings": 0.8708,
+    "accuracy": 0.9134
+  },
+  "0.4": {
+    "lambda": 0.5069,
+    "error_rate": 0.0907,
+    "savings": 0.8823,
+    "accuracy": 0.9093
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.0948,
+    "savings": 0.8885,
+    "accuracy": 0.9052
+  }
+}
qwen2.5-32b/supervised/no_kq/probe.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6ce9b16ed9382dc67d63db1ceecbbe64f512f3ce7d152313a7cc60385bb1a385
+size 22652
qwen2.5-32b/supervised/qk_dh128/config.json
ADDED
@@ -0,0 +1,42 @@
+{
+  "config": "configs/qwen32b_5k.yaml",
+  "method": "ttt",
+  "dataset_path": [
+    "data_prepare/output/qwen32b/s1k/dataset.pkl",
+    "data_prepare/output/qwen32b/openr1_2k/dataset.pkl",
+    "data_prepare/output/qwen32b/deepmath_2k/dataset.pkl"
+  ],
+  "ood_paths": [
+    "data_prepare/output/qwen32b/aime24/dataset.pkl",
+    "data_prepare/output/qwen32b/aime25/dataset.pkl",
+    "data_prepare/output/qwen32b/aime26/dataset.pkl",
+    "data_prepare/output/qwen32b/math500/dataset.pkl",
+    "data_prepare/output/qwen32b/gpqa_diamond/dataset.pkl"
+  ],
+  "output_dir": "results/qwen32b_5k",
+  "label_mode": "supervised",
+  "batch_size": 10,
+  "seed": 42,
+  "smooth_window": 10,
+  "run_name": "ttt__dh128__lr0.01",
+  "d_hidden": 128,
+  "use_ln": false,
+  "use_residual": false,
+  "learnable_eta": false,
+  "base_lr": 0.01,
+  "share_kq": false,
+  "use_mlp": false,
+  "use_pca": false,
+  "pca_dim": 256,
+  "epochs": 10,
+  "outer_lr": 0.001,
+  "no_meta_train": false,
+  "no_online_update": false,
+  "no_kq": false,
+  "grad_clip": 1.0,
+  "force_retrain": false,
+  "d_phi": 5120,
+  "timestamp": "2026-03-28T00:26:53.748545",
+  "release_target": "qwen2.5-32b/supervised/qk_dh128",
+  "release_probe_source": "qwen32b_5k/supervised/ttt__dh128__lr0.01/checkpoints/probe_ep10.pt"
+}
qwen2.5-32b/supervised/qk_dh128/lambdas.json
ADDED
@@ -0,0 +1,13 @@
+{
+  "0.01": 0.9929,
+  "0.025": 0.987,
+  "0.05": 0.9749,
+  "0.1": 0.9419,
+  "0.15": 0.9018,
+  "0.2": 0.8491,
+  "0.25": 0.7923,
+  "0.3": 0.7335,
+  "0.35": 0.6254,
+  "0.4": 0.39059999999999995,
+  "0.5": 9.999999999998899e-05
+}
qwen2.5-32b/supervised/qk_dh128/metrics.json
ADDED
@@ -0,0 +1,70 @@
+{
+  "eps_results": {
+    "0.01": {
+      "lambda": 0.9929,
+      "error_rate": 0.01,
+      "savings": 0.0466,
+      "accuracy": 0.99
+    },
+    "0.025": {
+      "lambda": 0.987,
+      "error_rate": 0.0211,
+      "savings": 0.1107,
+      "accuracy": 0.9789
+    },
+    "0.05": {
+      "lambda": 0.9749,
+      "error_rate": 0.0455,
+      "savings": 0.2332,
+      "accuracy": 0.9545
+    },
+    "0.1": {
+      "lambda": 0.9419,
+      "error_rate": 0.1031,
+      "savings": 0.4141,
+      "accuracy": 0.8969
+    },
+    "0.15": {
+      "lambda": 0.9018,
+      "error_rate": 0.1497,
+      "savings": 0.5596,
+      "accuracy": 0.8503
+    },
+    "0.2": {
+      "lambda": 0.8491,
+      "error_rate": 0.204,
+      "savings": 0.674,
+      "accuracy": 0.796
+    },
+    "0.25": {
+      "lambda": 0.7923,
+      "error_rate": 0.2506,
+      "savings": 0.7552,
+      "accuracy": 0.7494
+    },
+    "0.3": {
+      "lambda": 0.7335,
+      "error_rate": 0.2905,
+      "savings": 0.8134,
+      "accuracy": 0.7095
+    },
+    "0.35": {
+      "lambda": 0.6254,
+      "error_rate": 0.3437,
+      "savings": 0.8837,
+      "accuracy": 0.6563
+    },
+    "0.4": {
+      "lambda": 0.39059999999999995,
+      "error_rate": 0.3902,
+      "savings": 0.9407,
+      "accuracy": 0.6098
+    },
+    "0.5": {
+      "lambda": 9.999999999998899e-05,
+      "error_rate": 0.3947,
+      "savings": 0.9502,
+      "accuracy": 0.6053
+    }
+  }
+}
qwen2.5-32b/supervised/qk_dh128/ood_aime24.json
ADDED
@@ -0,0 +1,68 @@
+{
+  "0.01": {
+    "lambda": 0.9929,
+    "error_rate": 0.0,
+    "savings": 0.0527,
+    "accuracy": 1.0
+  },
+  "0.025": {
+    "lambda": 0.987,
+    "error_rate": 0.05,
+    "savings": 0.1005,
+    "accuracy": 0.95
+  },
+  "0.05": {
+    "lambda": 0.9749,
+    "error_rate": 0.05,
+    "savings": 0.1472,
+    "accuracy": 0.95
+  },
+  "0.1": {
+    "lambda": 0.9419,
+    "error_rate": 0.1,
+    "savings": 0.2949,
+    "accuracy": 0.9
+  },
+  "0.15": {
+    "lambda": 0.9018,
+    "error_rate": 0.15,
+    "savings": 0.4545,
+    "accuracy": 0.85
+  },
+  "0.2": {
+    "lambda": 0.8491,
+    "error_rate": 0.2,
+    "savings": 0.5534,
+    "accuracy": 0.8
+  },
+  "0.25": {
+    "lambda": 0.7923,
+    "error_rate": 0.25,
+    "savings": 0.6954,
+    "accuracy": 0.75
+  },
+  "0.3": {
+    "lambda": 0.7335,
+    "error_rate": 0.35,
+    "savings": 0.7598,
+    "accuracy": 0.65
+  },
+  "0.35": {
+    "lambda": 0.6254,
+    "error_rate": 0.45,
+    "savings": 0.8599,
+    "accuracy": 0.55
+  },
+  "0.4": {
+    "lambda": 0.39059999999999995,
+    "error_rate": 0.55,
+    "savings": 0.9566,
+    "accuracy": 0.45
+  },
+  "0.5": {
+    "lambda": 9.999999999998899e-05,
+    "error_rate": 0.55,
+    "savings": 0.9683,
+    "accuracy": 0.45
+  }
+}