---
license: mit
library_name: pytorch
tags:
- test-time-training
- conformal-prediction
- reasoning
- early-stopping
- llm
datasets:
- wzekai99/ORCA
---

# ORCA TTT-Probes

Trained Test-Time Training probes for *Online Reasoning Calibration: Test-Time Training Enables Generalizable Conformal LLM Reasoning* ([arXiv:2604.01170](https://arxiv.org/abs/2604.01170)).

## Layout (17 probes)

```
qwen2.5-32b/supervised/{no_kq, qk_dh128,
                        qk_dh32, qk_dh64, qk_dh256, qk_dh512,
                        qk_dh128_ln, qk_dh128_ln_res, qk_dh128_share_kq,
                        qk_dh128_eta_learn, qk_dh128_mlp}/
qwen2.5-32b/consistent/{no_kq, qk_dh128}/
qwq-32b/supervised/{no_kq, qk_dh128}/
llama-3.3-70b/supervised/{no_kq, qk_dh128}/
```

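The layout above can be expanded into explicit paths, which is handy when looping over all probes. The following is a minimal sketch, not part of the ORCA code: `LAYOUT` and `probe_dirs` are hypothetical names, and the `probes` root assumes the probes were downloaded with `--local-dir probes`.

```python
from pathlib import PurePosixPath

# Variant names copied from the layout listing.
QWEN_SUPERVISED = [
    "no_kq", "qk_dh128", "qk_dh32", "qk_dh64", "qk_dh256", "qk_dh512",
    "qk_dh128_ln", "qk_dh128_ln_res", "qk_dh128_share_kq",
    "qk_dh128_eta_learn", "qk_dh128_mlp",
]
BASELINE_PAIR = ["no_kq", "qk_dh128"]

# (model, label mode) -> probe variants available for that combination.
LAYOUT = {
    ("qwen2.5-32b", "supervised"): QWEN_SUPERVISED,
    ("qwen2.5-32b", "consistent"): BASELINE_PAIR,
    ("qwq-32b", "supervised"): BASELINE_PAIR,
    ("llama-3.3-70b", "supervised"): BASELINE_PAIR,
}


def probe_dirs(root="probes"):
    """Return every probe directory under `root`, in layout order."""
    return [
        PurePosixPath(root, model, label_mode, variant)
        for (model, label_mode), variants in LAYOUT.items()
        for variant in variants
    ]


dirs = probe_dirs()
print(len(dirs))             # 17
print(dirs[0] / "probe.pt")  # probes/qwen2.5-32b/supervised/no_kq/probe.pt
```
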
Per probe directory:

| File           | Contents                                                                          |
|----------------|-----------------------------------------------------------------------------------|
| `probe.pt`     | State dict: `W0`, `b0`, `log_eta`; QK variants also include `theta_K`, `theta_Q` |
| `config.json`  | Training hyperparameters (`d_hidden`, `base_lr`, `epochs`, ...)                   |
| `lambdas.json` | LTT thresholds, keyed by delta                                                    |
| `metrics.json` | Step-level savings and error rate per delta                                       |
| `ood_*.json`   | Per-OOD-benchmark metrics (Qwen2.5-32B probes only)                               |

## Use

Probes are loaded by the `TTTProbe` class in the [ORCA repository](https://github.com/wzekai99/ORCA). Quick example:

```bash
# Download the probes and the ORCA dataset
hf download wzekai99/ORCA --local-dir probes
hf download wzekai99/ORCA --repo-type dataset --local-dir data

# Run code/test.py (from the ORCA repository) with the Qwen2.5-32B no_kq supervised probe
python code/test.py \
    --method ttt --no_kq \
    --dataset_path data/qwen2.5-32b/s1k.pkl \
                   data/qwen2.5-32b/openr1_2k.pkl \
                   data/qwen2.5-32b/deepmath_2k.pkl \
    --probe_path probes/qwen2.5-32b/supervised/no_kq/probe.pt \
    --label_mode supervised --delta 0.1 --epsilon 0.05
```

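Each probe directory ships a `lambdas.json` keyed by delta, so the `--delta 0.1` flag above presumably corresponds to one of its keys. The sketch below shows one way to look up an entry by delta. It is an illustration under assumptions, not the ORCA implementation: the exact JSON schema is not documented here, so the code assumes a flat object whose keys are delta values written as strings, and `threshold_for` is a hypothetical helper.

```python
import json
import math


def threshold_for(lambdas, delta, tol=1e-9):
    """Return the entry stored for `delta`, matching keys numerically.

    Numeric matching means "0.1" and "0.10" are treated as the same key.
    """
    for key, value in lambdas.items():
        if math.isclose(float(key), delta, abs_tol=tol):
            return value
    raise KeyError(f"no threshold stored for delta={delta}")


# Placeholder content standing in for a real lambdas.json. The numbers are
# made up for illustration and are not taken from any released probe.
example = json.loads('{"0.05": 0.9, "0.10": 0.8, "0.2": 0.7}')

print(threshold_for(example, 0.1))  # 0.8
```

With a downloaded probe, `example` would instead come from `json.load` on a file such as `probes/qwen2.5-32b/supervised/no_kq/lambdas.json`. If the stored values turn out to be nested rather than single numbers, the helper still returns whatever sits under the matching key.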
## License

MIT.

## Citation

```bibtex
@article{zhou2026online,
  title={Online Reasoning Calibration: Test-Time Training Enables Generalizable Conformal LLM Reasoning},
  author={Zhou, Cai and Wang, Zekai and Wu, Menghua and Zhu, Qianyu Julie and Shi, Flora C and Wang, Chenyu and Wilson, Ashia and Jaakkola, Tommi and Bates, Stephen},
  journal={arXiv preprint arXiv:2604.01170},
  year={2026}
}
```