---
license: mit
library_name: pytorch
tags:
- test-time-training
- conformal-prediction
- reasoning
- early-stopping
- llm
datasets:
- wzekai99/ORCA
---

# ORCA TTT-Probes

Trained Test-Time Training probes for *Online Reasoning Calibration: Test-Time Training Enables Generalizable Conformal LLM Reasoning* ([arXiv:2604.01170](https://arxiv.org/abs/2604.01170)).

## Layout (17 probes)

```
qwen2.5-32b/supervised/{no_kq, qk_dh128, qk_dh32, qk_dh64, qk_dh256, qk_dh512,
                        qk_dh128_ln, qk_dh128_ln_res, qk_dh128_share_kq,
                        qk_dh128_eta_learn, qk_dh128_mlp}/
qwen2.5-32b/consistent/{no_kq, qk_dh128}/
qwq-32b/supervised/{no_kq, qk_dh128}/
llama-3.3-70b/supervised/{no_kq, qk_dh128}/
```

Per probe directory:

| File           | Contents                                                               |
|----------------|------------------------------------------------------------------------|
| `probe.pt`     | State dict: W0, b0, log_eta; QK variants also include theta_K, theta_Q |
| `config.json`  | Training hyperparameters (d_hidden, base_lr, epochs, ...)              |
| `lambdas.json` | LTT thresholds, keyed by delta                                         |
| `metrics.json` | Step-level savings and error rate per delta                            |
| `ood_*.json`   | Per-OOD-benchmark metrics (Qwen2.5-32B probes only)                    |

## Use

Probes are loaded by the `TTTProbe` class in https://github.com/wzekai99/ORCA. Quick example:

```bash
hf download wzekai99/ORCA --local-dir probes
hf download wzekai99/ORCA --repo-type dataset --local-dir data

python code/test.py \
    --method ttt --no_kq \
    --dataset_path data/qwen2.5-32b/s1k.pkl \
                   data/qwen2.5-32b/openr1_2k.pkl \
                   data/qwen2.5-32b/deepmath_2k.pkl \
    --probe_path probes/qwen2.5-32b/supervised/no_kq/probe.pt \
    --label_mode supervised --delta 0.1 --epsilon 0.05
```

## License

MIT.
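
## Inspecting thresholds (sketch)

The layout table describes `lambdas.json` as LTT thresholds keyed by delta. The snippet below is a minimal sketch of looking up one threshold outside the ORCA codebase. It assumes, without confirmation from the repository, that the file is a flat JSON object whose keys are delta values serialized as strings (for example `"0.1"`); the helper name `load_threshold` is hypothetical and is not part of ORCA.

```python
import json
import math
from pathlib import Path


def load_threshold(probe_dir, delta):
    """Return the entry of lambdas.json whose key matches `delta`.

    Assumption: lambdas.json is a flat JSON object with delta values as
    string keys (e.g. "0.1"). The real schema may differ.
    """
    with open(Path(probe_dir) / "lambdas.json") as f:
        lambdas = json.load(f)
    for key, value in lambdas.items():
        try:
            # Compare numerically so "0.1" and "0.10" both match delta=0.1.
            if math.isclose(float(key), delta):
                return value
        except ValueError:
            continue  # skip keys that are not numeric
    raise KeyError(f"no threshold stored for delta={delta}")
```

With the download commands from the Use section, a call might look like `load_threshold("probes/qwen2.5-32b/supervised/no_kq", 0.1)`. If the actual file nests thresholds under additional keys, the lookup would need to be adapted.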

## Citation

```bibtex
@article{zhou2026online,
  title={Online Reasoning Calibration: Test-Time Training Enables Generalizable Conformal LLM Reasoning},
  author={Zhou, Cai and Wang, Zekai and Wu, Menghua and Zhu, Qianyu Julie and Shi, Flora C and Wang, Chenyu and Wilson, Ashia and Jaakkola, Tommi and Bates, Stephen},
  journal={arXiv preprint arXiv:2604.01170},
  year={2026}
}
```