# Determinism in Temporal Twins

## Summary

Temporal Twins uses deterministic seeding and deterministic runtime settings so that the generated matched-prefix datasets, audit counts, and benchmark metrics are reproducible across reruns of the same configuration and seed.

## Seeding

The benchmark runtime sets deterministic seeds for:

- Python `random`
- NumPy
- PyTorch
- CUDA via `torch.cuda.manual_seed_all(...)` when CUDA is available

Seeds derived from the difficulty and benchmark mode use a stable hash function rather than Python's built-in `hash()`, which is randomized per process for strings.
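
The seeding steps above can be sketched as follows. This is a minimal illustration, not the benchmark's actual code: the function names `stable_seed` and `seed_everything`, the choice of SHA-256, and the example labels `"hard"` and `"calibration"` are all assumptions made for this example. NumPy and PyTorch are seeded only if they are importable.

```python
import hashlib
import random


def stable_seed(base_seed: int, *parts: str) -> int:
    """Derive a 32-bit seed from a base seed and string labels.

    SHA-256 gives the same result in every process, unlike the built-in
    hash(), which is salted per process for str inputs.
    (Hypothetical helper; the benchmark's real hash function is not shown here.)
    """
    key = "|".join([str(base_seed), *parts]).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Keep 4 bytes so the value fits NumPy's legacy seed range [0, 2**32).
    return int.from_bytes(digest[:4], "big")


def seed_everything(seed: int) -> None:
    """Seed every RNG that is importable in the current environment."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


# Hypothetical difficulty and mode labels, used only for illustration.
derived = stable_seed(42, "hard", "calibration")
seed_everything(derived)
```

Because the derived seed depends only on its inputs, two separate Python processes calling `stable_seed(42, "hard", "calibration")` obtain the same value, which is not guaranteed for `hash(("hard", "calibration"))`.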

## Deterministic Torch Configuration

When supported by the runtime, the benchmark enables:

- `torch.backends.cudnn.deterministic = True`
- `torch.backends.cudnn.benchmark = False`
- `torch.use_deterministic_algorithms(True)`

The runtime also disables opportunistic nondeterministic math paths where practical and constrains CPU threading for repeatability.
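
A minimal sketch of applying the settings listed above is shown below. The function name `enable_deterministic_torch` and the default of one thread are assumptions for illustration; the benchmark's own thread count and any further switches it toggles are not stated here. The `CUBLAS_WORKSPACE_CONFIG` variable is included because PyTorch documents it as required for some CUDA operations when deterministic algorithms are enforced.

```python
import os


def enable_deterministic_torch(num_threads: int = 1) -> bool:
    """Apply deterministic PyTorch settings; return False if torch is absent.

    Hypothetical helper: the thread count is an assumed value.
    """
    # Some cuBLAS operations need this under use_deterministic_algorithms(True);
    # it has to be set before CUDA work starts.
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
    try:
        import torch
    except ImportError:
        return False
    torch.backends.cudnn.deterministic = True   # pick deterministic cuDNN kernels
    torch.backends.cudnn.benchmark = False      # disable autotuned kernel selection
    torch.use_deterministic_algorithms(True)    # raise on known nondeterministic ops
    torch.set_num_threads(num_threads)          # constrain intra-op CPU threading
    return True


applied = enable_deterministic_torch()
```

With `torch.use_deterministic_algorithms(True)`, operations that have no deterministic implementation raise a `RuntimeError` instead of silently producing run-to-run differences.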

## CPU Deterministic Mode

The paper suite was run in a CPU-oriented deterministic configuration. This configuration favors repeatability over throughput and is the recommended mode for artifact evaluation and paper reproduction.
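
One common way to force a CPU-only, single-threaded run is through environment variables set before NumPy or PyTorch is imported. The sketch below is an assumption about how such a mode could be configured, not a description of the benchmark's own launcher; the names `CPU_DETERMINISTIC_ENV` and `apply_cpu_deterministic_env` and the specific values are hypothetical.

```python
import os

# Hypothetical settings; the benchmark's actual values are not stated in this document.
CPU_DETERMINISTIC_ENV = {
    "CUDA_VISIBLE_DEVICES": "",  # hide GPUs so PyTorch falls back to CPU
    "OMP_NUM_THREADS": "1",      # single-threaded OpenMP kernels
    "MKL_NUM_THREADS": "1",      # single-threaded MKL kernels
}


def apply_cpu_deterministic_env(env=os.environ):
    """Write the settings into env; call before importing numpy or torch."""
    env.update(CPU_DETERMINISTIC_ENV)
    return env


# Demonstrated on a plain dict so the current process environment is untouched.
preview = apply_cpu_deterministic_env({})
```

Calling `apply_cpu_deterministic_env()` with no argument writes to `os.environ`, which only affects libraries imported afterwards.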

## Expected Reproducibility Behavior

- The generated matched-prefix dataset should be identical for the same benchmark mode, difficulty, and seed.
- Audit counts and shortcut AUCs should be identical for the same configuration and seed.
- Model metrics are expected to be identical or numerically indistinguishable when run under the same deterministic environment.
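
The expectations above can be checked mechanically after two runs. The sketch below is illustrative only: `fingerprint` and `metrics_match` are hypothetical helpers, the record layout is invented for the example, and the tolerances are assumed values rather than thresholds defined by the benchmark. Datasets are compared by exact hash, while metrics are compared within a small tolerance.

```python
import hashlib
import json
import math


def fingerprint(records) -> str:
    """SHA-256 of a canonical JSON encoding of JSON-serializable records."""
    payload = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def metrics_match(a: dict, b: dict, rel_tol: float = 1e-9, abs_tol: float = 1e-12) -> bool:
    """True when both runs report the same metric names with near-equal values."""
    if a.keys() != b.keys():
        return False
    return all(math.isclose(a[k], b[k], rel_tol=rel_tol, abs_tol=abs_tol) for k in a)


# Invented example records; key order differs but content is the same.
run_a = [{"id": 1, "label": 0}, {"id": 2, "label": 1}]
run_b = [{"label": 0, "id": 1}, {"label": 1, "id": 2}]
same_data = fingerprint(run_a) == fingerprint(run_b)
same_metrics = metrics_match({"auc": 0.5}, {"auc": 0.5 + 1e-13})
```

Sorting keys before hashing makes the fingerprint insensitive to dictionary ordering, so only a change in content alters the hash.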

## Runtime Tradeoff

Deterministic execution is slower than unconstrained training because it restricts thread-level and backend-level nondeterministic optimizations. The slowdown is expected and is most noticeable for larger non-fast calibration runs and the full paper suite.

## Hosted Resources

- Dataset URL: `https://huggingface.co/datasets/temporal-twins-benchmark/temporal-twins`
- Code repository URL: `https://huggingface.co/temporal-twins-benchmark/temporal-twins-code`
- Croissant metadata URL: `https://huggingface.co/datasets/temporal-twins-benchmark/temporal-twins/raw/main/metadata/temporal_twins_croissant.json`
- Paper or preprint: Not available during double-blind review; to be added after publication.