Archival release on Zenodo: 10.5281/zenodo.19451337

AX-CPT LLM Control: Minimal Public Release

Minimal public release for two AX-CPT-style language-model experiments, accuracy-based transition summaries, and exploratory sequence/embedding audits.

This repo may be useful for community exploration of simple prompt-level monitoring/reminder variants on a compact cognitive-control task.

AX-CPT is a cue-probe task in which AX is the target sequence and AY, BX, and BY are nontarget lures. In a conservative Braver-style proactive/reactive control framing, AY and BX trials are the most informative because they stress sustained target preparation and cue-context use or retrieval.
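The trial taxonomy above can be sketched as a small classifier. This is only an illustration; the release scripts implement their own trial logic:

```python
def trial_type(cue: str, probe: str) -> str:
    """Classify a cue-probe pair into AX, AY, BX, or BY.

    Any non-A cue counts as 'B' and any non-X probe as 'Y', following the
    standard AX-CPT convention.
    """
    return ("A" if cue == "A" else "B") + ("X" if probe == "X" else "Y")
```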

The release contains two experiment scripts:

  • axcpt_v4b.py: explicit AX-CPT task with a single-pass no-recheck control loop.
  • axcpt_v5_dcm.py: implicit pattern-learning AX-CPT task with a small Dual Control Manager (DCM) monitoring/reminder condition.

This release is intentionally small. DCM is treated as a conservative monitoring/intervention demo, not as evidence of intervention efficacy. Older exploratory scripts and messy intermediate outputs are preserved under archive/, while the public-facing raw data used for regenerated summaries lives under data/raw/.

Release packages:

  • Primary release: axcpt-llm-control-minimal-public-release-with-embedding-analysis.zip, including core scripts, raw data, generated accuracy summaries, deterministic sequence representations, exploratory embedding-analysis outputs, and figures.
  • Lightweight alternative: axcpt-llm-control-minimal-public-release.zip, omitting the generated embedding-analysis output directories while keeping the scripts needed to regenerate them.

Main Findings

These are descriptive summaries from the generated outputs, not causal claims.

  • In the v4b/v5 release data, v4b conditions were near ceiling on AX, AY, and BX accuracy, so accuracy-based transition deltas were mostly small. The v5 DCM runs showed larger BX error than matched BASE runs in these files: window 5 DCM 0.45 vs BASE 0.20, and window 10 DCM 0.55 vs BASE 0.25. These transition summaries are accuracy-based only, not reaction-time costs.
  • In the exploratory embedding audit, v4b vs v5 separation remained relatively robust after stricter leakage reduction, while apparent DCM vs BASE separation weakened substantially. In the strict condition-level comparison, mean cosine similarity was 0.8699 for v4b vs v5 and 0.9998 for v5 BASE vs DCM; this reflects serialized observable fields, not latent model states.
  • In a small v6 repeated-run process check with the current trigger held fixed, selective DCM was much sparser than dense DCM: mean invocation rate 0.1867 vs 0.8283 across three valid runs. It did not outperform BASE on BX error in that small batch: selective 0.3833, dense 0.4667, BASE 0.2167.
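The accuracy-based summaries above reduce to simple per-trial aggregation. A minimal sketch, assuming illustrative record fields (`trial_type`, `correct`) that may differ from the release CSV columns:

```python
def bx_error_rate(trials):
    """Fraction of BX trials answered incorrectly (NaN if no BX trials)."""
    bx = [t for t in trials if t["trial_type"] == "BX"]
    if not bx:
        return float("nan")
    return sum(1 for t in bx if not t["correct"]) / len(bx)

trials = [
    {"trial_type": "BX", "correct": False},
    {"trial_type": "BX", "correct": True},
    {"trial_type": "AX", "correct": True},
]
print(bx_error_rate(trials))  # 0.5
```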

Setup

Use Python 3.10+.

python -m venv .venv
# Windows PowerShell:
.\.venv\Scripts\Activate.ps1
# macOS/Linux:
# source .venv/bin/activate
pip install -r requirements.txt

For an environment snapshot from this machine, see requirements-frozen.txt. The frozen file is a reproducibility reference and may include local/Conda packages beyond this release; requirements.txt remains the minimal dependency list.

Create a local .env file in the project root before running the experiment scripts:

ANTHROPIC_API_KEY=your_api_key_here

The local .env file is ignored by git and should not be committed.

Run Experiments

python axcpt_v4b.py
python axcpt_v5_dcm.py

Both scripts write timestamped CSV and JSON files to the current working directory. Generated outputs are ignored by git.
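To pick up the newest timestamped output after a run, a small helper like the following can be used. The filename pattern is illustrative; check the scripts for the exact names they write:

```python
from pathlib import Path

def latest_csv(directory="."):
    """Return the most recently modified CSV in `directory`, or None."""
    files = sorted(Path(directory).glob("*.csv"), key=lambda p: p.stat().st_mtime)
    return files[-1] if files else None
```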

Rebuild Release Outputs

The release summaries are regenerated from the raw CSV/JSON files under data/raw/, not from a manually edited spreadsheet.

python scripts/rebuild_release_outputs.py

To also try writing an XLSX workbook:

python scripts/rebuild_release_outputs.py --xlsx

The workbook is optional. If openpyxl is not installed, the script still writes the CSV, JSONL, and metric availability report.
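The optional-dependency behavior described above follows the usual Python pattern; a minimal sketch (the rebuild script's exact handling may differ):

```python
# Treat openpyxl as optional: write the XLSX workbook only when it imports.
try:
    import openpyxl  # noqa: F401
    HAVE_OPENPYXL = True
except ImportError:
    HAVE_OPENPYXL = False
```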

Generated files go to outputs/:

  • clean_summary_v4b.csv
  • clean_summary_v5_dcm.csv
  • accuracy_transition_summary.csv
  • deterministic_time_series_features.csv
  • trial_level_representations.csv, .json, and .jsonl
  • sliding_window_representations.csv, .json, and .jsonl
  • condition_level_representations.csv, .json, and .jsonl
  • representation_construction.md
  • metric_availability_report.md
  • release_outputs.xlsx, only when requested and available

The representation files contain deterministic serialized trial, sliding-window, and condition-level sequences for downstream embedding or similarity analysis. They are not latent model embedding files and do not contain hidden states.

Construction is documented in outputs/representation_construction.md: rows are grouped by (dataset, condition), sorted by trial_idx, serialized into explicit key/value trial tokens, then joined into trailing windows of 5, 10, and 20 trials and full-condition sequences.
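The documented construction can be sketched as follows, with illustrative field names that may not match the release columns exactly:

```python
def serialize_trial(row, fields=("trial_idx", "trial_type", "correct")):
    """Serialize one trial as explicit key=value tokens."""
    return " ".join(f"{k}={row[k]}" for k in fields)

def trailing_windows(rows, size):
    """Yield serialized trailing windows of `size` trials, sorted by trial_idx."""
    rows = sorted(rows, key=lambda r: r["trial_idx"])
    for i in range(size - 1, len(rows)):
        yield " | ".join(serialize_trial(r) for r in rows[i - size + 1 : i + 1])
```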

Exploratory Embedding Analysis

To compute actual text-derived embeddings for the condition-level and sliding-window serialized representations:

python scripts/run_embedding_analysis.py

Outputs go to outputs/embedding_analysis/. This uses a local deterministic hashed token n-gram vectorizer implemented in scripts/run_embedding_analysis.py with numpy for cosine similarity and PCA. These are exploratory text embeddings of the serialized representations, not neural embeddings, latent model hidden states, logits, probabilities, costs, or reaction times.
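A hashed token n-gram vectorizer of the general kind described can be sketched in a few lines. This is a plain-Python illustration, not the release script's implementation, which may differ in hashing scheme, n-gram range, and dimensionality:

```python
import hashlib
import math

def hashed_ngram_vector(text, n=2, dim=256):
    """Deterministic hashed token n-gram count vector."""
    vec = [0.0] * dim
    tokens = text.split()
    for i in range(len(tokens) - n + 1):
        gram = " ".join(tokens[i : i + n])
        # hashlib is stable across runs, unlike Python's salted built-in hash().
        idx = int(hashlib.md5(gram.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    return vec

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```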

For a leakage-reduced check that removes explicit dataset labels, condition labels, and direct DCM indicator fields from the embedded text:

python scripts/run_leakage_reduced_embedding_analysis.py

Outputs go to outputs/embedding_analysis_leakage_reduced/; the exact removed fields are documented in leakage_reduced_masking_report.md.

For the stricter version that also removes explicit context_window tokens from the embedded text:

python scripts/run_strict_leakage_reduced_embedding_analysis.py

Outputs go to outputs/embedding_analysis_strict_leakage_reduced/.

Figures

Graphical abstract

Figure 1. AX-CPT task schematic

Figure 2. Accuracy-based transition structure

Figure 3. Initial vs strict leakage-reduced embedding projection

The transition figure is accuracy-based; it is not a reaction-time transition-cost plot.

Metric Availability

Computable from the available raw files:

  • trial counts by condition
  • AX hit rate, AY error rate, BX error rate, BY accuracy
  • context d-prime proxy from AX hits and BX false alarms
  • PBI-style error-bias index
  • invalid response rate
  • recheck rate for axcpt_v4b.py
  • DCM invocation rate for axcpt_v5_dcm.py
  • accuracy-only transition summaries by previous trial type and current trial type
  • deterministic trial-level time-series feature tables
  • serialized trial, sliding-window, and condition-level sequences for downstream embedding or similarity analysis

Not computable from the available raw files:

  • reaction-time transition costs, because no reaction-time column is present
  • true latent model embeddings, because no hidden-state vectors or embedding vectors are present
  • token-level probability/logit analyses, because token probabilities or logits were not logged
  • cost, latency, or API usage summaries, because request timing and token usage were not logged

The rebuild script writes a fresh outputs/metric_availability_report.md with the same distinction.
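Two of the computable indices listed above, the context d-prime proxy and the PBI-style error-bias index, can be sketched from rates alone. The rebuild script may apply corrections (e.g. clipping rates away from 0 and 1) that this illustration omits:

```python
from statistics import NormalDist

def dprime_proxy(ax_hit_rate, bx_fa_rate):
    """d-prime proxy: z(AX hit rate) - z(BX false-alarm rate).

    Assumes rates strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf
    return z(ax_hit_rate) - z(bx_fa_rate)

def pbi(ay_error_rate, bx_error_rate):
    """PBI-style error-bias index: (AY - BX) / (AY + BX) error rates.

    Positive values indicate relatively more AY than BX errors, consistent
    with a proactive control bias.
    """
    return (ay_error_rate - bx_error_rate) / (ay_error_rate + bx_error_rate)
```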

Repository Layout

.
  axcpt_v4b.py
  axcpt_v5_dcm.py
  requirements.txt
  requirements-frozen.txt
  data/raw/
    axcpt_v4b/
    axcpt_v5_dcm/
  figures/
  scripts/
    rebuild_release_outputs.py
    run_embedding_analysis.py
    run_leakage_reduced_embedding_analysis.py
    run_strict_leakage_reduced_embedding_analysis.py
    make_release_figures.py
  archive/
    exploratory_scripts/
    other_outputs/
    misc/