Adam Qwen3-8B AO Reupload: Full Mix Synthetic QA v3 Replace LQA

This repo is ceselder's re-upload of Adam Karvonen's checkpoints_Qwen3-8B_full_mix_synthetic_qa_v3_replace_lqa adapter, with a concrete model card derived from the bundled ao_config.json.

What This Checkpoint Is

  • Source repo: adamkarvonen/checkpoints_Qwen3-8B_full_mix_synthetic_qa_v3_replace_lqa
  • Base model: Qwen/Qwen3-8B
  • Adapter format: PEFT LoRA
  • Hook layer: 1
  • Activation readout layers in ao_config.json: joint 3-layer readout [9, 18, 27] corresponding to 25/50/75% depth
  • Seed: 42
  • Learning rate: 1e-5
  • Train batch size: 16
  • Epochs: 1
  • Exact token count: not documented in the source repo model card as of 2026-03-30
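The 25/50/75% depth figures follow from the layer count of the base model. As a quick sanity check (the layer count of 36 is a property of Qwen/Qwen3-8B, not something stated in the bundled config):

```python
# Sanity check: readout layers [9, 18, 27] as fractions of Qwen3-8B's
# 36 transformer layers (layer count taken from the base model, not
# from this card).
NUM_LAYERS = 36
readout_layers = [9, 18, 27]
depth_fractions = [layer / NUM_LAYERS for layer in readout_layers]
print(depth_fractions)  # [0.25, 0.5, 0.75]
```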

Best-Available Training Mixture

The checkpoint's ao_config.json lists the following dataset loaders:

  • past_lens single-activation loader: num_train=67000, max_k_activations=1, directions=[past,future], max_length=2000
  • past_lens multi-activation loader: num_train=67000, max_k_activations=50, directions=[past,future], max_length=2000
  • synthetic_qa: num_train=199082, data_path=datasets/training_data/artifacts/synthetic_qa_2gpu_100k/training_data.json
  • classification families, each with a single-token loader (max_window_size=1) and a multi-token loader (max_window_size=50), generally with num_train=1588:
    • geometry_of_truth
    • relations
    • sst2
    • md_gender
    • snli
    • ag_news
    • ner
    • tense
    • language_identification
    • singular_plural (listed with num_train=0)

Interpretation

  • The checkpoint name and bundled config both indicate that this is a full-mix run where synthetic_qa_v3 replaces LatentQA.
  • Unlike Adam's older default Qwen3-8B checkpoint, the bundled config here uses a joint 3-layer readout instead of separate single-layer variants.
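A joint multi-layer readout captures activations at several depths in one forward pass and feeds their concatenation to a single readout, rather than training one readout per layer. A schematic illustration in plain Python (a toy layer stack, not the actual activation-oracle training code):

```python
# Schematic: collect hidden states at readout depths [9, 18, 27] while
# running a toy "layer stack", then concatenate them for a joint
# readout. Illustrative only; the real pipeline hooks a transformer.
READOUT_LAYERS = {9, 18, 27}

def run_with_readout(hidden, layers):
    """Apply each layer in sequence, capturing outputs at readout depths."""
    captured = []
    for depth, layer in enumerate(layers, start=1):
        hidden = layer(hidden)
        if depth in READOUT_LAYERS:
            captured.append(hidden)
    # The joint readout sees the concatenation of all captured activations.
    joint = [x for activation in captured for x in activation]
    return hidden, joint

# Toy stack: 36 "layers" that each add 1 to every element.
layers = [lambda h: [x + 1 for x in h]] * 36
final, joint = run_with_readout([0.0, 0.0], layers)
print(final)      # [36.0, 36.0]
print(len(joint)) # 6  (3 readout layers x hidden size 2)
```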

Notes

  • This card is based on the source repo's bundled ao_config.json because the original README was still a placeholder.
  • This repo mirrors the source adapter files and ao_config.json, with only the README replaced.