jaxgmg2_alpha1_patt
Checkpoints of RL agents from al_1.0_g_0.98_x100 retrained using patterning: a technique that
perturbs the training distribution by modifying the state visitation distribution Lambda.
Base model: al_1.0_g_0.98_x100/al_1.0_g_0.98_id_19_seed_981019.
Patterning formula:
pattern_env = (1 - t) * Lambda + t * delta_Lambda / ||delta_Lambda||_1
where t is the mixing weight (patt_t) and delta_Lambda is derived from susceptibility analysis. In mp-inv mode, delta_Lambda = chi+ @ delta, where chi+ denotes the Moore-Penrose pseudo-inverse of chi.
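As a rough illustration of the mixing step only, the sketch below applies the formula literally to plain Python lists. The names `l1_norm` and `pattern_distribution` and the toy numbers are hypothetical and do not come from the training code; the sketch assumes Lambda and delta_Lambda are flat vectors of equal length and does not compute chi+ @ delta. It also does not clip or renormalise, so if delta_Lambda has negative entries the result may not be a valid distribution.

```python
def l1_norm(values):
    """Sum of absolute values, i.e. ||v||_1."""
    return sum(abs(v) for v in values)


def pattern_distribution(lam, delta_lam, t):
    """Return (1 - t) * Lambda + t * delta_Lambda / ||delta_Lambda||_1.

    lam, delta_lam: equal-length sequences of floats (assumed flat vectors).
    t: mixing weight, e.g. one of the patt_t values in the sweep.
    """
    if len(lam) != len(delta_lam):
        raise ValueError("lam and delta_lam must have the same length")
    norm = l1_norm(delta_lam)
    if norm == 0:
        raise ValueError("delta_lam has zero L1 norm; cannot normalise")
    return [(1 - t) * p + t * d / norm for p, d in zip(lam, delta_lam)]


# Toy example (hypothetical values): a uniform Lambda over four states.
lam = [0.25, 0.25, 0.25, 0.25]
delta_lam = [2.0, 1.0, 1.0, 0.0]
mixed = pattern_distribution(lam, delta_lam, t=0.1)
```

With t = 0 the function returns Lambda unchanged, and with t = 1 it returns the L1-normalised delta_Lambda, so t interpolates between the two.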
WandB: https://wandb.ai/devinterp/jaxgmg2_patt
Sweep
patt_t sweep: 0.05, 0.1, 0.3 (3 runs).
Shared Hyperparams
rl_action=train
alpha=1.0
discount_rate=0.98
lr=5e-05
num_total_env_steps=7372800000
num_rollout_steps=64
num_levels=9600
env_layout=open
mask_type=first_episode
use_prev_action=False
grad_acc_per_chunk=4
log_optimizer_state=True
patt_mode=mp-inv
patt_cluster=east
suscept_id=2000
resume=jaxgmg2_3phase_optim_state/al_1.0_g_0.98_id_19_seed_981019
resume_id=175
resume_optim=True
eval_schedule=0:1,250:2,500:5,2000:10
wandb_project=jaxgmg2_patt
use_wandb=True
use_hf=True
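For scripting against these settings, the key=value lines above can be read into a dictionary with a few lines of Python. This is a minimal sketch, not the loader used by the training code: `parse_overrides` and `parse_eval_schedule` are hypothetical helper names, values are kept as strings, and eval_schedule is only split into integer pairs because the meaning of each pair is not stated here.

```python
def parse_overrides(lines):
    """Turn 'key=value' lines into a dict of strings (no type conversion)."""
    params = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got: {line!r}")
        params[key] = value
    return params


def parse_eval_schedule(spec):
    """Split 'a:b,c:d,...' into a list of (int, int) pairs."""
    pairs = []
    for item in spec.split(","):
        left, right = item.split(":")
        pairs.append((int(left), int(right)))
    return pairs


# A few of the shared hyperparameters listed above.
params = parse_overrides([
    "alpha=1.0",
    "discount_rate=0.98",
    "patt_mode=mp-inv",
    "eval_schedule=0:1,250:2,500:5,2000:10",
])
schedule = parse_eval_schedule(params["eval_schedule"])
```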
Naming Schema
Checkpoints follow the pattern:
al_1.0_g_0.98_id_{run_id}_seed_{seed}_patt_mp-inv_t_{t}_ld-opt_1_ckpt_{ckpt}_sus_2000_nbeta:1000_perturbation_type:init_state
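The pattern above can be filled in programmatically, for example to list the checkpoint names for the three patt_t values. The sketch below is an assumption-laden illustration: `checkpoint_name` is a hypothetical helper, the checkpoint index 100 is a made-up placeholder, and t is passed as a string because the exact number formatting used in the real names is not stated here.

```python
# Template copied from the naming schema; only the four placeholders vary.
TEMPLATE = (
    "al_1.0_g_0.98_id_{run_id}_seed_{seed}_patt_mp-inv_t_{t}"
    "_ld-opt_1_ckpt_{ckpt}_sus_2000_nbeta:1000_perturbation_type:init_state"
)


def checkpoint_name(run_id, seed, t, ckpt):
    """Fill the naming schema; all arguments are inserted via str()."""
    return TEMPLATE.format(run_id=run_id, seed=seed, t=t, ckpt=ckpt)


# run_id and seed are taken from the base model; ckpt=100 is a placeholder.
names = [
    checkpoint_name(run_id=19, seed=981019, t=t, ckpt=100)
    for t in ("0.05", "0.1", "0.3")
]
```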
Reproduced with
See train.yaml in this repository. Run it from the timaeus monorepo.