jaxgmg2_mixture_east

Note: Einar trained these models and the description below is uncertain.

~243 RL agent checkpoints trained on the JaxGMG maze environment with a mixture training distribution: a blend of uniform state visitation and a biased distribution where the mouse starts east of the cheese. Used to study how training distribution mixtures affect learned behaviour and phase transitions.
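
As a rough illustration of what a two-component training mixture means, the sketch below assigns each training level to either the uniform component or the east-biased component with a fixed probability. This is not the JaxGMG implementation: the function names, the per-level Bernoulli draw, and the use of Python's `random` module are all assumptions made for illustration. Only the 80/20 ratio and `num_levels=9600` come from this card.

```python
import random


def sample_component(rng: random.Random, p_uniform: float) -> str:
    """Pick the mixture component for one level (hypothetical helper)."""
    if not 0.0 <= p_uniform <= 1.0:
        raise ValueError("p_uniform must be in [0, 1]")
    # With probability p_uniform use the uniform component, otherwise the
    # east-biased one. The real sampling rule in JaxGMG may differ.
    return "uniform" if rng.random() < p_uniform else "east"


def count_components(num_levels: int, p_uniform: float, seed: int = 0) -> dict:
    """Count how many of num_levels levels fall into each component."""
    rng = random.Random(seed)
    counts = {"uniform": 0, "east": 0}
    for _ in range(num_levels):
        counts[sample_component(rng, p_uniform)] += 1
    return counts


# Example: the 80% uniform + 20% east group with num_levels=9600.
counts = count_components(num_levels=9600, p_uniform=0.8)
print(counts)
```

Under this assumption the realised split fluctuates around the nominal ratio from seed to seed; how the actual `env_rule_mixture` parameter is interpreted is defined by the training code and yamls.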

WandB: https://wandb.ai/devinterp/jaxgmg2_mixture

Contents

This repository contains checkpoints from several distinct experiments with different mixture ratios:

| Group | Mixture | alpha | Steps | Seeds | YAML |
|---|---|---|---|---|---|
| mixture_90uniform_10east_* | 90% uniform + 10% east | 0.0 | 10B | 42-151 | (no yaml found) |
| mixture_85uniform_15east_* | 85% uniform + 15% east | 1.0 | 2B | 152-161 | mixture_train_15_east.yaml |
| mixture_80uniform_20east_* | 80% uniform + 20% east | 1.0 | 2B | 162-361 | mixture_train_20_east.yaml, mixture_train_20_east_x90.yaml |
| mixture_70uniform_30east_* | 70% uniform + 30% east | 1.0 | 2B | 172-271 | mixture_train_30_east.yaml, mixture_train_30_east_x90.yaml |
| mixture_60uniform_40east_* | 60% uniform + 40% east | 1.0 | 2B | 182-191 | mixture_train_40_east.yaml |
| mixture_100uniform_* | 100% uniform (control) | - | - | 52-54 | (no yaml found) |

Note: The 90/10 group uses alpha=0.0 and cheese_loc=corner, so it comes from an earlier or different experiment. The 100uniform control group is also from an earlier experiment, and its hyperparameters are uncertain. All other groups use alpha=1.0 and cheese_loc=any.
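
To fetch a single group rather than the whole repository, something like the following sketch may work. It assumes the repository id is `timaeus/jaxgmg2_mixture_east`, that `huggingface_hub` is installed, and that checkpoint paths in the repository begin with the group prefixes listed above; none of this has been verified against the actual file layout. `group_pattern` and `download_group` are hypothetical helper names.

```python
def group_pattern(p_uniform: int) -> str:
    """Build the glob pattern for one group, following the group prefixes above."""
    p_east = 100 - p_uniform
    if p_east == 0:
        # The control group prefix has no east component in its name.
        return "mixture_100uniform_*"
    return f"mixture_{p_uniform}uniform_{p_east}east_*"


def download_group(p_uniform: int,
                   repo_id: str = "timaeus/jaxgmg2_mixture_east") -> str:
    """Download only the files matching one group; returns the local path.

    The default repo_id is an assumption. Requires `pip install huggingface_hub`
    and network access.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        allow_patterns=[group_pattern(p_uniform)],
    )


print(group_pattern(80))
print(group_pattern(100))
```

Calling `download_group(80)` would then restrict the download to the 80/20 group, provided the assumptions above hold.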

Shared Hyperparams (for 85/15 through 60/40 groups)

rl_action=train
alpha=1.0
discount_rate=0.98
lr=5e-05
num_total_env_steps=2000000000
num_rollout_steps=64
num_levels=9600
cheese_loc=any
env_layout=open
env_size=13
log_optimizer_state=True
ckpt_dir=jaxgmg2_mixture_east
wandb_project=jaxgmg2_mixture
use_wandb=True
use_hf=True

The env_rule_mixture parameter varies per group (see individual yamls).

Naming Schema

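Checkpoints are named mixture_{p}uniform_{100-p}east_seed_{seed}, where p is the percentage of the uniform component and seed is the training seed. The 100uniform control group has no east component in its prefix (mixture_100uniform_*).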
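
The naming schema can be turned into a small helper for building and parsing checkpoint names. This is a sketch under assumptions: the helper names are hypothetical, the seed ranges in the table are treated as inclusive, and the control group (whose exact suffix is not documented here) is deliberately not handled.

```python
import re

# Matches names of the form mixture_{p}uniform_{100-p}east_seed_{seed}.
NAME_RE = re.compile(r"^mixture_(\d+)uniform_(\d+)east_seed_(\d+)$")


def checkpoint_name(p_uniform: int, seed: int) -> str:
    """Build a checkpoint name from the uniform percentage and the seed."""
    return f"mixture_{p_uniform}uniform_{100 - p_uniform}east_seed_{seed}"


def parse_checkpoint_name(name: str) -> dict:
    """Split a checkpoint name into its mixture percentages and seed."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a mixture checkpoint name: {name!r}")
    p_uniform, p_east, seed = (int(g) for g in match.groups())
    if p_uniform + p_east != 100:
        raise ValueError(f"percentages do not sum to 100 in {name!r}")
    return {"p_uniform": p_uniform, "p_east": p_east, "seed": seed}


# Example: expected names for the 85/15 group, assuming seeds 152-161 inclusive.
names_85_15 = [checkpoint_name(85, seed) for seed in range(152, 162)]
print(names_85_15[0])
print(parse_checkpoint_name(names_85_15[-1]))
```

Whether every name generated this way actually exists in the repository is not guaranteed; the list is only what the schema and the table imply.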

Reproduced with

Run the yamls from the rl/einar/pattern-merge-runpod branch of the timaeus monorepo:

timaeus run mixture_train_15_east.yaml
timaeus run mixture_train_20_east.yaml
timaeus run mixture_train_20_east_x90.yaml
timaeus run mixture_train_30_east.yaml
timaeus run mixture_train_30_east_x90.yaml
timaeus run mixture_train_40_east.yaml

