Not strictly obsolete, but reduced precision was found to add noise to the training dynamics, so this line of experiments was discontinued.

A series of training runs with alpha=0.47 and gamma (discount_rate)=0.99, using bfloat16 reduced precision. We found that these training runs were not nearly as stable (wandb here), so this path was abandoned. The models are kept for posterity.
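As a rough illustration of why bfloat16 can perturb training, the sketch below emulates bfloat16 rounding in plain Python and applies one small update to a single number. It is not a claim about how `use_bf16` is implemented in this codebase or about what caused the instability in these runs. The helper `to_bf16` and the values of `weight` and `grad` are hypothetical; only `lr` is taken from the hyperparameters of these runs.

```python
import struct

def to_bf16(x):
    """Round x to the nearest bfloat16 value and return it as a Python float.

    bfloat16 keeps the top 16 bits of an IEEE float32, so we round the
    float32 bit pattern to 16 bits (round-to-nearest-even).
    NaN and infinity are not handled in this sketch.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (bits + rounding_bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]

lr = 5e-05      # learning rate used in these runs
weight = 1.0    # hypothetical parameter value
grad = 1.0      # hypothetical gradient value

# The same update step, once in full precision and once with every
# intermediate value rounded to bfloat16.
full_precision = weight - lr * grad
reduced = to_bf16(to_bf16(weight) - to_bf16(lr * grad))

print(full_precision)
print(reduced)
```

In this toy case the bfloat16 result is still exactly 1.0, because the update is smaller than half the spacing between neighbouring bfloat16 values just below 1.0. Whether an update survives rounding then depends on the magnitude of the value being updated, which is one plausible source of extra noise.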

Hyperparams:

rl_action=train
num_rollout_steps=64
lr=5e-05
discount_rate=0.99
eff_horizon=None
eval_every=1
use_wandb=True
use_hf=True
use_log=True
num_total_env_steps=5000000000
checkpoint=al_0.47_g_0.99_100_bf16
render_sixel=True
sixel_loc=(7, 7)
seed=100
mask_type=first_episode
penalize_time=False
optim=adam
live_monitor=False
use_bf16=True
checkpoint_schedule=0:8
grad_acc_per_chunk=16
num_rollout_chunks=1
cheese_loc=any
env_layout=open
alpha=0.47
env_size=13
num_levels=9600
f_str_ckpt=al_{alpha}_g_{discount_rate}_{seed}_bf16
wandb_project=jaxgmg_3phase_bf16
ckpt_dir=jaxgmg_3phase_bf16
duplication_factor=-1
smoke=False
num_chains=6
num_draws=3000
on_policy=True
nbeta=3000
localization=10
exact_solver_each_draw=False
llc_optimizer=sgld
iw_clip_eps=None
rmsprop_burnin=20
llc_data_file=llc_scan_open_reinforce.pkl
llc_checkpoint_index=0
repo_id=davidquarel/jaxgmg_ckpt_zip
use_shuffled_checkpoints=0
force_re_download=False
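
The `f_str_ckpt` entry looks like a template for the `checkpoint` name. The sketch below parses a few of the key=value lines above and fills the template with `str.format`. That the training code builds the name this way is an assumption, and `parse_value` and `parse_dump` are hypothetical helpers, not part of the project.

```python
import ast

# A few lines copied from the hyperparameter dump above.
DUMP = """\
seed=100
discount_rate=0.99
alpha=0.47
use_bf16=True
eff_horizon=None
sixel_loc=(7, 7)
cheese_loc=any
checkpoint=al_0.47_g_0.99_100_bf16
f_str_ckpt=al_{alpha}_g_{discount_rate}_{seed}_bf16
"""

def parse_value(text):
    """Return a Python literal when possible, otherwise the raw string."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text

def parse_dump(dump):
    """Parse key=value lines into a dict, splitting on the first '='."""
    config = {}
    for line in dump.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            config[key.strip()] = parse_value(value.strip())
    return config

config = parse_dump(DUMP)

# Fill the template with the parsed values (assumed mechanism).
ckpt_name = config["f_str_ckpt"].format(**config)
print(ckpt_name)
```

With these values the filled template matches the `checkpoint` entry in the list, which is consistent with the assumption but does not confirm it.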

Collection including timaeus/jaxgmg_3phase_bf16: Project: RL1/RL2 (obsolete)

Older models that are no longer useful for anything in RL1 or RL2, or that are unused now that experimentation has been discontinued.