policyINR Checkpoints
Companion checkpoint repository for andrewkang12345/policyINR: robust policy representation learning from offline data.
The folder layout mirrors the outputs/ tree of the source repo:
<domain>/<suite>/<run>/best.pt
<domain>/<suite>/<run>/last.pt
where <domain> ∈ {lichess, droid, fastf1, mujoco, syntheticgrf, dmlabseekavoid}.
Every per-run directory on the GitHub side carries the matching config.yaml,
metrics.jsonl, summary.json, and eval.json. Clone the source repo and the run
identifiers line up one-to-one with the paths here.
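The one-to-one mapping can be sketched as a small path helper. This is only an illustration: the function name `metadata_paths` is hypothetical, it assumes a checkpoint path has exactly the four components `<domain>/<suite>/<run>/<file>`, and it assumes the source repo keeps its runs under `outputs/` as described above.

```python
from pathlib import PurePosixPath

# File names this README lists as living next to each run on the GitHub side.
METADATA_FILES = ("config.yaml", "metrics.jsonl", "summary.json", "eval.json")


def metadata_paths(checkpoint_path, outputs_root="outputs"):
    """Map <domain>/<suite>/<run>/<file>.pt to the run's metadata paths.

    Assumes exactly four path components; anything else is rejected.
    """
    parts = PurePosixPath(checkpoint_path).parts
    if len(parts) != 4:
        raise ValueError(f"expected <domain>/<suite>/<run>/<file>, got {checkpoint_path!r}")
    domain, suite, run, _filename = parts
    run_dir = PurePosixPath(outputs_root, domain, suite, run)
    return {name: str(run_dir / name) for name in METADATA_FILES}
```

For example, `metadata_paths("lichess/2x_hk240/demo_run/best.pt")` (where `demo_run` is a made-up run name) returns a dict whose `"config.yaml"` entry points at the same run directory inside `outputs/`.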
Download example
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="andrewkang12345/policyINR-checkpoints",
    repo_type="model",
    allow_patterns=["lichess/2x_hk240/**/*.pt"],
    local_dir="outputs",
)
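After the download, it can help to see which runs actually landed on disk. A minimal sketch, assuming the files were written under `local_dir="outputs"` as in the example above; `find_checkpoints` is a hypothetical helper, not part of the source repo or of `huggingface_hub`.

```python
from pathlib import Path


def find_checkpoints(root="outputs", filename="best.pt"):
    """Return the run directories under `root` that contain `filename`.

    Paths are returned relative to `root`, in POSIX form, sorted by name.
    """
    root = Path(root)
    return sorted(
        ckpt.parent.relative_to(root).as_posix()
        for ckpt in root.rglob(filename)
    )
```

Calling `find_checkpoints("outputs")` lists runs that have a `best.pt`; pass `filename="last.pt"` to list runs by their final checkpoint instead.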
Domains
| domain | what it is |
|---|---|
| lichess | Top-3 GM Lichess games; discrete UCI-move action |
| droid | DROID lowdim teleop; continuous action across collectors |
| fastf1 | F1 stint telemetry; driver-as-policy |
| mujoco | Custom MuJoCo + Minari baselines + state/action-resampled suites |
| syntheticgrf | Synthetic Gaussian random-field policies (sanity / smoke) |
| dmlabseekavoid | DMLab seekavoid_arena_01 from RL Unplugged; discrete action |
Models in each run name
<data>__<model>__<experiment>__s<seed>/
- cvae: bag-of-pairs CVAE
- inr_transformer_history_conditioned: INR-Transformer w/ history
- inr_diffusion_history_conditioned: INR-Diffusion w/ history
- inr_transformer_fitted_latent: per-unit fitted latent codes
- inr_transformer_infer_latent[_maml]: meta-learned per-unit latent (10-step inner adapt + early stopping)
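The run-name pattern above can be split back into its fields with a few lines of Python. This is a sketch under the assumption that no field itself contains a double underscore (the model names listed here use single underscores only); `RunName` and `parse_run_name` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class RunName(NamedTuple):
    data: str
    model: str
    experiment: str
    seed: int


def parse_run_name(name):
    """Split '<data>__<model>__<experiment>__s<seed>' into its four fields."""
    parts = name.rstrip("/").split("__")
    # Expect exactly four fields, the last one being 's' followed by digits.
    if len(parts) != 4 or not parts[3].startswith("s") or not parts[3][1:].isdigit():
        raise ValueError(f"not a <data>__<model>__<experiment>__s<seed> name: {name!r}")
    data, model, experiment, seed = parts
    return RunName(data, model, experiment, int(seed[1:]))
```

With a made-up directory name such as `demo_data__cvae__baseline__s0/`, the `model` field comes back as `cvae` and can be matched against the list above.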
License
MIT (mirrors the source repo).