policyINR - Checkpoints

Companion checkpoint repository for andrewkang12345/policyINR β€” robust policy representation learning from offline data.

The folder layout mirrors the outputs/ tree of the source repo:

<domain>/<suite>/<run>/best.pt
                        last.pt

where <domain> ∈ {lichess, droid, fastf1, mujoco, syntheticgrf, dmlabseekavoid}.
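
The layout above can be checked mechanically. The following is a minimal sketch, not part of the source repo: the helper name parse_checkpoint_path and the CheckpointPath type are hypothetical, and it assumes every checkpoint sits exactly four path components deep, as the layout suggests.

```python
from pathlib import PurePosixPath
from typing import NamedTuple

# Domain names as listed in this README.
DOMAINS = {"lichess", "droid", "fastf1", "mujoco", "syntheticgrf", "dmlabseekavoid"}
CHECKPOINT_FILES = {"best.pt", "last.pt"}


class CheckpointPath(NamedTuple):  # hypothetical helper type
    domain: str
    suite: str
    run: str
    filename: str


def parse_checkpoint_path(path: str) -> CheckpointPath:
    """Split '<domain>/<suite>/<run>/<file>' into its four components."""
    parts = PurePosixPath(path).parts
    if len(parts) != 4:
        raise ValueError(f"expected <domain>/<suite>/<run>/<file>, got {path!r}")
    domain, suite, run, filename = parts
    if domain not in DOMAINS:
        raise ValueError(f"unknown domain {domain!r}")
    if filename not in CHECKPOINT_FILES:
        raise ValueError(f"unexpected checkpoint file {filename!r}")
    return CheckpointPath(domain, suite, run, filename)
```

For example, parse_checkpoint_path("lichess/2x_hk240/some_run/best.pt") yields domain "lichess" and suite "2x_hk240"; the run name some_run is a placeholder, not a real run.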

Every per-run directory on the GitHub side carries the matching config.yaml, metrics.jsonl, summary.json, and eval.json. Clone the repo and the run identifiers line up one-to-one with the paths here.

Download example

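from huggingface_hub import snapshot_download

# Fetch only the .pt files under lichess/2x_hk240 and place them in ./outputs,
# so the local paths mirror the outputs/ tree of the source repo.
snapshot_download(
    repo_id="andrewkang12345/policyINR-checkpoints",
    repo_type="model",
    allow_patterns=["lichess/2x_hk240/**/*.pt"],
    local_dir="outputs",
)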
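
To restrict a download to other domains, suites, or only best.pt, the allow_patterns list can be generated rather than typed by hand. This is a minimal sketch: checkpoint_patterns is a hypothetical helper, and the local check uses Python's fnmatch on the assumption that it approximates how huggingface_hub filters file names.

```python
from fnmatch import fnmatchcase
from typing import List, Optional, Sequence


def checkpoint_patterns(
    domain: str,
    suite: Optional[str] = None,
    which: Sequence[str] = ("best.pt",),
) -> List[str]:
    """Build glob patterns for <domain>/<suite>/<run>/<file>.

    suite=None selects every suite in the domain.
    """
    suite_part = suite if suite is not None else "*"
    return [f"{domain}/{suite_part}/**/{name}" for name in which]


patterns = checkpoint_patterns("lichess", "2x_hk240")

# Sanity-check the pattern locally against placeholder paths (some_run is hypothetical).
keeps_best = fnmatchcase("lichess/2x_hk240/some_run/best.pt", patterns[0])
keeps_last = fnmatchcase("lichess/2x_hk240/some_run/last.pt", patterns[0])
```

The resulting list can be passed as allow_patterns to snapshot_download; with the default which=("best.pt",), last.pt files are skipped.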

Domains

domain          what it is
lichess         Top-3 GM Lichess games, discrete UCI-move action
droid           DROID lowdim teleop, continuous action across collectors
fastf1          F1 stint telemetry, driver-as-policy
mujoco          Custom MuJoCo + Minari baselines + state/action-resampled suites
syntheticgrf    Synthetic Gaussian random-field policies (sanity / smoke)
dmlabseekavoid  DMLab seekavoid_arena_01 from RL Unplugged, discrete action

Models in each run name

Run directories are named

<data>__<model>__<experiment>__s<seed>/

where <model> is one of:

  • cvae β€” bag-of-pairs CVAE
  • inr_transformer_history_conditioned β€” INR-Transformer w/ history
  • inr_diffusion_history_conditioned β€” INR-Diffusion w/ history
  • inr_transformer_fitted_latent β€” per-unit fitted latent codes
  • inr_transformer_infer_latent[_maml] β€” meta-learned per-unit latent (10-step inner adapt + early stopping)
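
The naming scheme above can be split back into its fields. The sketch below is illustrative only: parse_run_name and RunName are hypothetical names, and it assumes that no individual field contains a double underscore and that the seed is a non-negative integer.

```python
from typing import NamedTuple

# Model identifiers as listed in this README ([_maml] expanded to both variants).
KNOWN_MODELS = {
    "cvae",
    "inr_transformer_history_conditioned",
    "inr_diffusion_history_conditioned",
    "inr_transformer_fitted_latent",
    "inr_transformer_infer_latent",
    "inr_transformer_infer_latent_maml",
}


class RunName(NamedTuple):  # hypothetical helper type
    data: str
    model: str
    experiment: str
    seed: int


def parse_run_name(name: str) -> RunName:
    """Parse '<data>__<model>__<experiment>__s<seed>' (trailing '/' allowed)."""
    parts = name.rstrip("/").split("__")
    if len(parts) != 4:
        raise ValueError(f"expected four '__'-separated fields, got {name!r}")
    data, model, experiment, seed_part = parts
    if not seed_part.startswith("s") or not seed_part[1:].isdigit():
        raise ValueError(f"bad seed field {seed_part!r}")
    return RunName(data, model, experiment, int(seed_part[1:]))
```

For a placeholder name such as "mydata__cvae__baseline__s0/", this returns the fields mydata, cvae, baseline, and seed 0; membership in KNOWN_MODELS can then be used to flag unexpected model identifiers.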

License

MIT (mirrors the source repo).
