| --- |
| license: mit |
| library_name: pytorch |
| tags: |
| - representation-learning |
| - offline-rl |
| - policy-representation |
| - inr |
| - cvae |
| - meta-learning |
| - mujoco |
| - lichess |
| - droid |
| - fastf1 |
| - dmlab |
| --- |
| |
| # policyINR Checkpoints |
|
|
| Companion checkpoint repository for **[andrewkang12345/policyINR](https://github.com/andrewkang12345/policyINR)**: |
| robust policy representation learning from offline data. |
|
|
| The folder layout mirrors the `outputs/` tree of the source repo: |
|
|
| ``` |
| <domain>/<suite>/<run>/best.pt |
| <domain>/<suite>/<run>/last.pt |
| ``` |
|
|
| where `<domain> ∈ {lichess, droid, fastf1, mujoco, syntheticgrf, |
| dmlabseekavoid}`. |
|
|
| Every per-run directory on the GitHub side carries the matching `config.yaml`, |
| `metrics.jsonl`, `summary.json`, and `eval.json`. Clone the repo and the run |
| identifiers line up one-to-one with the paths here. |
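Because the checkpoint tree and the source repo's `outputs/` tree share the same relative paths, a downloaded checkpoint can be mapped to its per-run files with plain path arithmetic. The sketch below is only an illustration: `companion_paths`, `ckpt_root`, and `repo_outputs` are hypothetical names, and it assumes you have both a local checkpoint download and a local clone of the source repo.

```python
from pathlib import Path

# File names this README lists as living next to each run on the GitHub side.
COMPANION_FILES = ("config.yaml", "metrics.jsonl", "summary.json", "eval.json")


def companion_paths(ckpt_path, ckpt_root, repo_outputs):
    """Map a downloaded checkpoint to the per-run files in a cloned repo.

    ckpt_root    -- directory the checkpoints were downloaded into (assumption)
    repo_outputs -- the outputs/ directory of a local clone (assumption)
    """
    # Relative location of the checkpoint, e.g. <domain>/<suite>/<run>/best.pt
    rel = Path(ckpt_path).relative_to(ckpt_root)
    # The run directory has the same relative path in the cloned repo.
    run_dir = Path(repo_outputs) / rel.parent
    return {name: run_dir / name for name in COMPANION_FILES}
```

The function only builds paths; it does not check that the files exist, so call `Path.exists()` on the results if the clone may be partial.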
|
|
| ## Download example |
|
|
| ```python |
| from huggingface_hub import snapshot_download |
|
| # Download only the .pt checkpoints under lichess/2x_hk240 into ./outputs. |
| snapshot_download( |
|     repo_id="andrewkang12345/policyINR-checkpoints", |
|     repo_type="model", |
|     allow_patterns=["lichess/2x_hk240/**/*.pt"], |
|     local_dir="outputs", |
| ) |
| ``` |
|
|
| ## Domains |
|
|
| | domain | what it is | |
| |---|---| |
| `lichess` | Top-3 GM Lichess games; discrete UCI-move action |
| `droid` | DROID lowdim teleop; continuous action across collectors |
| `fastf1` | F1 stint telemetry; driver-as-policy |
| | `mujoco` | Custom MuJoCo + Minari baselines + state/action-resampled suites | |
| | `syntheticgrf` | Synthetic Gaussian random-field policies (sanity / smoke) | |
| `dmlabseekavoid` | DMLab `seekavoid_arena_01` from RL Unplugged; discrete action |
|
|
| ## Models in each run name |
|
|
| `<data>__<model>__<experiment>__s<seed>/` |
|
|
| - `cvae`: bag-of-pairs CVAE |
| - `inr_transformer_history_conditioned`: INR-Transformer with history |
| - `inr_diffusion_history_conditioned`: INR-Diffusion with history |
| - `inr_transformer_fitted_latent`: per-unit fitted latent codes |
| - `inr_transformer_infer_latent[_maml]`: meta-learned per-unit latent |
| (10-step inner adaptation with early stopping) |
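The run-name pattern above can be split back into its fields when scanning a download. This is a minimal sketch, not code from the source repo: `RunName` and `parse_run_name` are hypothetical helpers, and the sketch assumes that `__` appears only as the field separator (single underscores inside a field, as in the model names, are fine).

```python
from typing import NamedTuple


class RunName(NamedTuple):
    data: str
    model: str
    experiment: str
    seed: int


def parse_run_name(name: str) -> RunName:
    """Split '<data>__<model>__<experiment>__s<seed>' into its four fields."""
    # Tolerate a trailing slash, as in the pattern shown in this README.
    parts = name.rstrip("/").split("__")
    if len(parts) != 4:
        raise ValueError(f"expected 4 '__'-separated fields, got {len(parts)}: {name!r}")
    data, model, experiment, seed_tag = parts
    # The last field is the letter 's' followed by the integer seed.
    if not (seed_tag.startswith("s") and seed_tag[1:].isdigit()):
        raise ValueError(f"seed field should look like 's<int>', got {seed_tag!r}")
    return RunName(data, model, experiment, int(seed_tag[1:]))
```

For example, `parse_run_name("demo_data__cvae__demo_exp__s0")` (a made-up run name) yields a tuple whose `model` is `"cvae"` and whose `seed` is `0`, which makes it easy to group runs by model or seed.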
|
|
| ## License |
|
|
| MIT (mirrors the source repo). |
|
|