---
license: mit
library_name: pytorch
tags:
- representation-learning
- offline-rl
- policy-representation
- inr
- cvae
- meta-learning
- mujoco
- lichess
- droid
- fastf1
- dmlab
---
# policyINR Checkpoints
Companion checkpoint repository for **[andrewkang12345/policyINR](https://github.com/andrewkang12345/policyINR)**:
robust policy representation learning from offline data.
The folder layout mirrors the `outputs/` tree of the source repo:
```
<domain>/<suite>/<run>/best.pt
<domain>/<suite>/<run>/last.pt
```
where `<domain> ∈ {lichess, droid, fastf1, mujoco, syntheticgrf, dmlabseekavoid}`.
Every per-run directory on the GitHub side carries the matching `config.yaml`,
`metrics.jsonl`, `summary.json`, and `eval.json`. Clone the repo and the run
identifiers line up 1-to-1 with the paths here.
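
Because the two trees mirror each other, a checkpoint path here can be mapped to its companion files in a local clone. The helper below is a minimal sketch, not part of the source repo: `companion_paths` and the `clone_root` default are hypothetical names, and it assumes every checkpoint sits exactly three directories deep as in the layout above.

```python
from pathlib import Path, PurePosixPath

# File names this README lists for each per-run directory on GitHub.
COMPANION_FILES = ("config.yaml", "metrics.jsonl", "summary.json", "eval.json")


def companion_paths(checkpoint: str, clone_root: str = "policyINR") -> dict[str, Path]:
    """Map a checkpoint path in this repo to its companion files in a clone.

    `checkpoint` is relative to this repository, e.g.
    "<domain>/<suite>/<run>/best.pt". `clone_root` is a hypothetical
    location of the cloned GitHub repo (assumption, adjust as needed).
    """
    parts = PurePosixPath(checkpoint).parts
    if len(parts) != 4 or not parts[-1].endswith(".pt"):
        raise ValueError(
            f"expected <domain>/<suite>/<run>/<file>.pt, got {checkpoint!r}"
        )
    # The checkpoint layout mirrors the `outputs/` tree of the source repo.
    run_dir = Path(clone_root, "outputs", *parts[:3])
    return {name: run_dir / name for name in COMPANION_FILES}
```

Calling `companion_paths("<domain>/<suite>/<run>/best.pt")["config.yaml"]` then points at the config that produced that checkpoint, provided the clone lives at `clone_root`.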
## Download example
```python
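from huggingface_hub import snapshot_download

# Fetch only the .pt files under lichess/2x_hk240 into ./outputs,
# preserving the <domain>/<suite>/<run>/ layout.
snapshot_download(
    repo_id="andrewkang12345/policyINR-checkpoints",
    repo_type="model",
    allow_patterns=["lichess/2x_hk240/**/*.pt"],
    local_dir="outputs",
)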
```
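
To narrow a download further (for example, only `best.pt` for one suite), the pattern can be built programmatically and checked locally first. This is a sketch under assumptions: `checkpoint_patterns` and `dry_run` are hypothetical helpers, and the dry run assumes `allow_patterns` is matched with `fnmatch`-style globbing, which is worth confirming against the `huggingface_hub` version in use.

```python
from fnmatch import fnmatch


def checkpoint_patterns(domain: str, suite: str = "*", which: str = "best") -> list[str]:
    """Build an `allow_patterns` list for one domain and, optionally, one suite.

    `which` is "best", "last", or "*" for both, following the
    best.pt / last.pt layout of this repository.
    """
    return [f"{domain}/{suite}/*/{which}.pt"]


def dry_run(patterns: list[str], repo_files: list[str]) -> list[str]:
    """Return the files the patterns would select (assumes fnmatch semantics)."""
    return [f for f in repo_files if any(fnmatch(f, p) for p in patterns)]


patterns = checkpoint_patterns("lichess", suite="2x_hk240")
```

The resulting list can be passed as `allow_patterns=patterns` to the `snapshot_download` call above.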
## Domains
| domain | what it is |
|---|---|
| `lichess` | Top-3 GM Lichess games; discrete UCI-move action |
| `droid` | DROID lowdim teleop; continuous action across collectors |
| `fastf1` | F1 stint telemetry; driver-as-policy |
| `mujoco` | Custom MuJoCo + Minari baselines + state/action-resampled suites |
| `syntheticgrf` | Synthetic Gaussian random-field policies (sanity / smoke) |
| `dmlabseekavoid` | DMLab `seekavoid_arena_01` from RL Unplugged; discrete action |
## Models in each run name
Run directories are named `<data>__<model>__<experiment>__s<seed>/`, where `<model>` is one of:
- `cvae`: bag-of-pairs CVAE
- `inr_transformer_history_conditioned`: INR-Transformer with history
- `inr_diffusion_history_conditioned`: INR-Diffusion with history
- `inr_transformer_fitted_latent`: per-unit fitted latent codes
- `inr_transformer_infer_latent[_maml]`: meta-learned per-unit latent
  (10-step inner adapt + early stopping)
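
The run-name convention can be split back into its fields when indexing downloaded checkpoints. The parser below is a minimal sketch: `RunName` and `parse_run_name` are hypothetical names, and it assumes that no field contains a double underscore and that the seed is a non-negative integer.

```python
import re
from typing import NamedTuple


class RunName(NamedTuple):
    data: str
    model: str
    experiment: str
    seed: int


def parse_run_name(name: str) -> RunName:
    """Split `<data>__<model>__<experiment>__s<seed>` into its four fields."""
    # Fields are separated by double underscores; single underscores
    # inside a field (e.g. in model names) are left untouched.
    parts = name.rstrip("/").split("__")
    if len(parts) != 4 or not re.fullmatch(r"s\d+", parts[3]):
        raise ValueError(f"unexpected run name: {name!r}")
    data, model, experiment, seed = parts
    return RunName(data, model, experiment, int(seed[1:]))
```

For instance, a made-up name such as `mydata__cvae__myexp__s0/` parses to `data="mydata"`, `model="cvae"`, `experiment="myexp"`, `seed=0`.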
## License
MIT (mirrors the source repo).