---
license: mit
library_name: pytorch
tags:
  - representation-learning
  - offline-rl
  - policy-representation
  - inr
  - cvae
  - meta-learning
  - mujoco
  - lichess
  - droid
  - fastf1
  - dmlab
---

# policyINR — Checkpoints

Companion checkpoint repository for **[andrewkang12345/policyINR](https://github.com/andrewkang12345/policyINR)**:
robust policy representation learning from offline data.

The folder layout mirrors the `outputs/` tree of the source repo:

```
<domain>/<suite>/<run>/best.pt
                        last.pt
```

where `<domain> ∈ {lichess, droid, fastf1, mujoco, syntheticgrf,
dmlabseekavoid}`.
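
As a rough sketch, the layout above can be turned into glob patterns suitable for the `allow_patterns` argument of `snapshot_download`. The helper name `checkpoint_patterns` is hypothetical and not part of the source repo; the domain set is copied from the list above, and the patterns are assumed to be matched fnmatch-style.

```python
# Domains as listed in this README.
DOMAINS = {
    "lichess", "droid", "fastf1", "mujoco", "syntheticgrf", "dmlabseekavoid",
}


def checkpoint_patterns(domain, suite=None, which=("best.pt", "last.pt")):
    """Build glob patterns for <domain>/<suite>/<run>/<checkpoint file>.

    Hypothetical helper: `suite=None` matches every suite in the domain.
    """
    if domain not in DOMAINS:
        raise ValueError(f"unknown domain: {domain!r}")
    suite_part = "*" if suite is None else suite
    # One pattern per checkpoint file; "*" stands in for the run directory.
    return [f"{domain}/{suite_part}/*/{name}" for name in which]
```

For example, `checkpoint_patterns("lichess", "2x_hk240", which=("best.pt",))` yields a single pattern that selects only the best checkpoint of each run in that suite.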

Each run directory in the GitHub repo carries the matching `config.yaml`,
`metrics.jsonl`, `summary.json`, and `eval.json`. After cloning the repo, its
run identifiers line up 1-to-1 with the paths here.
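
A minimal sketch of that 1-to-1 mapping, assuming the checkpoints were downloaded into one directory and the GitHub repo was cloned next to it with its per-run files under `outputs/`. The function name `side_files` and the example paths are hypothetical.

```python
from pathlib import Path

# Per-run files that live in the GitHub repo, as listed above.
SIDE_FILES = ("config.yaml", "metrics.jsonl", "summary.json", "eval.json")


def side_files(checkpoint, downloads_root, repo_outputs):
    """Map a downloaded checkpoint to the matching per-run files in a clone.

    checkpoint     -- path to a downloaded best.pt or last.pt
    downloads_root -- directory the checkpoints were downloaded into
    repo_outputs   -- the outputs/ directory of the cloned source repo
    """
    # <domain>/<suite>/<run>, relative to the download root.
    rel_run = Path(checkpoint).parent.relative_to(downloads_root)
    return {name: Path(repo_outputs) / rel_run / name for name in SIDE_FILES}
```

The function only builds paths; it does not check that the files exist, so callers may want to test `path.exists()` before reading.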

## Download example

```python
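from huggingface_hub import snapshot_download

# Fetch only the .pt files of one suite (here lichess/2x_hk240) into ./outputs,
# so the local tree matches the outputs/ layout of the source repo.
snapshot_download(
    repo_id="andrewkang12345/policyINR-checkpoints",
    repo_type="model",
    allow_patterns=["lichess/2x_hk240/**/*.pt"],
    local_dir="outputs",
)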
```

## Domains

| domain | what it is |
|---|---|
| `lichess` | Top-3 GM Lichess games — discrete UCI-move action |
| `droid` | DROID lowdim teleop — continuous action across collectors |
| `fastf1` | F1 stint telemetry — driver-as-policy |
| `mujoco` | Custom MuJoCo + Minari baselines + state/action-resampled suites |
| `syntheticgrf` | Synthetic Gaussian random-field policies (sanity / smoke) |
| `dmlabseekavoid` | DMLab `seekavoid_arena_01` from RL Unplugged — discrete action |

## Models in each run name

Run directories are named `<data>__<model>__<experiment>__s<seed>/`, where
`<model>` is one of:

- `cvae` — bag-of-pairs CVAE
- `inr_transformer_history_conditioned` — INR-Transformer with history
- `inr_diffusion_history_conditioned` — INR-Diffusion with history
- `inr_transformer_fitted_latent` — per-unit fitted latent codes
- `inr_transformer_infer_latent[_maml]` — meta-learned per-unit latent
  (10-step inner-loop adaptation with early stopping)
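
The naming scheme above can be split back into its fields with a few lines of Python. This is a sketch that assumes no field contains a double underscore itself; `RunName` and `parse_run_name` are hypothetical names, and the run name in the usage note is made up for illustration.

```python
from typing import NamedTuple


class RunName(NamedTuple):
    data: str
    model: str
    experiment: str
    seed: int


def parse_run_name(name):
    """Split '<data>__<model>__<experiment>__s<seed>/' into its four fields."""
    parts = name.rstrip("/").split("__")
    if len(parts) != 4 or not parts[3].startswith("s"):
        raise ValueError(f"unexpected run name: {name!r}")
    data, model, experiment, seed = parts
    # Drop the leading 's'; int() raises ValueError if the rest is not a number.
    return RunName(data, model, experiment, int(seed[1:]))
```

For instance, `parse_run_name("mydata__cvae__baseline__s0/").model` gives `"cvae"`, which makes it easy to group downloaded runs by model family.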

## License

MIT (mirrors the source repo).