---
license: mit
tags:
- reinforcement-learning
- offline-rl
- flow-matching
- robotics
- jax
datasets:
- ogbench
- robomimic
---
# Flow Map Policies — Pretrained FMQ Checkpoints
Pretrained checkpoints for **"Aligning Flow Map Policies with Optimal Q-Guidance"**.
**Paper:** [arXiv:2605.12416](https://arxiv.org/abs/2605.12416)
**Code:** [github.com/christoszi/flow-map-policies](https://github.com/christoszi/flow-map-policies)
## Model Description
These are Flow Map Q-Guidance (FMQ) agents trained with offline-to-online RL. Each checkpoint contains a flow map policy fine-tuned online for 1M steps using critic-guided trust-region optimization.
## Checkpoints
12 environments × 5 random seeds = 60 checkpoints in total.
| Folder | Environment | Benchmark |
|--------|-------------|-----------|
| `checkpoints/ctrp4/` | cube-triple-play-singletask-task4-v0 | OGBench |
| `checkpoints/ctrp3/` | cube-triple-play-singletask-task3-v0 | OGBench |
| `checkpoints/cdp4/` | cube-double-play-singletask-task4-v0 | OGBench |
| `checkpoints/cdp3/` | cube-double-play-singletask-task3-v0 | OGBench |
| `checkpoints/sc4/` | scene-play-singletask-task4-v0 | OGBench |
| `checkpoints/sc5/` | scene-play-singletask-task5-v0 | OGBench |
| `checkpoints/ag4/` | antmaze-giant-navigate-singletask-task4-v0 | OGBench |
| `checkpoints/ag5/` | antmaze-giant-navigate-singletask-task5-v0 | OGBench |
| `checkpoints/hm3/` | humanoidmaze-medium-navigate-singletask-task3-v0 | OGBench |
| `checkpoints/hm4/` | humanoidmaze-medium-navigate-singletask-task4-v0 | OGBench |
| `checkpoints/can/` | can-mh-low_dim | RoboMimic |
| `checkpoints/square/` | square-mh-low_dim | RoboMimic |
## Usage
```bash
pip install huggingface_hub
python -c "from huggingface_hub import snapshot_download; snapshot_download('christoszi/flow-map-policies', local_dir='.')"
```
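If only some environments are needed, the download can be restricted to the matching folders. The sketch below uses the `allow_patterns` argument of `snapshot_download`. The helper names `checkpoint_patterns` and `download_checkpoints` are hypothetical and not part of the released code, and the folder layout is assumed to match the checkpoint table.

```python
REPO_ID = "christoszi/flow-map-policies"


def checkpoint_patterns(folders):
    """Build glob patterns matching every file under the given checkpoint folders."""
    return [f"checkpoints/{name}/*" for name in folders]


def download_checkpoints(folders, local_dir="."):
    """Download only the listed checkpoint folders (hypothetical helper)."""
    # Imported lazily so the pattern helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=checkpoint_patterns(folders),
        local_dir=local_dir,
    )


# Example (performs network access when uncommented):
# download_checkpoints(["ctrp4", "can"])
```

This keeps the same `local_dir='.'` layout as the full download, so the `checkpoints/...` paths used elsewhere in this card stay valid.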
Then evaluate a checkpoint, here seed 0 on `cube-triple-play-singletask-task4-v0`:
```bash
python main.py --config configs/config.yaml \
--eval_only --fmq_online \
--restore_path=checkpoints/ctrp4/params_online_sd000.pkl \
--env_name=cube-triple-play-singletask-task4-v0 --seed=0
```
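To evaluate other environments or seeds, the command above can be generated from the checkpoint table. The following is a minimal sketch under several assumptions: the file name pattern `params_online_sd{seed:03d}.pkl` is generalized from the single example `params_online_sd000.pkl`, seeds are assumed to be numbered 0 to 4, and the environment names in the table are assumed to be valid `--env_name` values for every benchmark. The helpers `checkpoint_path` and `eval_command` are hypothetical and not part of the repository.

```python
import shlex

# Folder -> environment name, copied from the checkpoint table.
ENVS = {
    "ctrp4": "cube-triple-play-singletask-task4-v0",
    "ctrp3": "cube-triple-play-singletask-task3-v0",
    "cdp4": "cube-double-play-singletask-task4-v0",
    "cdp3": "cube-double-play-singletask-task3-v0",
    "sc4": "scene-play-singletask-task4-v0",
    "sc5": "scene-play-singletask-task5-v0",
    "ag4": "antmaze-giant-navigate-singletask-task4-v0",
    "ag5": "antmaze-giant-navigate-singletask-task5-v0",
    "hm3": "humanoidmaze-medium-navigate-singletask-task3-v0",
    "hm4": "humanoidmaze-medium-navigate-singletask-task4-v0",
    "can": "can-mh-low_dim",
    "square": "square-mh-low_dim",
}


def checkpoint_path(folder, seed):
    # ASSUMPTION: naming pattern inferred from params_online_sd000.pkl.
    return f"checkpoints/{folder}/params_online_sd{seed:03d}.pkl"


def eval_command(folder, seed):
    """Return the evaluation command for one folder and seed as a shell string."""
    args = [
        "python", "main.py", "--config", "configs/config.yaml",
        "--eval_only", "--fmq_online",
        f"--restore_path={checkpoint_path(folder, seed)}",
        f"--env_name={ENVS[folder]}",
        f"--seed={seed}",
    ]
    return shlex.join(args)


# Print one command per assumed seed (0..4) for a single environment.
for seed in range(5):
    print(eval_command("ctrp4", seed))
```

Printing the commands rather than running them makes it easy to check the generated paths against the downloaded files before launching an evaluation.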
## Citation
```bibtex
@article{ziakas2026fmq,
title={Aligning Flow Map Policies with Optimal Q-Guidance},
author={Ziakas, Christos and Russo, Alessandra and Bose, Avishek Joey},
journal={arXiv preprint arXiv:2605.12416},
year={2026},
}
```