Octo ManiSkill RPD Weights

This repo contains the Octo weights used in Refined Policy Distillation (RPD). RPD distills vision-language-action models (VLAs) into small expert policies using online reinforcement learning.

Paper: Refined Policy Distillation: From VLA Generalists to RL Experts
Project Page: https://refined-policy-distillation.github.io
Code: https://github.com/Refined-Policy-Distillation/RPD

The dataset used to fine-tune this checkpoint can be found here.

Also check out the RPD OpenVLA weights.

Usage

Adapted from the Octo repo:

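import jax
from octo.model.octo_model import OctoModel

# Load the fine-tuned checkpoint from the Hugging Face Hub.
model = OctoModel.load_pretrained("hf://Juelg/octo-base-1.5-finetuned-maniskill")
# Encode the language instruction as a task.
task = model.create_tasks(texts=["pick the cube"])
# `observation` is not defined here; the caller must supply it in the format the model expects.
action = model.sample_actions(observation, task, rng=jax.random.PRNGKey(0))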
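
The usage snippet takes an `observation` argument without showing how to build one. The following is a minimal sketch of one way to do so, assuming the checkpoint accepts the observation layout used in the Octo repo's own inference example: an `image_primary` array with leading batch and time-horizon axes, plus a `timestep_pad_mask`. The helper name `make_observation`, the 256x256 resolution, and the window size of 1 are illustrative assumptions, not taken from the RPD code; `model.get_pretty_spec()` prints the keys and shapes this checkpoint actually expects.

```python
import numpy as np


def make_observation(frame: np.ndarray) -> dict:
    """Wrap one RGB frame of shape (H, W, 3) into an Octo-style observation dict.

    Hypothetical helper: the key names follow the Octo repo example and are
    assumed (not verified) to match this fine-tuned checkpoint.
    """
    if frame.ndim != 3 or frame.shape[-1] != 3:
        raise ValueError(f"expected an (H, W, 3) RGB frame, got shape {frame.shape}")
    # Add batch and time-horizon axes: (H, W, 3) -> (1, 1, H, W, 3).
    image = frame[np.newaxis, np.newaxis, ...]
    return {
        "image_primary": image,
        # A single real (non-padded) timestep in a window of length 1.
        "timestep_pad_mask": np.array([[True]]),
    }


# Placeholder frame; in practice this would be a camera image from the simulator.
frame = np.zeros((256, 256, 3), dtype=np.uint8)
observation = make_observation(frame)
```

The resulting dict can be passed as `observation` to `model.sample_actions` in the snippet above, provided the keys and image size match the model spec.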

For details on how Octo was used in RPD, check out the RPD code repo and the Agents library.

Citation

If you find RPD useful for your work, please consider citing it:

@inproceedings{juelg2025refinedpolicydistillationvla,
    title={{Refined Policy Distillation}: {F}rom {VLA} Generalists to {RL} Experts},
    author={Tobias Jülg and Wolfram Burgard and Florian Walter},
    year={2025},
    booktitle={Proc.~of the IEEE/RSJ Int.~Conf.~on Intelligent Robots and Systems (IROS)},
    note={Accepted for publication.}
}


Paper for Juelg/octo-base-1.5-finetuned-maniskill: Refined Policy Distillation: From VLA Generalists to RL Experts (arXiv:2503.05833, published Mar 6, 2025)