pi0.5-LoRA · Mug-into-Bowl (real-teleop, DROID warm start)

LoRA-finetuned pi0.5 for a single real-Franka pick-and-place task. Trained on 21 human teleop demos with language-conditioned action prediction.

  • Backbone: pi0.5 (3B, PaliGemma-2B + 300M action expert)
  • Warm start: gs://openpi-assets/checkpoints/pi05_droid/params — pi0.5 already SFT'd on the full DROID dataset (~75k real-teleop episodes)
  • LoRA variants: paligemma_variant="gemma_2b_lora" + action_expert_variant="gemma_300m_lora", backbone frozen via Pi0Config.get_freeze_filter()
  • Training objective: flow-matching action prediction, 16-step action chunks, action_dim=32 (padded)
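
To make the "16-step action chunks, action_dim=32 (padded)" note concrete, the following is a minimal sketch of how an 8D robot action chunk could be widened to the model's 32D action space and narrowed again after prediction. The helper names pad_chunk and unpad_chunk are hypothetical, and trailing zero-padding is an assumption; openpi's own transforms handle this step internally.

```python
# Sketch only: illustrates the shapes behind "16-step action chunks, action_dim=32 (padded)".
ACTION_HORIZON = 16    # steps per predicted chunk (from this card)
MODEL_ACTION_DIM = 32  # padded action width (from this card)
ROBOT_ACTION_DIM = 8   # 7 joint velocities + 1 gripper target (DROID schema)


def pad_chunk(chunk, target_dim=MODEL_ACTION_DIM):
    """Zero-pad each action in a chunk to target_dim (hypothetical helper)."""
    padded = []
    for action in chunk:
        if len(action) > target_dim:
            raise ValueError(f"action has {len(action)} dims, more than {target_dim}")
        padded.append(list(action) + [0.0] * (target_dim - len(action)))
    return padded


def unpad_chunk(chunk, robot_dim=ROBOT_ACTION_DIM):
    """Keep only the first robot_dim entries of each predicted action."""
    return [list(action[:robot_dim]) for action in chunk]


chunk = [[0.1] * 7 + [0.99] for _ in range(ACTION_HORIZON)]
padded = pad_chunk(chunk)
print(len(padded), len(padded[0]))   # 16 32
print(unpad_chunk(padded) == chunk)  # True
```

Only the first 8 of the 32 dimensions carry robot commands in this sketch; the remainder exist so the checkpoint keeps the action width of the pretrained model.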

Task

Language prompt used at training time (and required at inference):

"pick up the smallest mug and place it in the bowl"

Single-arm Franka with one exterior RGB camera + one wrist RGB camera.

Training

Field              Value
Config             pi05_droid_finetune_lora (custom, extends openpi's pi05_droid_finetune)
Dataset            IDEAS-Lab-Northwestern/real-mug-into-bowl-droid (private)
Episodes / frames  21 / 6603
FPS                15
Image res          320 × 180, center-cropped from 640 × 480 to 16:9
Cameras            exterior_image_1_left (main) + wrist_image_left. exterior_image_2_left filled with zeros (not used by DroidInputs)
Batch size         4
Steps kept         5000, 10000, 15000, 20000
EMA                disabled (LoRA)
Optimizer          AdamW, default openpi LR schedule
Hardware           1 × A100 80GB
Wall clock         ~ h
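
The image-resolution row can be checked with a little arithmetic. The sketch below computes the largest centered 16:9 crop of a 640 × 480 frame and the scale factor to 320 × 180. Treating the pipeline as "crop, then downscale" is an assumption (the card only states the source size, the aspect ratio, and the final size), and center_crop_box is a hypothetical helper, not part of the training code.

```python
def center_crop_box(width, height, aspect_w=16, aspect_h=9):
    """Return (left, top, right, bottom) of the largest centered crop with the given aspect ratio."""
    if width * aspect_h >= height * aspect_w:
        # Source is at least as wide as the target ratio: keep full height, trim the sides.
        crop_h = height
        crop_w = height * aspect_w // aspect_h
    else:
        # Source is taller than the target ratio: keep full width, trim top and bottom.
        crop_w = width
        crop_h = width * aspect_h // aspect_w
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


box = center_crop_box(640, 480)
crop_w, crop_h = box[2] - box[0], box[3] - box[1]
print(box)                           # (0, 60, 640, 420)
print(crop_w / 320, crop_h / 180)    # 2.0 2.0
```

Under this reading, 60 rows are trimmed from the top and bottom of each frame, and the 640 × 360 crop is downscaled by exactly 2 in both directions.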

Action & state convention (DROID schema)

State (8D, stored as separate observation.* columns):

  • joint_position (7) — Franka joint angles, rad
  • gripper_position (1) — normalized 0 (open) – 1 (closed)

Action (8D, single actions column):

  • [0:7] joint velocity, rad/s
  • [7] target gripper position, normalized 0–1 — binarized to {~0.05, ~0.99} in demos due to bimodal teleop behavior
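
The layout above can be sketched as two small helpers that assemble the 8D state and split an 8D action. The function names and the 0.5 open/closed threshold are assumptions for illustration; only the dimensions, units, and the 0 (open) to 1 (closed) convention come from this card.

```python
def make_state(joint_position, gripper_position):
    """Concatenate 7 joint angles (rad) and 1 normalized gripper position into an 8D state."""
    if len(joint_position) != 7:
        raise ValueError(f"expected 7 joint angles, got {len(joint_position)}")
    if not 0.0 <= gripper_position <= 1.0:
        raise ValueError("gripper_position must be normalized to [0, 1]")
    return [float(q) for q in joint_position] + [float(gripper_position)]


def split_action(action):
    """Split an 8D action into 7 joint velocities (rad/s) and a gripper target in [0, 1]."""
    if len(action) != 8:
        raise ValueError(f"expected an 8D action, got {len(action)}")
    return [float(v) for v in action[:7]], float(action[7])


def gripper_mode(target, threshold=0.5):
    """Map a gripper target to 'open' or 'closed'; the 0.5 threshold is an assumption."""
    return "closed" if target >= threshold else "open"


state = make_state([0.0, -0.5, 0.0, -2.0, 0.0, 1.5, 0.8], 0.05)
velocities, gripper = split_action([0.1] * 7 + [0.99])
print(len(state), len(velocities), gripper_mode(gripper))  # 8 7 closed
```

Because the demo gripper targets cluster near 0.05 and 0.99, any threshold well inside that gap separates the two modes.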

Norm stats: reused from openpi's official DROID pretraining (gs://openpi-assets/checkpoints/pi05_droid/assets/droid). Do not recompute them on this small dataset, or the statistics will overfit to its 21 episodes.
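
A small numeric sketch shows why swapping in statistics from a narrow dataset is risky. It assumes a quantile-style normalization that maps the 1st and 99th percentiles to -1 and 1; this card does not state which normalization the config uses, and the quantile values below are made-up illustrative numbers, not the real DROID statistics.

```python
def normalize_quantile(x, q01, q99, eps=1e-6):
    """Map [q01, q99] to approximately [-1, 1] (assumed normalization form)."""
    return (x - q01) / (q99 - q01 + eps) * 2.0 - 1.0


raw_velocity = 0.4  # rad/s, an arbitrary example command

# Illustrative numbers only: wide stats stand in for a large pretraining set,
# narrow stats stand in for stats recomputed on a handful of slow demos.
wide = normalize_quantile(raw_velocity, q01=-1.0, q99=1.0)
narrow = normalize_quantile(raw_velocity, q01=-0.2, q99=0.2)

print(round(wide, 3), round(narrow, 3))  # 0.4 2.0
```

The same physical command lands at very different normalized values, so a checkpoint warm-started under one set of statistics would see shifted inputs and targets under the other.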

Repo layout

5000/   {params/, assets/, metrics/}
10000/  {params/, assets/, metrics/}
15000/  {params/, assets/, metrics/}
20000/  {params/, assets/, metrics/}

Each step is a self-contained orbax checkpoint. train_state/ (optimizer states) is intentionally omitted — these ckpts are for inference and LoRA-delta reuse, not for resuming training.
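
After downloading, the layout above can be sanity-checked with a short script. This is a minimal sketch: missing_subdirs is a hypothetical helper, and the local path is the one used in the download command in the Usage section.

```python
from pathlib import Path

# Subdirectories listed for every step in the repo layout.
EXPECTED_SUBDIRS = ("params", "assets", "metrics")


def missing_subdirs(step_dir, expected=EXPECTED_SUBDIRS):
    """Return the expected subdirectories that are absent from a step directory."""
    step_dir = Path(step_dir)
    return [name for name in expected if not (step_dir / name).is_dir()]


step_dir = "./pi05-mug-into-bowl-droid-lora/20000"
missing = missing_subdirs(step_dir)
if missing:
    print(f"{step_dir}: missing {missing}")
else:
    print(f"{step_dir}: layout looks complete")
```

The check does not look for train_state/, since that directory is deliberately not shipped.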

Usage

Download a single step

HF_HUB_DISABLE_XET=1 huggingface-cli download \
  IDEAS-Lab-Northwestern/pi05-mug-into-bowl-droid-lora \
  --include "20000/**" \
  --local-dir ./pi05-mug-into-bowl-droid-lora
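
The same download can be scripted from Python. The sketch below uses huggingface_hub.snapshot_download with allow_patterns to fetch one step; download_step and step_patterns are hypothetical wrapper names, and the repository is private, so valid credentials are assumed to be configured.

```python
import os

# Mirror the HF_HUB_DISABLE_XET=1 prefix of the CLI command; set before huggingface_hub is imported.
os.environ.setdefault("HF_HUB_DISABLE_XET", "1")

REPO_ID = "IDEAS-Lab-Northwestern/pi05-mug-into-bowl-droid-lora"
STEPS_KEPT = (5000, 10000, 15000, 20000)


def step_patterns(step):
    """Return the glob patterns that select a single checkpoint step."""
    if step not in STEPS_KEPT:
        raise ValueError(f"step {step} is not one of {STEPS_KEPT}")
    return [f"{step}/**"]


def download_step(step, local_dir="./pi05-mug-into-bowl-droid-lora"):
    """Download one step directory and return the local snapshot path."""
    # Imported lazily so this sketch can be loaded without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=step_patterns(step),
        local_dir=local_dir,
    )


print(step_patterns(20000))  # ['20000/**']
# download_step(20000)  # uncomment to fetch; requires network access and repo permissions
```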

Serve via openpi

The training config must be registered in the consuming openpi repo (see our internal doc docs/openpi_real_teleop_sft.md §5 for the TrainConfig definition).

uv run scripts/serve_policy.py \
  --policy.config=pi05_droid_finetune_lora \
  --policy.dir=./pi05-mug-into-bowl-droid-lora/20000

Then connect your real-Franka WebSocket client and send the task prompt above.
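
A client request might look like the sketch below. The observation key names follow openpi's DROID example client and are an assumption for this custom config; the host, the port, and the build_request and query_policy helpers are also hypothetical. Image preprocessing is left to the caller.

```python
PROMPT = "pick up the smallest mug and place it in the bowl"


def build_request(exterior_image, wrist_image, joint_position, gripper_position):
    """Assemble one observation dict; key names are assumed from openpi's DROID example."""
    if len(joint_position) != 7:
        raise ValueError(f"expected 7 joint angles, got {len(joint_position)}")
    return {
        "observation/exterior_image_1_left": exterior_image,
        "observation/wrist_image_left": wrist_image,
        "observation/joint_position": joint_position,
        "observation/gripper_position": gripper_position,
        "prompt": PROMPT,
    }


def query_policy(request, host="localhost", port=8000):
    """Send one request to a running serve_policy.py instance and return the action chunk."""
    # Imported lazily so this sketch can be loaded without openpi_client installed.
    from openpi_client import websocket_client_policy

    client = websocket_client_policy.WebsocketClientPolicy(host, port)
    return client.infer(request)["actions"]


request = build_request(None, None, [0.0] * 7, [0.0])  # None stands in for real camera frames
print(sorted(request))
# actions = query_policy(request)  # requires a running policy server and real image arrays
```

In a real control loop the client would be created once and reused across steps rather than reconnected per request.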

Intended use

  • Evaluating pi0.5 generalization / LoRA sufficiency on a narrow single-task real-teleop distribution.
  • Qualitative comparison against pi05-mug-into-bowl-base-lora (same LoRA setup, warm-started from pi05_base instead of pi05_droid).

Limitations & known caveats

  • Single task, tiny dataset. 21 episodes is small; the policy is unlikely to generalize off-distribution (different lighting, mug shapes, bowl positions outside demo range).
  • DROID + LoRA is not officially recommended: openpi's own README notes LoRA hasn't matched full-finetune performance on DROID. This release is a pragmatic compromise for consumer-GPU eval, not a state-of-the-art training recipe.
  • exterior_image_2_left is zero. DroidInputs reads only exterior_1 and wrist at inference, so this is fine for serving — but don't try to re-use this ckpt with any policy transform that expects two real exterior views.
  • Cartesian pose is unused. The DROID action space here is joint velocity + gripper, so no axis-angle or other rotation representation is involved, and observation.cartesian_position is ignored. Don't feed Cartesian deltas.
  • Control rate is fixed at 15 Hz. This matches DROID pretraining (also 15 Hz), so there is no FPS mismatch with the warm-start checkpoint. If you change the teleop rate for future data, you must re-train.
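
The rate dependence can be illustrated with simple arithmetic: joint-velocity actions only translate into motion through the control period, so the same predicted chunk covers a different time span and displacement at a different rate. The sketch below assumes plain Euler integration and uses hypothetical helper names; it is not the robot controller.

```python
FPS = 15             # control and data rate (from this card)
ACTION_HORIZON = 16  # steps per action chunk (from this card)


def chunk_duration_s(horizon=ACTION_HORIZON, fps=FPS):
    """Wall-clock time covered by one action chunk."""
    return horizon / fps


def integrate_joint_velocity(q, qdot, fps=FPS):
    """One Euler step: joint angles (rad) after applying velocities (rad/s) for one control period."""
    dt = 1.0 / fps
    return [qi + vi * dt for qi, vi in zip(q, qdot)]


print(round(chunk_duration_s(), 3))          # 1.067
step_15hz = integrate_joint_velocity([0.0] * 7, [0.15] * 7, fps=15)
step_30hz = integrate_joint_velocity([0.0] * 7, [0.15] * 7, fps=30)
print(round(step_15hz[0], 4), round(step_30hz[0], 4))  # 0.01 0.005
```

At 30 Hz the same per-step command would move each joint half as far, which is why data collected at another rate calls for re-training.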

Citation / provenance

Trained with openpi by the IDEAS Lab @ Northwestern (internal use). Data collection: 2026-04, single-operator teleop with an SO-ARM leader.
