pi0.5-LoRA · Mug-into-Bowl (real-teleop, DROID warm start)

LoRA-finetuned pi0.5 for a single real-Franka pick-and-place task. Trained on 21 human teleop demos with language-conditioned action prediction.

Backbone: pi0.5 (3B, PaliGemma-2B + 300M action expert)
Warm start: gs://openpi-assets/checkpoints/pi05_droid/params — pi0.5 already SFT'd on the full DROID dataset (~75k real-teleop episodes)
LoRA variants: paligemma_variant="gemma_2b_lora" + action_expert_variant="gemma_300m_lora", backbone frozen via Pi0Config.get_freeze_filter()
Training objective: flow-matching action prediction, 16-step action chunks, action_dim=32 (padded)
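
The padded action dimension can be illustrated with a minimal sketch. It assumes the 8D robot action is zero-padded on the right to the model's 32 dimensions and sliced back after inference; the helper names `pad_action` and `unpad_action` are hypothetical and not part of openpi.

```python
ACTION_DIM_MODEL = 32  # padded action dimension stated in this card
ACTION_DIM_ROBOT = 8   # 7 joint velocities + 1 gripper target


def pad_action(action, target_dim=ACTION_DIM_MODEL):
    """Zero-pad one robot action up to the model's action dimension (assumed scheme)."""
    if len(action) > target_dim:
        raise ValueError(f"action has {len(action)} dims, more than {target_dim}")
    return list(action) + [0.0] * (target_dim - len(action))


def unpad_action(padded, robot_dim=ACTION_DIM_ROBOT):
    """Keep only the leading robot dimensions of a padded model action."""
    return list(padded[:robot_dim])


robot_action = [0.1, -0.2, 0.0, 0.3, 0.0, -0.1, 0.05, 0.99]
padded = pad_action(robot_action)
restored = unpad_action(padded)
```

Only the first 8 entries of each predicted action carry meaning for this robot; the remaining 24 are padding.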

Task

Language prompt used at training time (and required at inference):

"pick up the smallest mug and place it in the bowl"

Single-arm Franka with one exterior RGB camera + one wrist RGB camera.

Training

| Field | Value |
| --- | --- |
| Config | `pi05_droid_finetune_lora` (custom, extends openpi's `pi05_droid_finetune`) |
| Dataset | `IDEAS-Lab-Northwestern/real-mug-into-bowl-droid` (private) |
| Episodes / frames | 21 / 6603 |
| FPS | 15 |
| Image res | 320 × 180, center-cropped from 640 × 480 to 16:9 |
| Cameras | `exterior_image_1_left` (main) + `wrist_image_left`. `exterior_image_2_left` filled with zeros (not used by `DroidInputs`) |
| Batch size | 4 |
| Steps kept | 5000, 10000, 15000, 20000 |
| EMA | disabled (LoRA) |
| Optimizer | AdamW, default openpi LR schedule |
| Hardware | 1 × A100 80GB |
| Wall clock | ~ h |
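
A few numbers follow directly from the values above and can be checked with a short sketch. The assumptions are labeled in the comments: one frame is treated as one training sample, and the 16-step chunk length is taken from the training objective. The 16:9 center crop of a 640 × 480 frame is 640 × 360, exactly twice 320 × 180 per side, which is consistent with a crop followed by a 2× downscale; the downscale step itself is an inference, not something stated here.

```python
def center_crop_box(width, height, aspect_w=16, aspect_h=9):
    """Return (left, top, right, bottom) of the largest centered crop with the given aspect."""
    if width * aspect_h >= height * aspect_w:
        # Source is at least as wide as the target aspect: keep full height, trim width.
        crop_h = height
        crop_w = height * aspect_w // aspect_h
    else:
        # Source is taller than the target aspect: keep full width, trim height.
        crop_w = width
        crop_h = width * aspect_h // aspect_w
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


EPISODES, FRAMES, FPS = 21, 6603, 15
BATCH_SIZE, FINAL_STEP, CHUNK_STEPS = 4, 20000, 16

box = center_crop_box(640, 480)                       # (0, 60, 640, 420), i.e. 640 x 360
total_seconds = FRAMES / FPS                          # total demo time in seconds
mean_episode_seconds = FRAMES / EPISODES / FPS        # average demo length
passes = FINAL_STEP * BATCH_SIZE / FRAMES             # assumes one frame == one sample
chunk_seconds = CHUNK_STEPS / FPS                     # time covered by one action chunk
```

Under these assumptions the whole dataset is about 7.3 minutes of teleop, the final checkpoint has seen each frame roughly 12 times, and one action chunk spans a little over one second.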

Action & state convention (DROID schema)

State (8D, stored as separate observation.* columns):

joint_position (7) — Franka joint angles, rad
gripper_position (1) — normalized 0 (open) – 1 (closed)

Action (8D, single actions column):

[0:7] joint velocity, rad/s
[7] target gripper position, normalized 0–1 — binarized to {~0.05, ~0.99} in demos due to bimodal teleop behavior
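
The action layout above can be sketched as a small decoding helper. The function names and the 0.5 threshold are illustrative assumptions; the card only states that the gripper targets in the demos cluster near 0.05 and 0.99.

```python
def split_action(action):
    """Split one 8D action into (joint velocities in rad/s, gripper target in [0, 1])."""
    if len(action) != 8:
        raise ValueError(f"expected an 8D action, got {len(action)} dims")
    return list(action[:7]), float(action[7])


def gripper_command(target, threshold=0.5):
    """Map a gripper target to a discrete command; 0 is open and 1 is closed.

    The 0.5 threshold is an assumption chosen to sit between the two demo modes.
    """
    return "close" if target >= threshold else "open"


joint_velocity, gripper_target = split_action(
    [0.1, -0.2, 0.0, 0.3, 0.0, -0.1, 0.05, 0.99]
)
command = gripper_command(gripper_target)
```

Because the demo gripper targets are effectively binary, a threshold on the predicted value is one simple way to drive a gripper that only accepts open/close commands.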

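Norm stats: reused from openpi's official DROID pretraining (gs://openpi-assets/checkpoints/pi05_droid/assets/droid). Do not recompute them on this dataset: statistics estimated from 21 episodes would overfit to the demos and would no longer match the normalization the warm-start checkpoint was trained with.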
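
To see why statistics from a small dataset are risky, here is a minimal sketch that assumes a quantile-style normalization mapping the 1st and 99th percentiles to [-1, 1]. Whether this config uses quantile or z-score normalization is not stated in this card, and all numbers below are made up for illustration.

```python
def quantile_normalize(x, q01, q99, eps=1e-6):
    """Map the [q01, q99] range to roughly [-1, 1] (assumed normalization form)."""
    return (x - q01) / (q99 - q01 + eps) * 2.0 - 1.0


# Illustrative joint-velocity value and two hypothetical sets of statistics.
velocity = 0.5

# Broad statistics, as one might expect from a large and diverse dataset.
wide = quantile_normalize(velocity, q01=-1.0, q99=1.0)

# Narrow statistics, as one might get from a handful of slow demos.
narrow = quantile_normalize(velocity, q01=-0.2, q99=0.2)
```

With the broad statistics the value stays well inside [-1, 1]; with the narrow ones the same velocity lands near 2.5, far outside the range the pretrained model has seen.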

Repo layout

5000/   {params/, assets/, metrics/}
10000/  {params/, assets/, metrics/}
15000/  {params/, assets/, metrics/}
20000/  {params/, assets/, metrics/}

Each step is a self-contained orbax checkpoint. train_state/ (optimizer states) is intentionally omitted — these ckpts are for inference and LoRA-delta reuse, not for resuming training.
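
A downloaded step directory can be sanity-checked against the layout above with a short sketch. The helper `check_step_dir` is hypothetical; it only inspects directory names and does not load the orbax checkpoint.

```python
import tempfile
from pathlib import Path

EXPECTED_SUBDIRS = ("params", "assets", "metrics")


def check_step_dir(step_dir):
    """Return a list of layout problems for one step directory (empty if it looks fine)."""
    step_dir = Path(step_dir)
    problems = [
        f"missing {name}/" for name in EXPECTED_SUBDIRS if not (step_dir / name).is_dir()
    ]
    if (step_dir / "train_state").is_dir():
        # This release omits optimizer state, so its presence suggests a different source.
        problems.append("unexpected train_state/")
    return problems


# Demo on a throwaway directory that mimics the published layout.
with tempfile.TemporaryDirectory() as root:
    step = Path(root) / "20000"
    for name in EXPECTED_SUBDIRS:
        (step / name).mkdir(parents=True)
    complete_report = check_step_dir(step)
    (step / "metrics").rmdir()
    incomplete_report = check_step_dir(step)
```

In practice one would call `check_step_dir("./pi05-mug-into-bowl-droid-lora/20000")` after downloading and before pointing the policy server at the directory.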

Usage

Download a single step

HF_HUB_DISABLE_XET=1 huggingface-cli download \
  IDEAS-Lab-Northwestern/pi05-mug-into-bowl-droid-lora \
  --include "20000/**" \
  --local-dir ./pi05-mug-into-bowl-droid-lora
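
The same download can be sketched from Python with `huggingface_hub.snapshot_download`, using `allow_patterns` in place of `--include`. The helper names are hypothetical, and the call needs `huggingface_hub` installed plus whatever access the repository requires.

```python
import os

REPO_ID = "IDEAS-Lab-Northwestern/pi05-mug-into-bowl-droid-lora"


def step_pattern(step):
    """Glob pattern selecting every file under one checkpoint step directory."""
    return f"{int(step)}/**"


def download_step(step, local_dir="./pi05-mug-into-bowl-droid-lora"):
    """Download a single checkpoint step and return the local snapshot path."""
    # Mirror the CLI invocation; set before huggingface_hub is first imported.
    os.environ.setdefault("HF_HUB_DISABLE_XET", "1")
    from huggingface_hub import snapshot_download  # imported lazily on purpose

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=[step_pattern(step)],
        local_dir=local_dir,
    )


pattern = step_pattern(20000)
```

Calling `download_step(20000)` would then populate `./pi05-mug-into-bowl-droid-lora/20000`, the directory used when serving.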

Serve via openpi

The training config must be registered in the consuming openpi repo (see our internal doc docs/openpi_real_teleop_sft.md §5 for the TrainConfig definition).

uv run scripts/serve_policy.py \
  --policy.config=pi05_droid_finetune_lora \
  --policy.dir=./pi05-mug-into-bowl-droid-lora/20000

Then connect your real-Franka WebSocket client and send the task prompt above.
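
As a rough sketch of what the client sends, the helper below assembles one observation. The dictionary keys follow openpi's DROID example inputs as far as we know and should be treated as an assumption to verify against your openpi checkout; `build_observation` itself is hypothetical, and the images are placeholders.

```python
TASK_PROMPT = "pick up the smallest mug and place it in the bowl"


def build_observation(exterior_image, wrist_image, joint_position,
                      gripper_position, prompt=TASK_PROMPT):
    """Assemble one DROID-style observation dict (key names are an assumption)."""
    if len(joint_position) != 7:
        raise ValueError(f"expected 7 joint angles, got {len(joint_position)}")
    return {
        "observation/exterior_image_1_left": exterior_image,
        "observation/wrist_image_left": wrist_image,
        "observation/joint_position": list(joint_position),   # rad
        "observation/gripper_position": [float(gripper_position)],  # 0 open, 1 closed
        "prompt": prompt,
    }


# Placeholder 180 x 320 RGB frames; a real client would pass camera images.
blank_frame = [[[0, 0, 0]] * 320 for _ in range(180)]
obs = build_observation(blank_frame, blank_frame, [0.0] * 7, 0.0)

# Sending it, assuming the openpi_client package and a server on its default port:
#   from openpi_client import websocket_client_policy
#   client = websocket_client_policy.WebsocketClientPolicy(host="localhost", port=8000)
#   actions = client.infer(obs)["actions"]
```

The prompt defaults to the exact training string, since the policy was trained on that one instruction only.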

Intended use

Evaluating pi0.5 generalization / LoRA sufficiency on a narrow single-task real-teleop distribution.
Qualitative comparison against pi05-mug-into-bowl-base-lora (same LoRA setup, warm-started from pi05_base instead of pi05_droid).

Limitations & known caveats

Single task, tiny dataset. 21 episodes is small; the policy is unlikely to generalize off-distribution (different lighting, mug shapes, bowl positions outside demo range).
DROID + LoRA is not officially recommended: openpi's own README notes LoRA hasn't matched full-finetune performance on DROID. This release is a pragmatic compromise for consumer-GPU eval, not a SOTA training recipe.
exterior_image_2_left is zero. DroidInputs reads only exterior_1 and wrist at inference, so this is fine for serving — but don't try to re-use this ckpt with any policy transform that expects two real exterior views.
Cartesian pose is unused. The DROID action space here is joint velocity + gripper, so no axis-angle or other rotation representation is involved, and observation.cartesian_position is not read. Don't feed cartesian deltas.
Control rate: the demos were recorded at 15 Hz, which matches DROID pretraining (also 15 Hz), so there is no FPS mismatch in this release. If you change the teleop rate for future data, you must re-train.

Citation / provenance

Trained with openpi by the IDEAS Lab @ Northwestern (internal use). Data collection: 2026-04, single-operator teleop with an SO-ARM leader.
