pi0.5-LoRA · Mug-into-Bowl (real-teleop, DROID warm start)
LoRA-finetuned pi0.5 for a single real-Franka pick-and-place task. Trained on 21 human teleop demos with language-conditioned action prediction.
- Backbone: pi0.5 (3B, PaliGemma-2B + 300M action expert)
- Warm start: gs://openpi-assets/checkpoints/pi05_droid/params (pi0.5 already SFT'd on the full DROID dataset, ~75k real-teleop episodes)
- LoRA variants: paligemma_variant="gemma_2b_lora" + action_expert_variant="gemma_300m_lora", backbone frozen via Pi0Config.get_freeze_filter()
- Training objective: flow-matching action prediction, 16-step action chunks, action_dim=32 (padded)
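The chunking and padding in the last bullet can be illustrated with a minimal NumPy sketch. It is not openpi's data pipeline: the helper names `pad_actions` and `make_chunk` are hypothetical, and repeating the final action past the end of an episode is an assumption made only so the example is self-contained.

```python
import numpy as np

ACTION_DIM = 32  # padded action width stated above
HORIZON = 16     # action-chunk length stated above
ROBOT_DIM = 8    # 7 joint velocities + 1 gripper target (DROID schema)


def pad_actions(actions: np.ndarray, target_dim: int = ACTION_DIM) -> np.ndarray:
    """Zero-pad the last axis of `actions` up to `target_dim`."""
    pad = target_dim - actions.shape[-1]
    if pad < 0:
        raise ValueError("actions are wider than the padded action_dim")
    widths = [(0, 0)] * (actions.ndim - 1) + [(0, pad)]
    return np.pad(actions, widths)


def make_chunk(episode_actions: np.ndarray, t: int, horizon: int = HORIZON) -> np.ndarray:
    """Return `horizon` actions starting at step t.

    Assumption: indices past the episode end repeat the final action.
    """
    idx = np.minimum(np.arange(t, t + horizon), len(episode_actions) - 1)
    return episode_actions[idx]


# Synthetic 40-step episode, used only to show the shapes.
episode = np.random.default_rng(0).normal(size=(40, ROBOT_DIM)).astype(np.float32)
chunk = pad_actions(make_chunk(episode, t=30))
print(chunk.shape)  # (16, 32)
```

Only the first 8 of the 32 action dimensions carry robot commands in this sketch; the remaining 24 are zero padding.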
Task
Language prompt used at training time (and required at inference):
"pick up the smallest mug and place it in the bowl"
Single-arm Franka with one exterior RGB camera + one wrist RGB camera.
Training
| Field | Value |
|---|---|
| Config | pi05_droid_finetune_lora (custom, extends openpi's pi05_droid_finetune) |
| Dataset | IDEAS-Lab-Northwestern/real-mug-into-bowl-droid (private) |
| Episodes / frames | 21 / 6603 |
| FPS | 15 |
| Image res | 320 × 180; frames center-cropped from 640 × 480 to 16:9 (640 × 360), then downscaled |
| Cameras | exterior_image_1_left (main) + wrist_image_left. exterior_image_2_left filled with zeros (not used by DroidInputs) |
| Batch size | 4 |
| Steps kept | 5000, 10000, 15000, 20000 |
| EMA | disabled (LoRA) |
| Optimizer | AdamW, default openpi LR schedule |
| Hardware | 1 × A100 80GB |
| Wall clock | ~ h |
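The image-resolution row implies a crop followed by a downscale. The sketch below works through that geometry with NumPy; it is an illustration, not the preprocessing code used for this dataset. The helper names are hypothetical, and the strided downsample is a stand-in for whatever resize filter was actually used.

```python
import numpy as np


def center_crop_box(width: int, height: int, aspect_w: int = 16, aspect_h: int = 9):
    """Return (left, top, crop_w, crop_h) of the largest centered box with the given aspect ratio."""
    if width * aspect_h >= height * aspect_w:
        # Source is wider than the target: keep full height, trim the sides.
        crop_h = height
        crop_w = height * aspect_w // aspect_h
    else:
        # Source is taller than the target: keep full width, trim top and bottom.
        crop_w = width
        crop_h = width * aspect_h // aspect_w
    return (width - crop_w) // 2, (height - crop_h) // 2, crop_w, crop_h


def crop_and_downsample(frame: np.ndarray, out_w: int = 320, out_h: int = 180) -> np.ndarray:
    """Center-crop to 16:9, then downsample by an integer stride (stand-in for a real resize)."""
    h, w = frame.shape[:2]
    left, top, cw, ch = center_crop_box(w, h)
    cropped = frame[top:top + ch, left:left + cw]
    return cropped[::ch // out_h, ::cw // out_w][:out_h, :out_w]


frame = np.zeros((480, 640, 3), dtype=np.uint8)  # placeholder camera frame
print(center_crop_box(640, 480))         # (0, 60, 640, 360)
print(crop_and_downsample(frame).shape)  # (180, 320, 3)
```

For a 640 × 480 source the crop removes 60 rows from the top and 60 from the bottom, and the 640 × 360 result maps to 320 × 180 by a factor of 2 on each axis.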
Action & state convention (DROID schema)
State (8D, stored as separate observation.* columns):
- joint_position (7): Franka joint angles, rad
- gripper_position (1): normalized 0 (open) – 1 (closed)
Action (8D, single actions column):
- [0:7] joint velocity, rad/s
- [7] target gripper position, normalized 0–1; binarized to {~0.05, ~0.99} in demos due to bimodal teleop behavior
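As a sketch of the layout above, the helpers below assemble an 8D state and split an 8D action. All function names are hypothetical, and the 0.5 threshold in `snap_gripper` is an assumption chosen for illustration; the card only states that demo gripper targets cluster near 0.05 and 0.99.

```python
import numpy as np

GRIPPER_OPEN, GRIPPER_CLOSED = 0.05, 0.99  # approximate demo values quoted above


def make_state(joint_position, gripper_position) -> np.ndarray:
    """Concatenate 7 joint angles (rad) and 1 gripper position into an 8D state."""
    joints = np.asarray(joint_position, dtype=np.float32).reshape(7)
    gripper = np.asarray(gripper_position, dtype=np.float32).reshape(1)
    return np.concatenate([joints, gripper])


def split_action(action):
    """Split an 8D action into (joint velocity [rad/s], gripper target in [0, 1])."""
    action = np.asarray(action, dtype=np.float32)
    if action.shape[-1] != 8:
        raise ValueError(f"expected 8D action, got {action.shape[-1]}D")
    return action[..., 0:7], action[..., 7]


def snap_gripper(target: float, threshold: float = 0.5) -> float:
    """Snap a gripper target to the nearer demo value (threshold is an assumption)."""
    return GRIPPER_CLOSED if target >= threshold else GRIPPER_OPEN


state = make_state(np.zeros(7), 0.0)
joint_vel, grip = split_action([0.1] * 7 + [0.93])
print(state.shape, joint_vel.shape, snap_gripper(float(grip)))  # (8,) (7,) 0.99
```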
Norm stats: reused from openpi's official DROID pretraining
(gs://openpi-assets/checkpoints/pi05_droid/assets/droid). Do not recompute them
on this small dataset; statistics estimated from 21 episodes would overfit.
Repo layout
5000/ {params/, assets/, metrics/}
10000/ {params/, assets/, metrics/}
15000/ {params/, assets/, metrics/}
20000/ {params/, assets/, metrics/}
Each step is a self-contained orbax checkpoint. train_state/ (optimizer
states) is intentionally omitted — these ckpts are for inference and
LoRA-delta reuse, not for resuming training.
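After downloading, a quick way to confirm that each step directory has the layout shown above is to scan for the required subdirectories. This is a minimal sketch using only the standard library; `missing_by_step` is a hypothetical helper, and treating `params/` and `assets/` as the inference-critical directories is an assumption.

```python
import tempfile
from pathlib import Path

REQUIRED = ("params", "assets")  # assumed to be what inference needs


def missing_by_step(root: str) -> dict[int, list[str]]:
    """Map each numeric step directory under `root` to its missing required subdirectories."""
    report = {}
    for step_dir in Path(root).iterdir():
        if step_dir.is_dir() and step_dir.name.isdigit():
            report[int(step_dir.name)] = [n for n in REQUIRED if not (step_dir / n).is_dir()]
    return dict(sorted(report.items()))


# Demo on a throwaway directory tree: step 10000 deliberately lacks assets/.
with tempfile.TemporaryDirectory() as tmp:
    for sub in ("5000/params", "5000/assets", "5000/metrics", "10000/params"):
        (Path(tmp) / sub).mkdir(parents=True)
    report = missing_by_step(tmp)

print(report)  # {5000: [], 10000: ['assets']}
```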
Usage
Download a single step
HF_HUB_DISABLE_XET=1 huggingface-cli download \
IDEAS-Lab-Northwestern/pi05-mug-into-bowl-droid-lora \
--include "20000/**" \
--local-dir ./pi05-mug-into-bowl-droid-lora
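The same download can be scripted from Python. The sketch below assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function, with `allow_patterns` playing the role of `--include`. The helpers `download_kwargs` and `download_step` are hypothetical names, and the call needs network access plus a token if the repo requires authentication.

```python
import os

# Mirror the HF_HUB_DISABLE_XET=1 prefix of the CLI command; set before importing huggingface_hub.
os.environ.setdefault("HF_HUB_DISABLE_XET", "1")

REPO_ID = "IDEAS-Lab-Northwestern/pi05-mug-into-bowl-droid-lora"


def download_kwargs(step: int, local_dir: str = "./pi05-mug-into-bowl-droid-lora") -> dict:
    """Build snapshot_download arguments that fetch only one checkpoint step."""
    return {"repo_id": REPO_ID, "allow_patterns": [f"{step}/**"], "local_dir": local_dir}


def download_step(step: int, **overrides) -> str:
    """Download one step and return the local path (requires network access)."""
    # Imported lazily so download_kwargs can be inspected without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**{**download_kwargs(step), **overrides})


print(download_kwargs(20000)["allow_patterns"])  # ['20000/**']
```

Calling `download_step(20000)` would then place the files under `./pi05-mug-into-bowl-droid-lora/20000`, matching the `--policy.dir` path used when serving.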
Serve via openpi
The training config must be registered in the consuming openpi repo (see
our internal doc docs/openpi_real_teleop_sft.md §5 for the
TrainConfig definition).
uv run scripts/serve_policy.py \
--policy.config=pi05_droid_finetune_lora \
--policy.dir=./pi05-mug-into-bowl-droid-lora/20000
Then connect your real-Franka WebSocket client and send the task prompt above.
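A client request might look like the sketch below. Several details are assumptions rather than facts from this card: the `observation/...` key names, the `openpi_client` package with `WebsocketClientPolicy`, port 8000, and the `"actions"` key in the response are modeled on openpi's DROID example and should be checked against the openpi version you serve with. `build_request` and `query_policy` are hypothetical helper names.

```python
import numpy as np

PROMPT = "pick up the smallest mug and place it in the bowl"


def build_request(exterior_rgb, wrist_rgb, joint_position, gripper_position, prompt=PROMPT) -> dict:
    """Pack one observation into a flat dict; key names are an assumption (see above)."""
    return {
        "observation/exterior_image_1_left": np.asarray(exterior_rgb, dtype=np.uint8),
        "observation/wrist_image_left": np.asarray(wrist_rgb, dtype=np.uint8),
        "observation/joint_position": np.asarray(joint_position, dtype=np.float32).reshape(7),
        "observation/gripper_position": np.asarray(gripper_position, dtype=np.float32).reshape(1),
        "prompt": prompt,
    }


def query_policy(request: dict, host: str = "localhost", port: int = 8000):
    """Send one request to a running policy server and return the predicted action chunk."""
    # Lazy import: openpi_client is only needed when a server is actually running.
    from openpi_client import websocket_client_policy

    client = websocket_client_policy.WebsocketClientPolicy(host=host, port=port)
    return client.infer(request)["actions"]


# Placeholder observation with the dataset's 320 x 180 image size.
blank = np.zeros((180, 320, 3), dtype=np.uint8)
request = build_request(blank, blank, np.zeros(7), 0.0)
print(sorted(request))
```

With the server from the command above running, `query_policy(request)` would return the action chunk for this observation; the first 8 columns follow the action convention described earlier.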
Intended use
- Evaluating pi0.5 generalization / LoRA sufficiency on a narrow single-task real-teleop distribution.
- Qualitative comparison against pi05-mug-into-bowl-base-lora (same LoRA setup, warm-started from pi05_base instead of pi05_droid).
Limitations & known caveats
- Single task, tiny dataset. 21 episodes is small; the policy is unlikely to generalize off-distribution (different lighting, mug shapes, bowl positions outside demo range).
- DROID + LoRA is not officially recommended: openpi's own README notes LoRA hasn't matched full-finetune performance on DROID. This release is a pragmatic compromise for consumer-GPU eval, not an SOTA training recipe.
- exterior_image_2_left is zero. DroidInputs reads only exterior_1 and wrist at inference, so this is fine for serving, but don't reuse this ckpt with any policy transform that expects two real exterior views.
- State axis-angle / rotation: not used at all by the DROID action space (action is joint-velocity + gripper); observation.cartesian_position is unused. Don't feed cartesian deltas.
- FPS: the data was recorded at 15 Hz, which matches DROID pretraining (15 Hz), so there is no mismatch. If you change the teleop rate for future data, you must retrain.
Citation / provenance
Trained with openpi by the IDEAS Lab @ Northwestern (internal use). Data collection: 2026-04, single-operator teleop with an SO-ARM leader.