---
language:
- en
license: mit
tags:
- pi05
- pi0.5
- LeRobot
- robotics
- imitation-learning
- behavior-cloning
- so101
pipeline_tag: reinforcement-learning
library_name: lerobot
base_model:
- lerobot/pi05_base
---

# LeRobot SO101 Pi05 task2task3-all_bs32_s20000

## Summary

This repository contains the final checkpoint of a Pi0.5 (pi05) policy fine-tuned on `aswinkumar99/task2task3-all` for SO101 sponge pick-and-place experiments.

Dataset meaning: Task 2 + Task 3 combined (all layouts).

The policy is a fine-tune of `lerobot/pi05_base`, as recorded by both the launch command (`--policy.path=lerobot/pi05_base`) and the saved training config (`pretrained_path: lerobot/pi05_base`).

## Training Setup

- Dataset repo: `aswinkumar99/task2task3-all`
- Local dataset root during training: `/root/datasets_combined/aswinkumar99/task2task3-all`
- Output directory during training: `/root/outputs_matrix/pi05/task2task3-all_bs32_s20000`
- Batch size: `32`
- Training steps: `20000`
- Checkpoint save frequency: every `5000` steps
- Data loader workers: `8`
- WandB project: `so101-layout-generalization`
- GPU: `NVIDIA H200`
- Python: `CPython 3.10.12`
- CUDA: `13.1`
- Training start: `2026-04-24T13:03:21.511729`
- Training end: `2026-04-24T17:37:44.320524`
- Approximate training duration: `4h 34m 22s`
- Base model: `lerobot/pi05_base`
- Observation camera rename map: `{"observation.images.overhead": "observation.images.base_0_rgb", "observation.images.wrist": "observation.images.right_wrist_0_rgb"}`
- Action chunk size: `50`
- Action steps predicted: `50`

## Exact Training Command

```bash
lerobot-train \
  --dataset.repo_id=aswinkumar99/task2task3-all \
  --dataset.root=/root/datasets_combined/aswinkumar99/task2task3-all \
  --dataset.video_backend=torchcodec \
  --output_dir=/root/outputs_matrix/pi05/task2task3-all_bs32_s20000 \
  --job_name=pi05_task2task3-all_bs32 \
  --batch_size=32 \
  --steps=20000 \
  --log_freq=200 \
  --save_freq=5000 \
  --save_checkpoint=true \
  --num_workers=8 \
  --wandb.enable=true \
  --wandb.project=so101-layout-generalization \
  --wandb.mode=online \
  --wandb.disable_artifact=true \
  --policy.path=lerobot/pi05_base \
  --policy.device=cuda \
  --policy.push_to_hub=false \
  --rename_map='{"observation.images.overhead": "observation.images.base_0_rgb", "observation.images.wrist": "observation.images.right_wrist_0_rgb"}'
```

## Repository Contents

- `pretrained_model/`: final downloadable model artifacts for inference and loading
- `training_state/`: optimizer, RNG, scheduler state, and step information for resuming training or for auditing

## Notes

- This repo stores the final checkpoint (step `20000`), uploaded from the cloud training workspace.
- Intermediate checkpoints (every `5000` steps) are archived to Google Drive as tarballs and are not pushed to the Hub.
- The checkpoint was trained with LeRobot tooling via `lerobot-train`.
- The dataset used for the SO101 experiments in this workspace was created by Aswinkumar.

## Creator

Aswinkumar

- Website: [aswinkumar.me](https://aswinkumar.me)
- Hugging Face repo:
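
## Cross-Checking the Recorded Values

The figures listed under Training Setup can be cross-checked with a few lines of Python. This is a minimal sketch, not part of LeRobot: the helper names are hypothetical, the checkpoint schedule assumes a save at every multiple of the save frequency, and the key renaming only illustrates the intent of the camera rename map rather than how LeRobot applies it internally.

```python
from datetime import datetime

# Values copied from the Training Setup list in this model card.
START = "2026-04-24T13:03:21.511729"
END = "2026-04-24T17:37:44.320524"
STEPS = 20000
SAVE_FREQ = 5000
BATCH_SIZE = 32
RENAME_MAP = {
    "observation.images.overhead": "observation.images.base_0_rgb",
    "observation.images.wrist": "observation.images.right_wrist_0_rgb",
}


def training_duration(start: str, end: str) -> str:
    """Format the gap between two ISO timestamps as 'Hh Mm Ss' (whole seconds)."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    hours, rest = divmod(int(delta.total_seconds()), 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours}h {minutes}m {seconds}s"


def checkpoint_steps(steps: int, save_freq: int) -> list[int]:
    """List the steps at which a checkpoint is assumed to be written."""
    return list(range(save_freq, steps + 1, save_freq))


def rename_observation_keys(observation: dict, rename_map: dict) -> dict:
    """Return a copy of `observation` with keys renamed; unmapped keys are kept."""
    return {rename_map.get(key, key): value for key, value in observation.items()}


if __name__ == "__main__":
    print(training_duration(START, END))        # 4h 34m 22s
    print(checkpoint_steps(STEPS, SAVE_FREQ))   # [5000, 10000, 15000, 20000]
    print(BATCH_SIZE * STEPS)                   # 640000 samples drawn over the run
    # Placeholder values stand in for the image tensors of a real dataset frame.
    frame = {"observation.images.overhead": "img_a", "observation.images.wrist": "img_b"}
    print(sorted(rename_observation_keys(frame, RENAME_MAP)))
```

The renamed keys, `observation.images.base_0_rgb` and `observation.images.right_wrist_0_rgb`, are the camera names the rename map targets; only the last checkpoint step in the list is stored in this repository.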