How to use CoRL2026-CSI/smolVLA-IsaacLab-picknplace-50epoch with LeRobot:
# See https://github.com/huggingface/lerobot?tab=readme-ov-file#installation for more details
git clone https://github.com/huggingface/lerobot.git
cd lerobot
pip install -e .[smolvla]
# Launch finetuning on your dataset
python lerobot/scripts/train.py \
  --policy.path=CoRL2026-CSI/smolVLA-IsaacLab-picknplace-50epoch \
  --dataset.repo_id=lerobot/svla_so101_pickplace \
  --batch_size=64 \
  --steps=20000 \
  --output_dir=outputs/train/my_smolvla \
  --job_name=my_smolvla_training \
  --policy.device=cuda \
  --wandb.enable=true
# Run the policy using the record function.
# Replace the port, robot id, and camera settings with your own.
# Use the same task description you used in your dataset recording.
# --dataset.repo_id will be the dataset name on the HF Hub.
python -m lerobot.record \
  --robot.type=so101_follower \
  --robot.port=/dev/ttyACM0 \
  --robot.id=my_blue_follower_arm \
  --robot.cameras="{ front: {type: opencv, index_or_path: 8, width: 640, height: 480, fps: 30}}" \
  --dataset.single_task="Grasp a lego block and put it in the bin." \
  --dataset.repo_id=HF_USER/dataset_name \
  --dataset.episode_time_s=50 \
  --dataset.num_episodes=10 \
  --policy.path=CoRL2026-CSI/smolVLA-IsaacLab-picknplace-50epoch

A SmolVLA policy obtained by fine-tuning lerobot/smolvla_base for 50 epochs on CoRL2026-CSI/IsaacLab-SO101_pick_place_baseCaP_100epi_10fps, a single-task SO101 pick & place dataset from IsaacLab simulation.
This checkpoint is a full model (model.safetensors), not a LoRA adapter, so it is loaded and used as-is.

- Base model: lerobot/smolvla_base (SmolVLM2-500M-Video-Instruct VLM + action expert)
- Cameras: top, left_wrist (480×640), renamed to the policy keys camera1 (left_wrist) / camera2 (top)
- Inputs: observation.state[6] + 2 cameras + language instruction (task)
- Outputs: action[6] (joint position)
- Action chunking: chunk_size=50, n_action_steps=50

Training uses a frozen VLM and updates the action expert only, which is the official standard SmolVLA training recipe (SmolVLA paper, arXiv:2506.01844).

| Component | Status |
|---|---|
| VLM backbone (SmolVLM2) | ❄️ Fully frozen (freeze_vision_encoder=true) |
| Action expert | 🔥 Trained (train_expert_only=true) |
| PEFT / LoRA | Not used |
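
The chunk_size=50 / n_action_steps=50 setting means the policy predicts a chunk of 50 actions per forward pass and executes 50 of them before predicting again. If the controller runs at the dataset's 10 fps, that is about 5 seconds of motion per forward pass. The following is a minimal plain-Python sketch of that queueing behaviour. `predict_chunk`, `ChunkedActionQueue`, and `run_episode` are hypothetical names introduced for illustration; this is not LeRobot's implementation.

```python
from collections import deque

CHUNK_SIZE = 50       # actions predicted per forward pass (from the model card)
N_ACTION_STEPS = 50   # actions executed before the next forward pass
ACTION_DIM = 6        # action[6], joint positions


def predict_chunk(observation):
    """Hypothetical stand-in for the policy forward pass: returns CHUNK_SIZE actions."""
    return [[float(observation["step"] + i)] * ACTION_DIM for i in range(CHUNK_SIZE)]


class ChunkedActionQueue:
    """Serves one action per control step, refilling only when the queue is empty."""

    def __init__(self):
        self.queue = deque()
        self.num_forward_passes = 0

    def select_action(self, observation):
        if not self.queue:
            chunk = predict_chunk(observation)
            self.num_forward_passes += 1
            # Keep only the first N_ACTION_STEPS actions of the predicted chunk.
            self.queue.extend(chunk[:N_ACTION_STEPS])
        return self.queue.popleft()


def run_episode(num_steps):
    """Run `num_steps` control steps and report how many forward passes were needed."""
    controller = ChunkedActionQueue()
    actions = [controller.select_action({"step": t}) for t in range(num_steps)]
    return actions, controller.num_forward_passes
```

With these settings, `run_episode(120)` needs three forward passes (at steps 0, 50, and 100). Lowering N_ACTION_STEPS below CHUNK_SIZE would make the sketch re-plan more often at the cost of more forward passes.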

| Item | Value |
|---|---|
| Dataset | IsaacLab-SO101_pick_place_baseCaP_100epi_10fps: 100 episodes / 34,264 frames / 10 fps |
| Epochs / Steps | 50 epochs / 6,700 steps |
| Global batch size | 256 (micro batch 128 × 2 GPU) |
| Optimizer | AdamW: lr 1e-4, weight_decay 1e-10, grad_clip_norm 10.0 |
| LR scheduler | cosine_decay_with_warmup: warmup 1,000 / decay 30,000 / peak_lr 1e-4 / decay_lr 2.5e-6 |
| chunk_size / n_action_steps | 50 / 50 |
| Seed | 1000 |
| Dataloader workers | 16 |
| Mixed precision | no (bf16 inference) |
| Image augmentation | ColorJitter (brightness/contrast/saturation/hue) + SharpnessJitter; no geometric transforms (rotation/translation/flip), to preserve left/right semantics for the VLA |
| Hardware | 2 × NVIDIA H100 80GB |
| Final loss | 0.013 |
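
The step count and learning-rate settings in the table can be cross-checked with a little arithmetic. The sketch below computes the number of optimizer steps for 50 epochs at a global batch of 256, and evaluates a generic linear-warmup plus cosine-decay schedule with the listed values. The schedule formula is an assumption for illustration (in particular, the cosine phase is measured from step 0), not necessarily the exact formula behind cosine_decay_with_warmup. `steps_for_epochs` and `lr_at` are hypothetical helper names.

```python
import math

NUM_FRAMES = 34_264
EPOCHS = 50
GLOBAL_BATCH = 128 * 2   # micro batch 128 on each of 2 GPUs

PEAK_LR = 1e-4
DECAY_LR = 2.5e-6
WARMUP_STEPS = 1_000
DECAY_STEPS = 30_000


def steps_for_epochs(num_frames, epochs, global_batch):
    """Optimizer steps needed to see every frame `epochs` times."""
    return math.ceil(num_frames * epochs / global_batch)


def lr_at(step):
    """Generic linear warmup followed by cosine decay from PEAK_LR to DECAY_LR."""
    if step < WARMUP_STEPS:
        return PEAK_LR * (step + 1) / (WARMUP_STEPS + 1)
    progress = min(step, DECAY_STEPS) / DECAY_STEPS
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return DECAY_LR + (PEAK_LR - DECAY_LR) * cosine


total_steps = steps_for_epochs(NUM_FRAMES, EPOCHS, GLOBAL_BATCH)
print(total_steps)      # 6693, close to the 6,700 steps listed in the table
print(lr_at(6_700))     # learning rate at the end of training under this sketch
```

Because training stops at roughly 6,700 of the 30,000 decay steps, the learning rate under this sketch is still most of the way toward its peak when training ends and never reaches decay_lr.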

Mapping between the LeRobot dataset camera keys and the SmolVLA policy keys:

| Dataset key | Policy key |
|---|---|
| observation.images.left_wrist | observation.images.camera1 |
| observation.images.top | observation.images.camera2 |
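
The key mapping can be applied to a dataset frame with a small dictionary rename. This is a minimal sketch in plain Python, assuming a frame is a flat dict keyed by feature name. `CAMERA_KEY_MAP` and `rename_camera_keys` are hypothetical names, and the sketch does not show how LeRobot performs the rename internally.

```python
# Dataset key -> policy key, copied from the mapping table.
CAMERA_KEY_MAP = {
    "observation.images.left_wrist": "observation.images.camera1",
    "observation.images.top": "observation.images.camera2",
}


def rename_camera_keys(frame, key_map=CAMERA_KEY_MAP):
    """Return a copy of `frame` with dataset camera keys renamed to policy keys."""
    missing = [key for key in key_map if key not in frame]
    if missing:
        raise KeyError(f"frame is missing camera keys: {missing}")
    # Keys that are not in the map (state, action, task, ...) pass through unchanged.
    return {key_map.get(key, key): value for key, value in frame.items()}
```

The function returns a new dict and leaves the input frame untouched, so the same frame can still be inspected under its dataset key names.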

- Inputs: only observation.state[6] (joint position) + 2 cameras + language instruction (task)
- Outputs: only action[6] (joint position)
- ee_pos / gripper_binary / state.radian_urdf0 / action.radian_urdf0 are excluded from training
- The camera slots are fixed (camera1/2/3), so a camera3 slot exists in the config, but the dataset has only 2 cameras, so only 2 cameras actually carry data.

# Load the fine-tuned policy from the Hugging Face Hub
from lerobot.policies.smolvla.modeling_smolvla import SmolVLAPolicy
policy = SmolVLAPolicy.from_pretrained("CoRL2026-CSI/smolVLA-IsaacLab-picknplace-50epoch")
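
The feature selection described above (state, two cameras, and the task string as inputs, action as the target, everything else dropped) can be sketched as an allowlist over a frame dict. This is an illustrative plain-Python sketch, assuming the camera keys have already been renamed to the policy keys and that the language instruction is stored under a `task` key. `select_training_features` is a hypothetical helper, not part of LeRobot.

```python
# Keys the card lists as policy inputs (after the camera rename) and as the target.
INPUT_KEYS = (
    "observation.state",
    "observation.images.camera1",
    "observation.images.camera2",
    "task",
)
TARGET_KEY = "action"
STATE_DIM = 6
ACTION_DIM = 6


def select_training_features(frame):
    """Split a frame into (inputs, target), dropping every key not used for training."""
    inputs = {key: frame[key] for key in INPUT_KEYS}
    target = frame[TARGET_KEY]
    if len(inputs["observation.state"]) != STATE_DIM:
        raise ValueError("expected a 6-dimensional joint-position state")
    if len(target) != ACTION_DIM:
        raise ValueError("expected a 6-dimensional joint-position action")
    return inputs, target
```

Using an allowlist rather than a blocklist means extra dataset columns such as ee_pos or gripper_binary are ignored without their exact key names having to be known.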
Built on top of LeRobot and the SmolVLA base checkpoint. Project: CoRL 2026 CSI submission.
Base model
lerobot/smolvla_base