SmolVLA IsaacLab SO101 11-task BaseCaP 3300epi (8ep)
This repository contains a SmolVLA policy checkpoint fine-tuned with LeRobot. The model card is intentionally detailed so the training run can be reproduced or debugged from the uploaded artifact.
Model Details
- Policy: SmolVLA
- Base checkpoint: lerobot/smolvla_base
- Training dataset: CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi
- Training script: lerobot/scripts/train_smolvla.sh
- Checkpoint: step 110000, approximately 7.99 epochs
- Reported training loss at checkpoint: 0.004
- Resolved config: train_config.json
Related checkpoints from the same run:
Dataset
| Key | Value |
|---|---|
| Robot | SO101 follower in IsaacLab |
| Episodes | 3,300 |
| Frames | 3,522,774 |
| Tasks | 800 |
| FPS | 30 |
| Camera streams | observation.images.left_wrist, observation.images.top |
| Dataset state/action shape | [6] / [6] |
Reproduction
The uploaded train_config.json is the authoritative serialized LeRobot config for this checkpoint. The table below mirrors the key values for quick inspection.
| Key | Value |
|---|---|
| script | lerobot/scripts/train_smolvla.sh |
| job_name | smolvla_20260508_093756 |
| output_dir | /home/work/hscho/corl_2026/AutoDataCollector/lerobot/outputs/train/smolvla_20260508_093756 |
| seed | 1000 |
| launch | DDP via python -m accelerate.commands.launch --multi_gpu --num_processes=2 --mixed_precision=bf16 -m lerobot.scripts.lerobot_train |
| checkpoint_step | 110000 |
| checkpoint_epoch | 7.99 |
| checkpoint_train_loss | 0.004 |
| checkpoint_grad_norm | 0.051 |
| checkpoint_lr | 2.5e-06 |
| effective_batch | 128 x 2 = 256 |
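The epoch figure in the table follows from the step count, the effective batch, and the frame count in the Dataset section. The check below is a rough sketch that assumes one training sample per frame and no dropped or repeated samples across the two DDP ranks; the variable names are illustrative and do not come from the training script.

```python
# Values copied from the tables in this model card.
per_gpu_batch = 128
num_gpus = 2
checkpoint_step = 110_000
dataset_frames = 3_522_774

# Each optimizer step consumes one batch per GPU.
effective_batch = per_gpu_batch * num_gpus
samples_seen = checkpoint_step * effective_batch

# Assumption: one sample per dataset frame, so epochs = samples / frames.
epochs = samples_seen / dataset_frames

print(f"effective batch: {effective_batch}")
print(f"samples seen:    {samples_seen:,}")
print(f"epochs:          {epochs:.2f}")  # prints 7.99
```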
Approximate script invocation:
cd /home/work/hscho/corl_2026/AutoDataCollector/lerobot
CONDA_ENV="lerobot" \
POLICY_TYPE="smolvla" \
POLICY_PATH="lerobot/smolvla_base" \
DATASET_REPO_ID="CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi" \
BATCH_SIZE="128" \
GRADIENT_ACCUMULATION_STEPS="1 (not set in script/config)" \
STEPS="110000" \
NUM_WORKERS="6" \
CUDA_VISIBLE_DEVICES="0, 1" \
NUM_GPUS="2" \
MIXED_PRECISION="bf16" \
SAVE_FREQ="28000" \
LOG_FREQ="10" \
EVAL_FREQ="0" \
WANDB_PROJECT="lerobot-smolvla" \
bash train_smolvla.sh
Detailed Hyperparameters
Script Defaults and Environment
| Key | Value |
|---|---|
| CONDA_ENV | lerobot |
| POLICY_TYPE | smolvla |
| POLICY_PATH | lerobot/smolvla_base |
| DATASET_REPO_ID | CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi |
| BATCH_SIZE | 128 |
| GRADIENT_ACCUMULATION_STEPS | 1 (not set in script/config) |
| STEPS | 110000 |
| NUM_WORKERS | 6 |
| CUDA_VISIBLE_DEVICES | 0, 1 |
| NUM_GPUS | 2 |
| MIXED_PRECISION | bf16 |
| SAVE_FREQ | 28000 |
| LOG_FREQ | 10 |
| EVAL_FREQ | 0 |
| WANDB_PROJECT | lerobot-smolvla |
Training Loop and Dataloader
| Key | Value |
|---|---|
| steps | 110000 |
| batch_size | 128 |
| gradient_accumulation_steps | 1 |
| num_workers | 6 |
| dataloader_prefetch_factor | null |
| dataloader_persistent_workers | null |
| dataloader_pin_memory | null |
| save_freq | 28000 |
| log_freq | 10 |
| eval_freq | 0 |
| cudnn_deterministic | False |
| use_policy_training_preset | True |
| ddp_find_unused_parameters | null |
| profile_timing | null |
Dataset Pipeline
| Key | Value |
|---|---|
| dataset.repo_id | CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi |
| dataset.root | null |
| dataset.episodes | null |
| dataset.revision | null |
| dataset.use_imagenet_stats | True |
| dataset.video_backend | torchcodec |
| dataset.streaming | False |
Image augmentation settings:
{
"enable": true,
"max_num_transforms": 3,
"random_order": true,
"tfs": {
"brightness": {
"weight": 1.0,
"type": "ColorJitter",
"kwargs": {
"brightness": [
0.8,
1.2
]
}
},
"contrast": {
"weight": 1.0,
"type": "ColorJitter",
"kwargs": {
"contrast": [
0.8,
1.2
]
}
},
"saturation": {
"weight": 1.0,
"type": "ColorJitter",
"kwargs": {
"saturation": [
0.5,
1.5
]
}
},
"hue": {
"weight": 1.0,
"type": "ColorJitter",
"kwargs": {
"hue": [
-0.05,
0.05
]
}
},
"sharpness": {
"weight": 1.0,
"type": "SharpnessJitter",
"kwargs": {
"sharpness": [
0.5,
1.5
]
}
},
"affine": {
"weight": 1.0,
"type": "RandomAffine",
"kwargs": {
"degrees": [
-5.0,
5.0
],
"translate": [
0.05,
0.05
]
}
}
}
}
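One way to read the settings above: for each frame, a subset of the six transforms is drawn according to the weights and then applied in random order. The sketch below illustrates that selection step only, using the standard library. It assumes exactly min(max_num_transforms, 6) transforms are drawn without replacement; `sample_transforms` is a hypothetical helper, not a LeRobot function, and the actual pixel operations are not reproduced here.

```python
import random

# Transform names and weights copied from the "tfs" block above.
WEIGHTS = {
    "brightness": 1.0,
    "contrast": 1.0,
    "saturation": 1.0,
    "hue": 1.0,
    "sharpness": 1.0,
    "affine": 1.0,
}
MAX_NUM_TRANSFORMS = 3
RANDOM_ORDER = True


def sample_transforms(rng):
    """Pick the transform names for one frame (hypothetical helper)."""
    names = list(WEIGHTS)
    n = min(MAX_NUM_TRANSFORMS, len(names))
    remaining = dict(WEIGHTS)
    chosen = []
    for _ in range(n):
        # Weighted draw without replacement: remove each pick from the pool.
        pool = list(remaining)
        pick = rng.choices(pool, weights=[remaining[k] for k in pool], k=1)[0]
        chosen.append(pick)
        del remaining[pick]
    if not RANDOM_ORDER:
        # With random_order disabled, fall back to the config's listing order.
        chosen.sort(key=names.index)
    return chosen


print(sample_transforms(random.Random(0)))
```

With all weights equal to 1.0, every transform is equally likely to be selected, so each frame sees a different mix of photometric and geometric jitter.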
Camera rename map:
{
"observation.images.left_wrist": "observation.images.camera1",
"observation.images.top": "observation.images.camera2"
}
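At inference time the same renaming has to be applied to incoming observations, since the policy's input features are named camera1 and camera2 rather than left_wrist and top. A minimal sketch of that key remapping on a plain dictionary follows; `rename_observation` is a hypothetical helper and the string values stand in for image tensors.

```python
# Mapping copied from the camera rename map above.
RENAME_MAP = {
    "observation.images.left_wrist": "observation.images.camera1",
    "observation.images.top": "observation.images.camera2",
}


def rename_observation(observation):
    """Return a copy of the observation with dataset keys renamed to policy keys."""
    return {RENAME_MAP.get(key, key): value for key, value in observation.items()}


# Placeholder values; real observations would hold image tensors and a state vector.
raw = {
    "observation.images.left_wrist": "wrist_frame",
    "observation.images.top": "top_frame",
    "observation.state": [0.0] * 6,
}
renamed = rename_observation(raw)
print(sorted(renamed))
```

Keys that are not in the map, such as observation.state, pass through unchanged.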
Policy Configuration
{
"type": "smolvla",
"pretrained_path": "lerobot/smolvla_base",
"vlm_model_name": "HuggingFaceTB/SmolVLM2-500M-Video-Instruct",
"load_vlm_weights": true,
"num_vlm_layers": 16,
"freeze_vision_encoder": true,
"train_expert_only": true,
"train_state_proj": true,
"use_peft": false,
"use_amp": false,
"chunk_size": 50,
"n_action_steps": 50,
"num_steps": 10,
"max_state_dim": 32,
"max_action_dim": 32,
"resize_imgs_with_padding": [
512,
512
],
"tokenizer_max_length": 48,
"attention_mode": "cross_attn",
"pad_language_to": "max_length",
"use_cache": true,
"num_expert_layers": 0,
"expert_width_multiplier": 0.75,
"self_attn_every_n_layers": 2,
"min_period": 0.004,
"max_period": 4.0,
"compile_model": false,
"compile_mode": "max-autotune",
"normalization_mapping": {
"VISUAL": "IDENTITY",
"STATE": "MEAN_STD",
"ACTION": "MEAN_STD"
},
"input_features": {
"observation.state": {
"type": "STATE",
"shape": [
6
]
},
"observation.images.camera1": {
"type": "VISUAL",
"shape": [
3,
256,
256
]
},
"observation.images.camera2": {
"type": "VISUAL",
"shape": [
3,
256,
256
]
},
"observation.images.camera3": {
"type": "VISUAL",
"shape": [
3,
256,
256
]
}
},
"output_features": {
"action": {
"type": "ACTION",
"shape": [
6
]
},
"action.radian_urdf0": {
"type": "ACTION",
"shape": [
6
]
}
}
}
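Two values in this configuration are easier to interpret with a little arithmetic: the 6-dimensional state and action are smaller than max_state_dim and max_action_dim (32), and each inference call yields n_action_steps actions. The sketch below assumes the smaller vectors are zero-padded up to the maximum dimension and that the robot is controlled at the dataset rate of 30 FPS. Both are assumptions made for illustration, and `pad_vector` here is a standalone helper, not the library's own.

```python
# Values copied from the policy configuration and the dataset table.
MAX_STATE_DIM = 32
N_ACTION_STEPS = 50
FPS = 30  # assumption: control runs at the dataset frame rate


def pad_vector(values, target_dim):
    """Right-pad a vector with zeros up to target_dim (illustrative helper)."""
    if len(values) > target_dim:
        raise ValueError("vector is longer than the target dimension")
    return list(values) + [0.0] * (target_dim - len(values))


state = [0.1, -0.2, 0.3, 0.0, 0.5, 1.0]  # made-up 6-DoF joint state
padded = pad_vector(state, MAX_STATE_DIM)

# Wall-clock time covered by the actions from one inference call.
seconds_per_chunk = N_ACTION_STEPS / FPS

print(len(padded))                 # prints 32
print(f"{seconds_per_chunk:.2f}")  # prints 1.67
```

Under these assumptions, one forward pass covers roughly 1.67 seconds of motion before the policy is queried again.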
Optimizer
{
"type": "adamw",
"lr": 0.0001,
"weight_decay": 1e-10,
"grad_clip_norm": 10.0,
"betas": [
0.9,
0.95
],
"eps": 1e-08
}
Scheduler
{
"type": "cosine_decay_with_warmup",
"num_warmup_steps": 1000,
"num_decay_steps": 30000,
"peak_lr": 0.0001,
"decay_lr": 2.5e-06
}
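Because num_decay_steps (30000) is far smaller than the 110000 training steps, the learning rate sits at decay_lr for most of the run, which is consistent with the checkpoint_lr of 2.5e-06 reported above. The sketch below shows the general shape of a cosine decay with linear warmup using these values. It is an approximation written for this card, not LeRobot's scheduler implementation, and the exact warmup ramp in particular is an assumption.

```python
import math

# Values copied from the scheduler block above.
PEAK_LR = 1e-4
DECAY_LR = 2.5e-6
NUM_WARMUP_STEPS = 1000
NUM_DECAY_STEPS = 30000


def learning_rate(step):
    """Approximate learning rate at a given optimizer step."""
    if step < NUM_WARMUP_STEPS:
        # Assumed warmup shape: linear ramp from near zero up to the peak.
        return PEAK_LR * (step + 1) / (NUM_WARMUP_STEPS + 1)
    # Cosine decay from PEAK_LR to DECAY_LR, held constant after NUM_DECAY_STEPS.
    clipped = min(step, NUM_DECAY_STEPS)
    cosine = 0.5 * (1.0 + math.cos(math.pi * clipped / NUM_DECAY_STEPS))
    return DECAY_LR + (PEAK_LR - DECAY_LR) * cosine


for step in (0, 1000, 15000, 30000, 110000):
    print(step, f"{learning_rate(step):.3e}")
```

In this sketch the rate reaches its floor at step 30000 and stays there, so the last 80000 steps all run at 2.5e-06.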
Logging
{
"enable": true,
"disable_artifact": false,
"project": "lerobot-smolvla",
"entity": null,
"notes": null,
"run_id": "b3yvlype",
"mode": null
}
Usage
Use this model as a LeRobot policy checkpoint:
python -m lerobot.scripts.lerobot_eval \
--policy.path=CoRL2026-CSI/smolvla_isaaclab_so101_11task_basecap_3300epi_8ep
For Python loading inside LeRobot code, use the SmolVLA policy loader with this repository id as the pretrained path.
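A minimal loading sketch is shown below. The module path is an assumption: it matches LeRobot releases that expose `lerobot.scripts.lerobot_train`, while older releases use a different package layout, so adjust the import to your installed version. `load_policy` is a hypothetical wrapper, and the import is deferred so the snippet can be read or imported on a machine without LeRobot.

```python
REPO_ID = "CoRL2026-CSI/smolvla_isaaclab_so101_11task_basecap_3300epi_8ep"


def load_policy(repo_id=REPO_ID, device="cuda"):
    """Load the checkpoint as a SmolVLA policy in evaluation mode."""
    # Deferred import; the module path is an assumption and varies by release.
    from lerobot.policies.smolvla.modeling_smolvla import SmolVLAPolicy

    policy = SmolVLAPolicy.from_pretrained(repo_id)
    policy.to(device)
    policy.eval()
    return policy


if __name__ == "__main__":
    # Downloads the weights from the Hub on first use.
    policy = load_policy()
    print(type(policy).__name__)
```

Observations passed to the loaded policy are expected to use the renamed camera keys and the normalization statistics stored with the checkpoint; depending on the LeRobot version, that may mean constructing the matching pre- and post-processors separately.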
Evaluation and Limitations
This model card reports training checkpoint information only. No rollout success rate or task-level evaluation metric is included in this repository.
The checkpoint assumes a compatible observation/action schema and the camera remapping shown above. The optimizer/RNG training_state files are not included; only the loadable pretrained_model artifact is uploaded.
Provenance
- VLM backbone: HuggingFaceTB/SmolVLM2-500M-Video-Instruct
- Fine-tuning run: smolvla_20260508_093756
- Source training script: lerobot/scripts/train_smolvla.sh