SFT-Robo2: beat_block_hammer

OpenVLA-OFT SFT checkpoint for the beat_block_hammer task from RoboTwin 2.0, trained to match the SimpleVLA-RL paper (arXiv:2509.09674) settings.

Model Details

  • Base model: openvla/openvla-7b
  • Fine-tuning method: LoRA (rank 32) on LLaMA2-7B backbone
  • Training objective: Cross-entropy on discrete action tokens (NOT L1 regression)
  • Architecture: Single-view image + proprioception + language instruction -> one chunk of 25 actions x 14D each (bimanual ALOHA)
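
As a rough illustration of what "cross-entropy on discrete action tokens" implies, the sketch below discretizes a normalized action value into a bin index and maps it back to a bin center. It assumes 256 uniform bins over [-1, 1], which follows the scheme OpenVLA describes but is not taken from this checkpoint's code; the helper names action_to_bin and bin_to_action are hypothetical.

```python
# Simplified sketch of uniform-bin action discretization.
# Assumptions (not confirmed by this model card): 256 bins over [-1, 1].
N_BINS = 256            # assumed bin count
LOW, HIGH = -1.0, 1.0   # assumed normalized action range
NUM_ACTIONS_CHUNK = 25  # from the model card
ACTION_DIM = 14         # from the model card
BIN_WIDTH = (HIGH - LOW) / N_BINS


def action_to_bin(value: float) -> int:
    """Map a normalized action value to a bin index in [0, N_BINS - 1]."""
    clipped = min(max(value, LOW), HIGH)
    return min(int((clipped - LOW) / BIN_WIDTH), N_BINS - 1)


def bin_to_action(index: int) -> float:
    """Map a bin index back to the center of its bin."""
    return LOW + (index + 0.5) * BIN_WIDTH


# One chunk is 25 actions x 14 dimensions, so the model emits
# 25 * 14 discrete tokens per forward pass under this scheme.
tokens_per_chunk = NUM_ACTIONS_CHUNK * ACTION_DIM

if __name__ == "__main__":
    idx = action_to_bin(0.3)
    print(idx, bin_to_action(idx), tokens_per_chunk)
```

The round trip loses at most half a bin width per dimension, which is the quantization cost of training with a classification loss instead of L1 regression.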

Training Config

Parameter                            Value
max_steps                            10,000
batch_size                           8 per GPU x 2 GPUs = 16 global
learning_rate                        5e-4
lora_rank                            32
num_images_in_input                  1 (head camera only)
use_l1_regression                    False
use_film                             False
use_proprio                          True
use_diffusion                        False
image_aug                            True
NUM_ACTIONS_CHUNK                    25
ACTION_DIM                           14
ACTION_PROPRIO_NORMALIZATION_TYPE    bounds
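
The bounds setting is easiest to understand with a small example. The sketch below assumes that "bounds" means per-dimension min/max scaling to [-1, 1], which is how the OpenVLA-OFT codebase appears to define it; the function names and the example statistics are hypothetical and are not read from this checkpoint's dataset statistics.

```python
# Sketch of per-dimension min/max ("bounds") normalization to [-1, 1].
# Assumption: this mirrors ACTION_PROPRIO_NORMALIZATION_TYPE = bounds.


def normalize_bounds(x, lo, hi):
    """Scale each dimension of x from [lo, hi] to [-1, 1]."""
    out = []
    for v, a, b in zip(x, lo, hi):
        if b == a:
            out.append(0.0)  # constant dimension: map to the midpoint
        else:
            out.append(2.0 * (v - a) / (b - a) - 1.0)
    return out


def unnormalize_bounds(y, lo, hi):
    """Invert normalize_bounds, mapping [-1, 1] back to [lo, hi]."""
    return [0.5 * (v + 1.0) * (b - a) + a for v, a, b in zip(y, lo, hi)]


if __name__ == "__main__":
    # Hypothetical statistics for a 2D slice of the 14D action vector.
    lo, hi = [0.0, 0.0], [10.0, 10.0]
    y = normalize_bounds([0.0, 5.0], lo, hi)
    print(y, unnormalize_bounds(y, lo, hi))
```

At inference time the same statistics used during training must be applied in reverse to the predicted actions; using different bounds would silently rescale every joint command.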

Training Data

  • 1000 expert demonstrations from RoboTwin 2.0 beat_block_hammer task
  • Collected with curobo motion planner under full domain randomization
  • 950 train / 50 val split (5% validation)
  • Expert success rate: ~70%; the dataset is filtered to keep successful trajectories only
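
The 950 / 50 split can be reproduced in spirit with a seeded shuffle over episode indices. This is only a sketch: the card does not say how the split was drawn, so the shuffle, the seed, and the split_episodes helper are assumptions.

```python
import random


def split_episodes(episode_ids, val_fraction=0.05, seed=0):
    """Return (train_ids, val_ids) after a seeded shuffle.

    The shuffle and seed are illustrative assumptions; the model card
    only states the resulting 950 / 50 sizes.
    """
    ids = list(episode_ids)
    random.Random(seed).shuffle(ids)
    n_val = round(len(ids) * val_fraction)
    return ids[n_val:], ids[:n_val]


train_ids, val_ids = split_episodes(range(1000))

if __name__ == "__main__":
    print(len(train_ids), len(val_ids))
```

Splitting by episode rather than by frame keeps every frame of a demonstration on one side of the split, so validation loss is not inflated by near-duplicate frames.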

Evaluation Results

Metric                                         Value
Success rate (seed 0, 100 episodes)            45.0%
Success rate (seed 1, partial, 16 episodes)    50.0%
Paper SFT baseline (Table 4)                   28.1%
Prior Phase 1 reproduction                     35.2%

Evaluated with greedy decoding (do_sample=False) on 100 held-out scenarios using the demo_randomized config.
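
Because the two runs differ a lot in size, it helps to look at the uncertainty around each success rate. The sketch below computes a 95% Wilson score interval for a binomial proportion. The success counts (45 of 100, 8 of 16) are derived from the reported percentages rather than taken from logs, and the wilson_interval helper is illustrative.

```python
import math


def wilson_interval(successes, episodes, z=1.96):
    """Wilson score interval for a binomial success rate (z=1.96 ~ 95%)."""
    p = successes / episodes
    denom = 1.0 + z * z / episodes
    center = (p + z * z / (2 * episodes)) / denom
    half = (
        z
        * math.sqrt(p * (1 - p) / episodes + z * z / (4 * episodes ** 2))
        / denom
    )
    return center - half, center + half


if __name__ == "__main__":
    # Counts derived from the reported rates: 45.0% of 100 and 50.0% of 16.
    for label, s, n in [("seed 0", 45, 100), ("seed 1 (partial)", 8, 16)]:
        lo, hi = wilson_interval(s, n)
        print(f"{label}: {100 * s / n:.1f}% "
              f"(95% interval {100 * lo:.1f}% to {100 * hi:.1f}%)")
```

The 16-episode interval is much wider than the 100-episode one, so the seed 1 figure is best read as a sanity check rather than an independent estimate.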

Compatibility

This checkpoint is compatible with:

  • SimpleVLA-RL RL training script (run_openvla_oft_rl_twin2.sh)
  • RoboTwin 2.0 evaluation suite (policy/openvla-oft/eval.sh)

Important: This checkpoint predicts discrete action tokens trained with cross-entropy (LLaMA2 head), NOT L1 regression with an MLP action head. Make sure your inference code sets use_l1_regression=False and supplies the proprioceptive state to predict_action.
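
One way to catch a mismatched setup early is to validate the inference settings against the training config before loading the model. The helper below is a hypothetical sketch, not part of OpenVLA-OFT, SimpleVLA-RL, or RoboTwin; it only checks a plain dict whose keys mirror the names in the Training Config table.

```python
# Hypothetical pre-flight check; key names mirror the Training Config table.
EXPECTED = {
    "use_l1_regression": False,
    "use_diffusion": False,
    "use_film": False,
    "use_proprio": True,
    "num_images_in_input": 1,
}


def check_inference_config(cfg: dict) -> None:
    """Raise ValueError if cfg disagrees with the training-time settings."""
    problems = [
        f"{key}={cfg.get(key)!r} (expected {want!r})"
        for key, want in EXPECTED.items()
        if cfg.get(key) != want
    ]
    if problems:
        raise ValueError("inference config mismatch: " + "; ".join(problems))


if __name__ == "__main__":
    check_inference_config(dict(EXPECTED))  # passes silently
    try:
        check_inference_config(dict(EXPECTED, use_l1_regression=True))
    except ValueError as err:
        print(err)
```

Failing fast here is cheaper than debugging a policy that loads without error but produces meaningless actions because the wrong action head was selected.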

Training Infrastructure

  • Hardware: 2x NVIDIA H200 (141GB each)
  • Training time: 5.4 hours
  • Platform: UNSW Katana HPC (PBSPro scheduler)

Citation

If you use this checkpoint, please cite the SimpleVLA-RL paper: arXiv:2509.09674

License

MIT
