# MiniVLA Fine-tuned on LAMPE 4-DoF Dataset

This repository contains a MiniVLA model fine-tuned on the LAMPE dataset with 4-DoF actions (Base, Joint2, Joint3, Joint4).
## Model Details

- Base Model: openvla/openvla-7b
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- LoRA Rank: 32
- LoRA Dropout: 0.0
- Training Dataset: LAMPE Combined Dataset (80 trajectories, 6,289 transitions)
- Action Space: 4-DoF [Base, Joint2, Joint3, Joint4]
- Training Steps: 3,000
- Batch Size: 8
- Learning Rate: 5e-4
- Image Augmentation: Enabled
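The LoRA settings above could be expressed as a `peft` `LoraConfig` roughly as follows. This is a sketch, not the configuration actually used by `finetune.py`: the `lora_alpha` value and the `target_modules` list are assumptions, since the card only states the rank and dropout.

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,                    # LoRA rank, from the table above
    lora_alpha=32,           # assumption: alpha is not stated in this card
    lora_dropout=0.0,        # from the table above
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)
```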
## Quick Start

### Installation

```bash
# Install dependencies
pip install torch torchvision transformers peft accelerate
pip install huggingface_hub
pip install git+https://github.com/moojink/dlimp_openvla
```
### Loading the Model

```python
import torch

from prismatic.models.load import load_vla

# Load the base model
model = load_vla(
    "openvla/openvla-7b",
    load_for_training=False,
)

# Load the fine-tuned checkpoint
checkpoint = torch.load("checkpoints/step-003000-loss=0.9050.pt", map_location="cpu")
if "model_state_dict" in checkpoint:
    model.load_state_dict(checkpoint["model_state_dict"], strict=False)

# Set to evaluation mode
model.eval()
```
### Inference

```python
import torch
from PIL import Image

# Load image and instruction
image = Image.open("path/to/image.jpg")
instruction = "turn left"

# Predict action
with torch.inference_mode():
    action = model.predict_action(
        image=image,
        instruction=instruction,
        unnorm_key="lampe_dataset_combined",
        do_sample=False,
    )

print(f"Predicted action: {action}")
# Output: [Base, Joint2, Joint3, Joint4]
```
## Fine-tuning Scripts

This repository includes the fine-tuning script (`finetune.py`) used to train this model.
### Running Fine-tuning

```bash
python finetune.py \
  --vla_path "openvla/openvla-7b" \
  --data_root_dir "/path/to/rlds_dataset" \
  --dataset_name "lampe_dataset_combined" \
  --dataset_statistics_path "dataset_statistics.json" \
  --batch_size 8 \
  --max_steps 3000 \
  --learning_rate 5e-4 \
  --lora_rank 32 \
  --lora_dropout 0.0 \
  --wandb_mode "offline"
```
Or use the provided shell script:

```bash
bash finetune_lampe_combined.sh
```
## Validation

Use the `validate.py` script to validate the model on sample data:

```bash
python validate.py
```

Update the paths in `validate.py` first:

- `CHECKPOINT_DIR`: Path to the checkpoint directory
- `SAMPLE_PATH`: Path to the sample data directory
## Dataset Statistics

The `dataset_statistics.json` file contains normalization statistics for the LAMPE dataset:

```json
{
  "lampe_dataset_combined": {
    "action": {
      "mean": [...],
      "std": [...],
      "min": [...],
      "max": [...],
      "q01": [...],
      "q99": [...]
    },
    "proprio": {...},
    "num_transitions": 6289,
    "num_trajectories": 80
  }
}
```
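At inference time, `unnorm_key="lampe_dataset_combined"` selects these statistics to map the model's normalized outputs back to robot action units. A minimal sketch of bounds-style unnormalization using the `q01`/`q99` percentiles; the helper name and the dummy statistics below are illustrative, not values from this repository:

```python
def unnormalize(action_norm, q01, q99):
    # Map each normalized action dimension from [-1, 1] back to the
    # dataset's action range using the 1st/99th percentile bounds.
    return [0.5 * (a + 1.0) * (hi - lo) + lo
            for a, lo, hi in zip(action_norm, q01, q99)]

# Dummy 4-DoF bounds (Base, Joint2, Joint3, Joint4) for illustration
q01 = [-1.0, -0.5, -0.5, -0.5]
q99 = [1.0, 0.5, 0.5, 0.5]
print(unnormalize([0.0, 1.0, -1.0, 0.5], q01, q99))  # [0.0, 0.5, -0.5, 0.25]
```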
## Model Architecture

- Vision Backbone: DinoSigLIP (ViT-SO400M-14-SigLIP)
- LLM Backbone: Qwen2.5-0.5B with extra tokens
- Action Tokenizer: Standard ActionTokenizer (256 bins, 4-DoF)
- Image Resolution: 224x224
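The action tokenizer maps each of the 4 continuous action dimensions to one of 256 discrete bins. A rough sketch of uniform binning over the normalized range, assuming nearest-bin rounding; the actual `ActionTokenizer` implementation may differ in its rounding and bin-to-token mapping:

```python
def tokenize_action(action, n_bins=256):
    # Map each continuous action dimension in [-1, 1] to a discrete
    # bin index in [0, n_bins - 1].
    tokens = []
    for a in action:
        a = max(-1.0, min(1.0, a))                       # clip to normalized range
        idx = int((a + 1.0) / 2.0 * (n_bins - 1) + 0.5)  # round to nearest bin
        tokens.append(min(idx, n_bins - 1))
    return tokens

print(tokenize_action([-1.0, 0.0, 1.0]))  # [0, 128, 255]
```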
## Key Features

- 4-DoF Action Space: Supports Base, Joint2, Joint3, and Joint4 control
- LoRA Fine-tuning: Parameter-efficient fine-tuning with only 1.39% trainable parameters
- Custom RLDS Dataset Support: Handles custom RLDS datasets not in OXE
- FlashAttention2 Disabled: Optimized for 0.5B model without FlashAttention2
- VQ Tokenizer Replacement: Automatically replaces VQ tokenizers with standard ones
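The parameter efficiency of LoRA comes from adding two low-rank matrices beside each frozen weight instead of updating the weight itself. A back-of-the-envelope sketch with illustrative layer sizes (896 matches the Qwen2.5-0.5B hidden dimension, but which layers are adapted in this fine-tune is an assumption, so this does not reproduce the 1.39% figure exactly):

```python
def lora_param_count(d_in, d_out, rank):
    # LoRA adds A (rank x d_in) and B (d_out x rank) beside a frozen
    # d_out x d_in weight, so the extra parameters are rank * (d_in + d_out).
    return rank * (d_in + d_out)

base = 896 * 896                          # one square projection, illustrative
added = lora_param_count(896, 896, 32)    # rank 32, as in this fine-tune
print(added)  # 57344 extra parameters for this one layer
```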
## Files Included

- `finetune.py`: Fine-tuning script with custom dataset support
- `validate.py`: Validation script for model evaluation
- `finetune_lampe_combined.sh`: Shell script for easy fine-tuning
- `dataset_statistics.json`: Dataset normalization statistics
- `checkpoints/`: Model checkpoints from training
- `adapter-weights/`: LoRA adapter weights (if available)
## Citation

If you use this model, please cite:

```bibtex
@misc{minivla-lampe-4dof,
  title={MiniVLA Fine-tuned on LAMPE 4-DoF Dataset},
  author={Your Name},
  year={2025},
  url={https://huggingface.co/kavinrajkrupsurge/openvla-lampe-4dof-finetuned}
}
```
## License

This model follows the license of the base model openvla/openvla-7b.
## Contact

For questions or issues, please open an issue on the HuggingFace repository.