OSCaR: Object State Captioning and State Change Representation
Paper • 2402.17128 • Published
This repository contains the projector artifact staged for the OSCaR public release.
Artifact: llava-v1.5-7b-pretrain-projector
Repository: ali-vosoughi/oscar-llava-v1.5-7b-projector
Files: config.json, mm_projector.bin
This is a projector-only release. It is intended for the pretraining and fine-tuning workflow documented in the OSCaR code repository.
Example:
bash scripts/train/pretrain_v1_5_13b_projector.sh
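Before pointing a training script at the projector, it can help to confirm that both released files are present locally. The following is a minimal sketch, not part of the OSCaR repository: the function name `check_projector_dir` and the `checkpoints/oscar-projector` path are hypothetical, and the commented download command assumes the `huggingface-cli` tool from `huggingface_hub` is installed and that the repository ID listed for this release is available on the Hub.

```shell
# Hypothetical helper (not from the OSCaR repo): verify that a local directory
# contains both files of this projector-only release.
check_projector_dir() {
  dir="$1"
  for f in config.json mm_projector.bin; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      return 1
    fi
  done
  echo "ok: $dir"
}

# Download step (needs network access; the target path is an assumption).
# Run manually, then check the result:
#   huggingface-cli download ali-vosoughi/oscar-llava-v1.5-7b-projector \
#     config.json mm_projector.bin --local-dir checkpoints/oscar-projector
#   check_projector_dir checkpoints/oscar-projector
```

The check only tests for file presence; it does not validate the tensor contents of `mm_projector.bin` or its compatibility with a particular language model size.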
1282562e-42048
Code: https://github.com/nguyennm1024/OSCaR
Dataset: https://huggingface.co/datasets/ali-vosoughi/oscar-dataset
Paper: https://arxiv.org/abs/2402.17128
Base model
openai/clip-vit-large-patch14-336