OSCaR LLaVA v1.5 13B Mixed Adapter

This repository contains the adapter artifact staged for the OSCaR public release.

Artifact Type

  • Local staging directory: llava-v1.5-13b-lora-mixed-adapter
  • Public repo id: ali-vosoughi/oscar-llava-v1.5-13b-mixed-adapter
  • Training data condition: OSCaR plus the upstream LLaVA v1.5 mixed visual-instruction manifest (llava_final.json)

Files

  • adapter_model.bin
  • adapter_config.json
  • config.json
  • non_lora_trainables.bin

Loading

This is an adapter-only release: the LoRA weights are not merged into the base model, so pass --model-base pointing at the matching Vicuna checkpoint when loading it.

Example:

python -m llava.serve.cli --model-path ali-vosoughi/oscar-llava-v1.5-13b-mixed-adapter --model-base lmsys/vicuna-13b-v1.5 --image-file /path/to/image.jpg
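The same invocation can be assembled programmatically, which makes the pairing of --model-path and --model-base explicit. A minimal sketch (the repo ids come from this card; the image path is a placeholder you supply):

```python
import sys

def build_cli_args(image_file: str) -> list[str]:
    """Assemble the llava.serve.cli command shown above."""
    return [
        sys.executable, "-m", "llava.serve.cli",
        "--model-path", "ali-vosoughi/oscar-llava-v1.5-13b-mixed-adapter",
        "--model-base", "lmsys/vicuna-13b-v1.5",  # matching Vicuna base
        "--image-file", image_file,
    ]

args = build_cli_args("/path/to/image.jpg")
```

To actually launch the chat CLI, hand the list to subprocess.run(args, check=True) in an environment with the LLaVA repo installed.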

Training Configuration

  • LLaVA v1.5 stack
  • CLIP ViT-L/14 vision tower at 336 px input resolution
  • LoRA rank 128
  • LoRA alpha 256
  • learning rate 2e-4
  • 1 epoch
  • max sequence length 2048
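As a quick sanity check on the LoRA settings above: in standard LoRA, the low-rank update is scaled by alpha / r, so rank 128 with alpha 256 applies a scaling factor of 2.0 to the adapter contribution.

```python
# LoRA hyperparameters as listed in the training configuration above.
lora_r = 128
lora_alpha = 256

# Standard LoRA scaling: the low-rank update BA is multiplied by alpha / r.
scaling = lora_alpha / lora_r
print(scaling)  # 2.0
```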

Related Resources

  • Code: https://github.com/nguyennm1024/OSCaR
  • Dataset: https://huggingface.co/datasets/ali-vosoughi/oscar-dataset
  • Paper: https://arxiv.org/abs/2402.17128