Vittle-7B (F) weights

Vittle (F) is the fixed-prior variant of Vittle (NeurIPS 2025), a VLM instruction-tuning framework that improves robustness to distribution shifts via a variational information bottleneck.

Usage

```python
import torch
from transformers import AutoTokenizer

from vittle.model.language_model.vittle_llama import VittleLlamaForCausalLM

model = VittleLlamaForCausalLM.from_pretrained(
    "changdae/vittle-7b-F",
    torch_dtype=torch.bfloat16,
    device_map="cuda:0",
)
tokenizer = AutoTokenizer.from_pretrained("changdae/vittle-7b-F", use_fast=False)
```

Refer to the evaluation guide for full inference instructions.

Model Details

| Property | Value |
|---|---|
| Base model | lmsys/vicuna-7b-v1.5 |
| Vision encoder | openai/clip-vit-large-patch14-336 |
| Bottleneck layer | 24 |
| Interpolation coefficient (alpha) | 0.5 |
| KLD strength (beta) | 0.1 |
| Learnable prior | No (fixed) |
| Training dtype | bfloat16 |
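Because the prior is fixed, the bottleneck's KL penalty is taken against a standard normal rather than a learned distribution. A minimal sketch of the closed-form per-dimension KL term and the role of beta, in plain Python with hypothetical helper names (the actual Vittle implementation may compute this differently):

```python
import math

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, sigma^2) || N(0, 1) ), summed over
    the dimensions given as parallel lists of mu and log-variance."""
    return sum(
        0.5 * (math.exp(lv) + m * m - 1.0 - lv)
        for m, lv in zip(mu, log_var)
    )

# If the posterior collapses to the fixed prior (mu = 0, log_var = 0),
# the KL term vanishes and the penalty is zero.
beta = 0.1  # KLD strength from the table above
penalty = beta * kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])
```

During training this penalty would be added to the instruction-tuning loss, with beta = 0.1 trading off compression against task performance.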

Citation

```bibtex
@inproceedings{oh2025visual,
  title={Visual Instruction Bottleneck Tuning},
  author={Changdae Oh and Jiatong Li and Shawn Im and Sharon Li},
  booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
  year={2025},
  url={https://openreview.net/forum?id=yzHiEmLSk8}
}
```