LaViDa: A Large Diffusion Language Model for Multimodal Understanding
Paper: arXiv:2505.16839
[GitHub] [Paper] [ArXiv] [Checkpoints] [Data] [Website]
This is a transformers-compatible version of the LaViDa-LLaDa checkpoint. It can be loaded directly through the Hugging Face `transformers` API for easier inference and integration.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the tokenizer and model from a local checkpoint directory
tokenizer = AutoTokenizer.from_pretrained('./lavida-llada-v1.0-instruct/')
model = AutoModelForCausalLM.from_pretrained(
    './lavida-llada-v1.0-instruct/',
    torch_dtype=torch.bfloat16,
)

# Get the image processor from the model's vision tower
image_processor = model.get_vision_tower().image_processor

# Match the embedding table to the tokenizer's vocabulary and tie weights
model.resize_token_embeddings(len(tokenizer))
model.tie_weights()
```
License: Apache 2.0