# Qwen3-VL-2B-Instruct-GPTQ-Int4

GPTQ-Int4 quantized version of [Qwen/Qwen3-VL-2B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-2B-Instruct).
## Quantization Details
- Method: GPTQ 4-bit with group_size=128
- Tool: GPTQModel 6.0.3
- Calibration: 256 samples with random images
- Base model: Qwen/Qwen3-VL-2B-Instruct
- Model size: ~2.17 GB (vs ~4.26 GB unquantized)
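The size reduction can be sanity-checked with a back-of-envelope estimate. The sketch below assumes one fp16 scale and one fp16 zero-point per group of 128 weights, and an illustrative 2B quantized parameters; these are assumptions for illustration, not measured values from this checkpoint:

```python
# Back-of-envelope size estimate for GPTQ 4-bit with group_size=128.
# Parameter counts here are illustrative assumptions, not measured values.

def gptq_bits_per_weight(bits: int = 4, group_size: int = 128) -> float:
    """Effective bits per weight: packed 4-bit weights plus the
    per-group fp16 scale and zero-point shared by group_size weights."""
    overhead = 2 * 16 / group_size  # scale + zero-point, 16 bits each
    return bits + overhead

def estimate_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate storage in GB for n_params at the given bit width."""
    return n_params * bits_per_weight / 8 / 1e9

bpw = gptq_bits_per_weight()  # 4.25 effective bits per weight
print(f"{bpw} bits per weight")
print(f"~{estimate_gb(2.0e9, bpw):.2f} GB for 2B quantized params")
```

The quantized linear layers shrink roughly 3.8x versus fp16 (16 / 4.25 bits); the full ~2.17 GB checkpoint is larger than this estimate because components such as the embeddings and vision tower are typically kept in higher precision.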
## Usage

```python
from transformers import AutoModelForImageTextToText, AutoProcessor
from PIL import Image
import requests

model_id = "h2oai/Qwen3-VL-2B-Instruct-GPTQ-Int4"

model = AutoModelForImageTextToText.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id)

image_url = "https://example.com/image.png"
messages = [
    {"role": "user", "content": [
        {"type": "image", "image": image_url},
        {"type": "text", "text": "Describe this image."},
    ]}
]

# Load the image referenced in the message above
image = Image.open(requests.get(image_url, stream=True).raw)

text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], images=[image], return_tensors="pt", padding=True).to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(output, skip_special_tokens=True))
```
## License
Same as the base model: Apache-2.0