Qwen3-VL-8B-Instruct-heretic-gguf

Overview

This repository provides GGUF quantized builds of Qwen3-VL-8B-Instruct-heretic for use with llama.cpp.

This model is a decensored derivative of the official Qwen/Qwen3-VL-8B-Instruct,
modified using Heretic v1.1.0.

Quantization Details

  • Backend: llama.cpp
  • Build: 7537 (commit e68c19b0f)
  • Method: Q4_K_M for weights, FP16 for multimodal adapter
  • Imatrix optimization: enabled (custom calibration dataset)
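
Usage

A minimal way to run a build like this with llama.cpp's multimodal CLI is sketched below. The GGUF file names are assumptions for illustration, not the repository's actual artifact names; substitute the files you downloaded. The FP16 `mmproj` file corresponds to the multimodal adapter noted above.

```shell
# Sketch: run the Q4_K_M build with llama.cpp's multimodal CLI (llama-mtmd-cli).
# File names below are placeholders -- use the actual GGUF artifacts from this repo.
llama-mtmd-cli \
  -m Qwen3-VL-8B-Instruct-heretic-Q4_K_M.gguf \
  --mmproj mmproj-Qwen3-VL-8B-Instruct-heretic-F16.gguf \
  --image photo.jpg \
  -p "Describe this image."
```

The same pair of files can also be served with `llama-server` by passing `-m` and `--mmproj` with the corresponding paths.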