These are quantizations of the model ZwZ-4B, made using an imatrix created from text_en_medium.

Usage Notes:

Download the latest llama.cpp to use these quantizations.
Use the highest-quality quantization that your hardware can run.
For the mmproj file, the F32 version is recommended for best results (F32 > BF16 > F16).
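
As a rough illustration of the notes above, the sketch below picks the mmproj file by the recommended precision order (F32 > BF16 > F16) and assembles a llama.cpp command line. The file names are hypothetical placeholders, not the actual names in this repository. The `llama-mtmd-cli` binary and its `-m`, `--mmproj`, `--image`, and `-p` flags are assumed to match a recent llama.cpp build, so check `--help` on your build before relying on them.

```python
from typing import List, Optional, Sequence

# Preference order taken from the usage notes: F32 > BF16 > F16.
MMPROJ_PREFERENCE = ("F32", "BF16", "F16")


def pick_mmproj(filenames: Sequence[str]) -> Optional[str]:
    """Return the highest-precision mmproj file in `filenames`, or None."""
    candidates = [f for f in filenames if "mmproj" in f.lower()]
    for precision in MMPROJ_PREFERENCE:
        for name in candidates:
            # Split into tokens so that "F16" does not match inside "BF16".
            tokens = name.upper().replace(".", "-").replace("_", "-").split("-")
            if precision in tokens:
                return name
    return None


def build_command(model: str, mmproj: str, image: str, prompt: str) -> List[str]:
    """Assemble an argument list for llama.cpp's multimodal CLI (assumed flags)."""
    return [
        "llama-mtmd-cli",  # assumed binary name from recent llama.cpp builds
        "-m", model,
        "--mmproj", mmproj,
        "--image", image,
        "-p", prompt,
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    files = ["ZwZ-4B-Q4_K_M.gguf", "mmproj-F16.gguf", "mmproj-BF16.gguf"]
    mmproj = pick_mmproj(files)
    if mmproj is not None:
        cmd = build_command("ZwZ-4B-Q4_K_M.gguf", mmproj, "photo.png", "Describe this image.")
        print(" ".join(cmd))
```

The argument list can be passed to `subprocess.run` once the files are downloaded and the binary is on your PATH.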

Model Details:

Format: GGUF
Model size: 4B params
Architecture: qwen3vl
Available bit widths: 4-bit, 5-bit, 6-bit, 8-bit, 16-bit
Repository: noctrex/ZwZ-4B-GGUF