Q4F Q8A: Q4_K FFN tensors; Q8_0 attention, output, and embedding tensors
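The weight footprint of this mix can be estimated with simple arithmetic. A minimal sketch, assuming the published Qwen3-32B config values (hidden 5120, intermediate 25600, 64 layers, 64 query / 8 KV heads with head_dim 128, vocab 151936, untied embeddings) and the nominal GGUF bit rates (Q4_K ≈ 4.5 bpw, Q8_0 = 8.5 bpw); treat it as an estimate, not the exact file size:

```python
# Rough weight size of the Q4F Q8A mix. All model dimensions below are
# assumptions taken from the published Qwen3-32B config, not from this file.
HIDDEN, INTER, LAYERS = 5120, 25600, 64
HEADS, KV_HEADS, HEAD_DIM = 64, 8, 128
VOCAB = 151_936

ffn = LAYERS * 3 * HIDDEN * INTER                      # gate, up, down projections
attn = LAYERS * (2 * HIDDEN * HEADS * HEAD_DIM         # Q and O projections
                 + 2 * HIDDEN * KV_HEADS * HEAD_DIM)   # K and V projections (GQA)
embeds = 2 * VOCAB * HIDDEN                            # token embeddings + output head

Q4_K_BPW, Q8_0_BPW = 4.5, 8.5                          # bits per weight
gib = (ffn * Q4_K_BPW + (attn + embeds) * Q8_0_BPW) / 8 / 2**30
print(f"~{gib:.1f} GiB of weights")                    # roughly 20.7 GiB
```

At roughly 20.7 GiB of weights, the mix leaves a few GiB of a 24 GiB card free for the KV cache and scratch buffers.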

Fits ≥24K tokens of Q8_0-quantized KV-cache context on a 24 GiB GPU
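The context claim can be sanity-checked the same way. A minimal sketch, assuming the published Qwen3-32B attention dimensions (64 layers, 8 KV heads, head_dim 128) and Q8_0's layout of 34 bytes per 32 values (32 int8 quants plus one fp16 scale, i.e. 8.5 bits per value):

```python
# Back-of-envelope Q8_0 KV-cache size for 24K tokens of context.
# Assumed dims (from the published Qwen3-32B config): 64 layers,
# 8 KV heads (GQA), head_dim 128.
N_LAYERS, N_KV_HEADS, HEAD_DIM = 64, 8, 128
Q8_0_BYTES_PER_VALUE = 34 / 32  # 8.5 bits per value

def kv_cache_gib(n_tokens: int) -> float:
    """Size of the K+V cache in GiB for n_tokens of context."""
    values_per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM  # K and V
    return n_tokens * values_per_token * Q8_0_BYTES_PER_VALUE / 2**30

print(f"{kv_cache_gib(24 * 1024):.2f} GiB")  # prints "3.19 GiB"
```

About 3.2 GiB of cache on top of the quantized weights lands just under the 24 GiB budget, which is where the ≥24K figure comes from.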

Format: GGUF
Model size: 33B params
Architecture: qwen3

Model tree for Beinsezii/Qwen3-32B-Q4F-Q8A-GGUF

Base model: Qwen/Qwen3-32B (this model is a quantized variant)