
Some SPECIAL quants of xande-p/Qwen3-19B-Instruct-REAP-ptbr.

See the GGUFs here.

Quantizations:

  • i_: imatrix built from English + Portuguese + naval text.
  • _offloadX: the X layers in the center of the model are quantized further, using quant types that are fast on CPU. See below.
  • _oldgpu: also avoids quant types that are slow on old cards such as GFX906 (MI50), preferring Q4_0/Q4_1 quants.
  • No i/offload/oldgpu suffix: standard llama.cpp quant with no imatrix.
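As a rough sketch, an imatrix quant of this kind can be produced with llama.cpp's own tools. The file names and the calibration text below are assumptions for illustration, not the author's actual recipe:

```shell
# Build an importance matrix from mixed English/Portuguese/naval
# calibration text (calibration_en_pt_naval.txt is a hypothetical file).
llama-imatrix -m model-f16.gguf -f calibration_en_pt_naval.txt -o model.imatrix

# Quantize using that imatrix; Q4_K_M is just an example target type.
llama-quantize --imatrix model.imatrix model-f16.gguf model-i_Q4_K_M.gguf Q4_K_M
```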

_offloadX: If the model won't fit in your GPU, you can offload some or all of the center experts to the CPU with llama.cpp's -ot (--override-tensor) flag. Because those layers are quantized with Q2_K (and some Q3_K), they stay fast on CPU.
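A minimal sketch of such an offload with llama.cpp, assuming the "center" covers blocks 12–23 (the exact range depends on the quant, and the file name is illustrative):

```shell
# Keep all layers on GPU (-ngl 99) except the expert (MoE) tensors of
# the assumed center blocks 12-23, which the regex routes to CPU buffers.
llama-server -m Qwen3-19B-Instruct-REAP-ptbr-i_Q4_K_M_offload12.gguf \
  -ngl 99 \
  -ot 'blk\.(1[2-9]|2[0-3])\.ffn_.*_exps\.=CPU'
```

Offloading only the expert tensors works well for MoE models because each token activates only a few experts, so the CPU-side memory traffic stays modest.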

Model details:

  • Format: GGUF
  • Model size: 20B params
  • Architecture: qwen3moe
  • Quantization levels: 3-bit, 4-bit, 6-bit