
Some SPECIAL quants of xande-p/Qwen3-19B-Instruct-REAP-ptbr.

See the GGUFs here.

Quantizations:

  • i_: imatrix built from English + Portuguese + naval text.
  • _offloadX: the X layers in the center of the model are quantized further, using quant types that are fast on CPU. See below.
  • _oldgpu: also avoids quant types that are slow on old cards such as GFX906 (MI50), preferring Q4_0/Q4_1 quants.
  • No i/offload/oldgpu suffix: standard llama.cpp quant with no imatrix.
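As a rough sketch, an imatrix quant of this kind can be produced with llama.cpp's own tools. The file names and the calibration text below are assumptions for illustration, not the author's actual recipe:

```shell
# Build an importance matrix from mixed English/Portuguese/naval
# calibration text (calibration_en_pt_naval.txt is a hypothetical file).
llama-imatrix -m model-f16.gguf -f calibration_en_pt_naval.txt -o model.imatrix

# Quantize using that imatrix; Q4_K_M is just an example target type.
llama-quantize --imatrix model.imatrix model-f16.gguf model-i_Q4_K_M.gguf Q4_K_M
```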

_offloadX: If the model won't fit in your GPU, you can offload some or all of the center experts to the CPU with llama.cpp's -ot (--override-tensor) flag. Because those layers are quantized with Q2_K (and some Q3_K), they stay fast on CPU.
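A minimal sketch of such an offload with llama.cpp, assuming the "center" covers blocks 12–23 (the exact range depends on the quant, and the file name is illustrative):

```shell
# Keep all layers on GPU (-ngl 99) except the expert (MoE) tensors of
# the assumed center blocks 12-23, which the regex routes to CPU buffers.
llama-server -m Qwen3-19B-Instruct-REAP-ptbr-i_Q4_K_M_offload12.gguf \
  -ngl 99 \
  -ot 'blk\.(1[2-9]|2[0-3])\.ffn_.*_exps\.=CPU'
```

Offloading only the expert tensors works well for MoE models because each token activates only a few experts, so the CPU-side memory traffic stays modest.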

Model details:

  • Format: GGUF
  • Model size: 20B params
  • Architecture: qwen3moe
  • Quantization levels: 3-bit, 4-bit, 6-bit