# Llama-3.2-1B-4B-Quad-MoE-Q4_K_M

Q4_K_M and f16 quantized GGUF versions of Fu01978/Llama-3.2-1B-4B-Quad-MoE.
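As a minimal sketch, the 4-bit file can be fetched with the `huggingface_hub` library. The filename below is an assumption for illustration; check the repository's file listing for the exact GGUF name.

```python
from huggingface_hub import hf_hub_download

# Download the Q4_K_M GGUF; the filename here is an assumption --
# verify it against the repository's file listing.
model_path = hf_hub_download(
    repo_id="Fu01978/Llama-3.2-1B-4B-Quad-MoE-Q4_K_M",
    filename="Llama-3.2-1B-4B-Quad-MoE-Q4_K_M.gguf",
)
print(model_path)  # local cache path to the downloaded file
```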

## Recommended Settings

- Temperature: 0.1 for math/logic, 0.8 for creative tasks
- Repeat penalty: 1.15
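
A minimal inference sketch applying these settings with llama-cpp-python; the model path is illustrative and assumes the file downloaded above.

```python
from llama_cpp import Llama

# Load the local GGUF file (path is illustrative).
llm = Llama(model_path="Llama-3.2-1B-4B-Quad-MoE-Q4_K_M.gguf")

# Low temperature for a math/logic prompt; raise it to ~0.8
# for creative generation, per the settings above.
result = llm(
    "Q: What is 17 * 24? A:",
    temperature=0.1,
    repeat_penalty=1.15,
    max_tokens=64,
)
print(result["choices"][0]["text"])
```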
## Model Details

- Format: GGUF
- Model size: 4B params
- Architecture: llama

Both a 4-bit (Q4_K_M) file and a 16-bit (f16) file are provided.

