# Llama-3.2-1B-4B-Quad-MoE-Q4_K_M

Q4_K_M and f16 quantized GGUF versions of Fu01978/Llama-3.2-1B-4B-Quad-MoE.
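As a minimal sketch, the 4-bit file can be fetched with the `huggingface_hub` library. The filename below is an assumption for illustration; check the repository's file listing for the exact GGUF name.

```python
from huggingface_hub import hf_hub_download

# Download the Q4_K_M GGUF; the filename here is an assumption --
# verify it against the repository's file listing.
model_path = hf_hub_download(
    repo_id="Fu01978/Llama-3.2-1B-4B-Quad-MoE-Q4_K_M",
    filename="Llama-3.2-1B-4B-Quad-MoE-Q4_K_M.gguf",
)
print(model_path)  # local cache path to the downloaded file
```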

## Recommended Settings

- Temperature: 0.1 for math/logic, 0.8 for creative tasks
- Repeat penalty: 1.15
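
A minimal inference sketch applying these settings with llama-cpp-python; the model path is illustrative and assumes the file downloaded above.

```python
from llama_cpp import Llama

# Load the local GGUF file (path is illustrative).
llm = Llama(model_path="Llama-3.2-1B-4B-Quad-MoE-Q4_K_M.gguf")

# Low temperature for a math/logic prompt; raise it to ~0.8
# for creative generation, per the settings above.
result = llm(
    "Q: What is 17 * 24? A:",
    temperature=0.1,
    repeat_penalty=1.15,
    max_tokens=64,
)
print(result["choices"][0]["text"])
```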
## Model Details

- Format: GGUF
- Model size: 4B params
- Architecture: llama

Both a 4-bit (Q4_K_M) file and a 16-bit (f16) file are provided.

