This is an MXFP4_MOE quantization of the model LFM2-24B-A2B.
Recommended sampling settings:
--temp 0.1
--top-k 50
--repeat-penalty 1.05
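With llama.cpp's llama-cli, the settings above can be passed like this (a sketch; the GGUF filename is a placeholder for whichever quant file you downloaded):

```shell
# Run the quantized model with the recommended sampler settings.
# The model filename below is a placeholder.
./llama-cli -m LFM2-24B-A2B-MXFP4_MOE.gguf \
  --temp 0.1 \
  --top-k 50 \
  --repeat-penalty 1.05 \
  -p "Hello"
```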
The mainline llama.cpp standard is to use MXFP4 for the MoE tensors and Q8 for the rest.
So I created a new variant where the other tensors are BF16 instead of Q8.
On some architectures BF16 will be slower, but it's the highest quality: essentially, those are the original tensors from the model, copied over unquantized.
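For reference, the standard variant can be produced with llama.cpp's llama-quantize tool (a sketch; filenames are placeholders, and the MXFP4_MOE type name assumes a recent mainline build):

```shell
# Sketch: quantize a BF16 GGUF so the MoE expert tensors become MXFP4,
# with the remaining tensors at Q8_0 (the MXFP4_MOE ftype's default split).
# Filenames are placeholders; MXFP4_MOE requires a recent llama.cpp build.
./llama-quantize LFM2-24B-A2B-BF16.gguf LFM2-24B-A2B-MXFP4_MOE.gguf MXFP4_MOE
```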
Model tree for noctrex/LFM2-24B-A2B-MXFP4_MOE-GGUF
Base model: LiquidAI/LFM2-24B-A2B