Massive MoE models (≥100B) quantized with HLWQ · consumer deployment via vLLM expert offload
- caiovicentino1/MiniMax-M2.7-HLWQ-Q5 · Text Generation
- caiovicentino1/Qwopus-MoE-35B-A3B-HLWQ-Q5 · Text Generation · 35B
- caiovicentino1/Nemotron-Cascade-2-30B-A3B-HLWQ-Q5 · Text Generation · 20B
- caiovicentino1/Gemopus-4-26B-A4B-it-HLWQ-Q5 · Image-Text-to-Text
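As a rough sketch of the deployment path named in the collection title, the command below serves one of the listed checkpoints with vLLM while spilling part of the weights (in an MoE model, mostly the expert layers) to CPU RAM. This is an illustrative config fragment, not a tested recipe: the offload budget and context length are assumptions, and it presumes the HLWQ-Q5 checkpoints load through vLLM's standard weight-loading path.

```shell
# Illustrative sketch, not a verified configuration.
# --cpu-offload-gb is a real vLLM flag that keeps a portion of the model
# weights in system RAM instead of VRAM; 24 GB and the 8192-token context
# window here are placeholder values for a typical consumer GPU setup.
vllm serve caiovicentino1/Qwopus-MoE-35B-A3B-HLWQ-Q5 \
  --cpu-offload-gb 24 \
  --max-model-len 8192
```

With a smaller GPU, raising `--cpu-offload-gb` trades throughput for fitting the model at all; the right value depends on how much VRAM remains after the KV cache is allocated.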