Quantized EXAONE4
Collection: Quantized checkpoints for the EXAONE4 series (5 items)
3-bit weights, 16-bit activations (W3A16): mse=2.0, group_size=32, SmoothMSE(64, 0.70)
Expected: 90-93% of the base model's quality at 3.0-4.0x the speed
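To make the 3-bit, group_size=32 setting concrete, here is a rough sketch of the average storage cost per quantized weight. It assumes a common GPTQ-style layout in which each group of weights shares one 16-bit scale and one zero point stored at the weight bit width. That layout is an assumption, not something this card states, and the helper names are hypothetical.

```python
def effective_bits_per_weight(bits: int, group_size: int, scale_bits: int = 16) -> float:
    """Average storage per weight for group-wise quantization.

    Assumption: every group of `group_size` weights shares one
    `scale_bits`-bit scale and one `bits`-bit zero point.
    """
    per_group_overhead = scale_bits + bits  # metadata bits stored once per group
    return bits + per_group_overhead / group_size


def compression_vs_fp16(bits: int, group_size: int) -> float:
    """How many times smaller the quantized weights are than fp16 weights."""
    return 16.0 / effective_bits_per_weight(bits, group_size)


if __name__ == "__main__":
    bpw = effective_bits_per_weight(bits=3, group_size=32)
    ratio = compression_vs_fp16(bits=3, group_size=32)
    print(f"{bpw:.3f} bits/weight, {ratio:.2f}x smaller than fp16")
```

This counts only the quantized linear-layer weights. Embeddings and any layers kept in 16-bit are not included, so the whole checkpoint shrinks by less than this ratio. The speed figure quoted above is a separate estimate and does not follow from this calculation.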
# Load the quantized checkpoint with GPTQModel on the first CUDA device
from gptqmodel import GPTQModel
model = GPTQModel.from_quantized("namgyu-youn/EXAONE-4.0-1.2B-GPTQ-W3A16", device="cuda:0")
# Alternatively, load the same checkpoint with vLLM, computing in float16
from vllm import LLM
llm = LLM(model="namgyu-youn/EXAONE-4.0-1.2B-GPTQ-W3A16", dtype="float16")
Base model
LGAI-EXAONE/EXAONE-4.0-1.2B