# EXAONE-4.0-1.2B GPTQ W3 + EoRA

Quantization recipe: 3-bit (W3A16) GPTQ with mse=2.0, group_size=32, SmoothMSE(64, 0.70), and EoRA (rank=96).
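Conceptually, EoRA compensates the error left by low-bit quantization with a low-rank correction term added back at inference. A minimal NumPy sketch of the idea, with two loud caveats: the quantizer below is a naive round-to-nearest stand-in (not GPTQ), and the correction uses a plain SVD of the residual, whereas EoRA proper projects onto an eigenspace derived from calibration activations:

```python
import numpy as np

def quantize_w3(w, group_size=32):
    """Toy symmetric 3-bit per-group quantization (NOT GPTQ)."""
    qmax = 3  # symmetric 3-bit levels in [-3, 3]
    out = np.empty_like(w)
    for i in range(0, w.shape[1], group_size):
        g = w[:, i:i + group_size]
        scale = np.abs(g).max(axis=1, keepdims=True) / qmax
        scale[scale == 0] = 1.0
        out[:, i:i + group_size] = np.round(g / scale).clip(-qmax, qmax) * scale
    return out

def lowrank_correction(w, w_q, rank=96):
    """Rank-r approximation of the quantization residual w - w_q."""
    u, s, vt = np.linalg.svd(w - w_q, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)
w_q = quantize_w3(w)                                # 3-bit weights alone
w_c = w_q + lowrank_correction(w, w_q, rank=96)     # + rank-96 correction
err_q = np.linalg.norm(w - w_q)
err_c = np.linalg.norm(w - w_c)
print(err_q, err_c)  # the corrected reconstruction has lower error
```

The rank-96 adapter recovers the dominant directions of the residual, which is why the corrected weights track the FP16 originals more closely than the 3-bit weights alone.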

Expected: 94–96% quality retention × 3.5–5.0× compression = 3.29–4.80 quality-compression score.
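The 3.5–5.0× compression range follows from simple storage arithmetic: 3-bit weights plus one scale per group of 32. A back-of-envelope check (assuming one FP16 scale per group and ignoring zero-points, packing overhead, and the EoRA adapter weights):

```python
# Per group of 32 weights: 32 * 3 bits of quantized values + 16 bits of scale.
weight_bits = 3
group_size = 32
scale_bits = 16  # assumed FP16 scale per group

bits_per_weight = weight_bits + scale_bits / group_size   # 3.5 bits/weight
compression_vs_fp16 = 16 / bits_per_weight                # ~4.57x vs FP16

print(bits_per_weight, compression_vs_fp16)
```

Adding zero-points or the rank-96 EoRA adapters pushes the effective ratio toward the lower end of the quoted range.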

## Usage

### GPTQModel

```python
from gptqmodel import GPTQModel

model = GPTQModel.from_quantized(
    "namgyu-youn/EXAONE-4.0-1.2B-GPTQ-W3A16-EoRA",
    device="cuda:0",
)
```

### vLLM

```python
from vllm import LLM

llm = LLM(model="namgyu-youn/EXAONE-4.0-1.2B-GPTQ-W3A16-EoRA", dtype="float16")
```
## Evaluation

| Tasks | Version | Filter           | n-shot | Metric      | Value  | Stderr   |
|-------|---------|------------------|--------|-------------|--------|----------|
| gsm8k | 3       | flexible-extract | 5      | exact_match | 0.6621 | ± 0.0209 |
|       |         | strict-match     | 5      | exact_match | 0.6562 | ± 0.0210 |