# EXAONE-4.0-1.2B QQQ W4

A 4-bit QQQ quantization of LGAI-EXAONE/EXAONE-4.0-1.2B.

Config: QQQ format | 4-bit weights | group size = 128

W4A8: 4-bit weights and 8-bit activations, served by a Marlin-based kernel.
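
For context, checkpoints like this one are typically produced with GPTQModel's quantize API. The sketch below is illustrative, not the author's actual recipe: the calibration text and output path are placeholders, and selecting the QQQ output format may require an additional `QuantizeConfig` flag depending on your gptqmodel version (check its docs).

```python
# Illustrative 4-bit, group-size-128 quantization sketch (not the author's script).
from gptqmodel import GPTQModel, QuantizeConfig

# 4-bit weights; each group of 128 weight columns shares one quantization scale.
# NOTE: emitting the QQQ format may need an extra config flag; see the gptqmodel docs.
quant_config = QuantizeConfig(bits=4, group_size=128)

# Tiny placeholder calibration set; real runs use a few hundred representative samples.
calibration = ["EXAONE is a family of large language models developed by LG AI Research."]

model = GPTQModel.load("LGAI-EXAONE/EXAONE-4.0-1.2B", quant_config)
model.quantize(calibration)
model.save("exaone-1.2b-qqq-w4a8")
```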

## Usage

```python
from gptqmodel import GPTQModel

model = GPTQModel.from_quantized("namgyu-youn/exaone-1.2b-qqq-w4a8", device="cuda:0")
```
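
Once loaded, the checkpoint behaves like a regular causal LM. A minimal generation example, assuming the base model's tokenizer from transformers:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("LGAI-EXAONE/EXAONE-4.0-1.2B")

prompt = "Explain W4A8 quantization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```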