EXAONE-4.0-1.2B QQQ W4A8
4-bit quantized LGAI-EXAONE/EXAONE-4.0-1.2B
Config: QQQ | 4-bit | group=128
W4A8: 4-bit weights with 8-bit activations, executed via a Marlin-based GEMM kernel.
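For reference, a minimal sketch of how a QQQ W4A8 checkpoint like this can be produced with GPTQModel. The calibration texts are placeholders, and the exact option for selecting the QQQ format is version-dependent in GPTQModel, so treat this as an assumption rather than the author's exact recipe:

```python
from gptqmodel import GPTQModel, QuantizeConfig

# 4-bit weights with per-group scales over 128 columns, matching the config above.
# NOTE: how to select the QQQ format/kernel varies by GPTQModel version and is an
# assumption here; consult the GPTQModel docs for the exact QuantizeConfig option.
quant_config = QuantizeConfig(bits=4, group_size=128)

model = GPTQModel.load("LGAI-EXAONE/EXAONE-4.0-1.2B", quant_config)

# Placeholder calibration data; real runs should use a few hundred
# representative text samples.
calibration_dataset = [
    "GPTQ-style quantizers calibrate layer by layer on short text samples.",
    "EXAONE 4.0 is a family of instruction-tuned language models.",
]
model.quantize(calibration_dataset)
model.save("exaone-1.2b-qqq-w4a8")
```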
Usage
```python
from gptqmodel import GPTQModel

# Load the QQQ W4A8 checkpoint onto the first GPU.
model = GPTQModel.from_quantized("namgyu-youn/exaone-1.2b-qqq-w4a8", device="cuda:0")
```
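A quick generation smoke test, assuming the tokenizer shipped with the quantized repo and the standard transformers `generate()` API (the prompt is illustrative):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("namgyu-youn/exaone-1.2b-qqq-w4a8")
inputs = tokenizer("Explain W4A8 quantization in one sentence.", return_tensors="pt").to("cuda:0")

# Decode a short completion to verify the quantized weights load and run.
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```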
Base model: LGAI-EXAONE/EXAONE-4.0-1.2B