Quantization Details
- Base Model: LGAI-EXAONE/EXAONE-4.0-1.2B
- Method: GPTQ W4A16
- Group Size: 128
- Activation Order: group
- Dampening: 0.01
- Calibration Dataset: LGAI-EXAONE/MANTA-1M (512 samples, max_seq_len=2048)
- KV Cache: FP8 (kv_cache_scheme included in config.json)
- Tool: llmcompressor (GPTQModifier); see the sketch below
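A minimal sketch of how these settings could be reproduced with llmcompressor's one-shot flow. It assumes the MANTA-1M calibration set exposes a `text` column and relies on the W4A16 scheme's default group size of 128; the "group" activation ordering and the FP8 kv_cache_scheme are noted in comments only, since how they are expressed varies across llmcompressor versions. This is an illustrative reconstruction, not the exact script used for this release.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier

MODEL_ID = "LGAI-EXAONE/EXAONE-4.0-1.2B"
SAVE_DIR = "EXAONE-4.0-1.2B-W4A16-GPTQ-FP8KV"

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# 512 calibration samples, each truncated to 2048 tokens.
# Assumption: the dataset has a "text" column; adapt to its actual schema.
ds = load_dataset("LGAI-EXAONE/MANTA-1M", split="train").shuffle(seed=42).select(range(512))
ds = ds.map(
    lambda sample: tokenizer(sample["text"], truncation=True, max_length=2048, padding=False),
    remove_columns=ds.column_names,
)

# W4A16 GPTQ recipe: 4-bit grouped weight quantization (group size 128 is the
# scheme default), dampening_frac=0.01, lm_head left unquantized.
# Activation ordering ("group") and the FP8 kv_cache_scheme would be configured
# on top of this, depending on the llmcompressor version in use.
recipe = GPTQModifier(
    targets="Linear",
    scheme="W4A16",
    ignore=["lm_head"],
    dampening_frac=0.01,
)

oneshot(
    model=model,
    dataset=ds,
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=512,
)

model.save_pretrained(SAVE_DIR, save_compressed=True)
tokenizer.save_pretrained(SAVE_DIR)
```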
Usage
```python
from vllm import LLM

llm = LLM(model="IBDPLab/EXAONE-4.0-1.2B-W4A16-GPTQ-FP8KV")
```
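A short end-to-end generation example; the prompt and sampling settings are illustrative, not part of the release.

```python
from vllm import LLM, SamplingParams

# Depending on the vLLM version, kv_cache_dtype="fp8" may need to be passed
# explicitly for the FP8 KV-cache scales in config.json to take effect.
llm = LLM(model="IBDPLab/EXAONE-4.0-1.2B-W4A16-GPTQ-FP8KV")
sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)

# Generate a completion for a single prompt.
outputs = llm.generate(["Explain GPTQ quantization in one paragraph."], sampling_params)
print(outputs[0].outputs[0].text)
```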