
Quantization Details

  • Base Model: LGAI-EXAONE/EXAONE-4.0-1.2B
  • Method: GPTQ W4A16
  • Group Size: 128
  • Activation Order: group
  • Dampening: 0.01
  • Calibration Dataset: LGAI-EXAONE/MANTA-1M (512 samples, max_seq_len=2048)
  • KV Cache: FP8 (kv_cache_scheme included in config.json)
  • Tool: llmcompressor (GPTQModifier)
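To make the scheme above concrete: W4A16 with group size 128 stores weights as 4-bit integers with one floating-point scale per group of 128 values, while activations stay in 16-bit. The sketch below is a minimal, illustrative round-to-nearest version of that numeric layout only; it is not the llmcompressor/GPTQ implementation, which additionally uses Hessian-based error compensation.

```python
# Illustrative sketch of W4A16 group quantization (group_size=128):
# each group of 128 weights shares one scale; values are stored as int4.
# This is NOT the GPTQ algorithm itself, just the storage arithmetic.

def quantize_group(weights, bits=4):
    """Symmetric round-to-nearest quantization of one weight group."""
    qmax = 2 ** (bits - 1) - 1                # 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax or 1.0
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize_group(q, scale):
    return [v * scale for v in q]

# One group of 128 weights -> 128 int4 values + a single scale.
group = [(-1) ** i * (i / 128) for i in range(128)]
q, scale = quantize_group(group)
recovered = dequantize_group(q, scale)
max_err = max(abs(a - b) for a, b in zip(group, recovered))
```

The maximum per-weight error is bounded by half the group scale, which is why smaller groups (more scales) trade memory for accuracy.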

Usage

```python
from vllm import LLM, SamplingParams

# The checkpoint ships FP8 KV-cache scales via kv_cache_scheme in config.json.
llm = LLM(model="IBDPLab/EXAONE-4.0-1.2B-W4A16-GPTQ-FP8KV")
outputs = llm.generate(["Hello"], SamplingParams(max_tokens=32))
```