This is an NVFP4 (W4A16) quantized version of LGAI-EXAONE/K-EXAONE-236B-A23B. Only the unshared expert layers are quantized; all other weights remain in their original precision.
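
The checkpoint can be served with vLLM's OpenAI-compatible server. A minimal sketch, assuming an 8-GPU node matching the evaluation setup below (the flags mirror the `model_args` used in the lm_eval command):

```bash
# Serve the quantized checkpoint with vLLM (OpenAI-compatible API).
# --tensor-parallel-size 8 assumes the 8x H100 setup used for evaluation below.
vllm serve furiosa-ai/K-EXAONE-236B-A23B-NVFP4A16 \
  --dtype auto \
  --gpu-memory-utilization 0.9 \
  --tensor-parallel-size 8
```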

## Evaluation

The following package versions can be used to evaluate the quantized model:

- `vllm`: 0.15.1
- `compressed-tensors`: 0.13.0
- `transformers`: 5.1.0
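
To reproduce this environment, the pinned versions can be installed with pip. The lm_eval harness itself is also required for the command below; its version is not pinned in this card, so it is left unpinned here:

```bash
# Pin the package versions listed above.
# lm_eval is assumed to be the EleutherAI evaluation harness from PyPI.
pip install "vllm==0.15.1" "compressed-tensors==0.13.0" "transformers==5.1.0" lm_eval
```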

```bash
lm_eval \
  --model vllm \
  --model_args pretrained="furiosa-ai/K-EXAONE-236B-A23B-NVFP4A16",dtype=auto,gpu_memory_utilization=0.9,tensor_parallel_size=8 \
  --tasks mmlu_pro \
  --num_fewshot 5 \
  --batch_size auto
```

## Accuracy Results

The evaluation was run on 8× NVIDIA H100 PCIe GPUs.

| Benchmark | furiosa-ai/K-EXAONE-236B-A23B-NVFP4A16 | LGAI-EXAONE/K-EXAONE-236B-A23B (reported) |
|-----------|----------------------------------------|-------------------------------------------|
| GPQA      | 71.06                                  | 70.60                                     |