This is an NVFP4 (W4A16) quantized version of LGAI-EXAONE/K-EXAONE-236B-A23B. Only the unshared expert layers are quantized; all other weights remain in their original precision.
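
The checkpoint can be served with vLLM's OpenAI-compatible server. A minimal sketch, assuming an 8-GPU node matching the evaluation setup below (the flags mirror the `model_args` used in the lm_eval command):

```bash
# Serve the quantized checkpoint with vLLM (OpenAI-compatible API).
# --tensor-parallel-size 8 assumes the 8x H100 setup used for evaluation below.
vllm serve furiosa-ai/K-EXAONE-236B-A23B-NVFP4A16 \
  --dtype auto \
  --gpu-memory-utilization 0.9 \
  --tensor-parallel-size 8
```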

## Evaluation

The following package versions can be used to evaluate the quantized model:

- `vllm`: 0.15.1
- `compressed-tensors`: 0.13.0
- `transformers`: 5.1.0
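
To reproduce this environment, the pinned versions can be installed with pip. The lm_eval harness itself is also required for the command below; its version is not pinned in this card, so it is left unpinned here:

```bash
# Pin the package versions listed above.
# lm_eval is assumed to be the EleutherAI evaluation harness from PyPI.
pip install "vllm==0.15.1" "compressed-tensors==0.13.0" "transformers==5.1.0" lm_eval
```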

```bash
lm_eval \
  --model vllm \
  --model_args pretrained="furiosa-ai/K-EXAONE-236B-A23B-NVFP4A16",dtype=auto,gpu_memory_utilization=0.9,tensor_parallel_size=8 \
  --tasks mmlu_pro \
  --num_fewshot 5 \
  --batch_size auto
```

## Accuracy Results

The evaluation was run on 8× NVIDIA H100 PCIe GPUs.

| Benchmark | furiosa-ai/K-EXAONE-236B-A23B-NVFP4A16 | LGAI-EXAONE/K-EXAONE-236B-A23B (reported) |
|-----------|----------------------------------------|-------------------------------------------|
| GPQA      | 71.06                                  | 70.60                                     |