This is an NVFP4-W4A16 quantized version of LGAI-EXAONE/K-EXAONE-236B-A23B. Only the unshared expert layers are quantized.
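As a rough illustration of what NVFP4 weight quantization maps values onto, the sketch below enumerates the FP4 E2M1 value grid (before per-block scaling is applied). This is an informal sketch of the number format, not the actual quantization kernel.

```python
# Hypothetical illustration: enumerate the FP4 E2M1 values that NVFP4
# weight elements can take (sign bit, 2-bit exponent, 1-bit mantissa),
# ignoring the per-block scale factor.
def e2m1_values():
    vals = set()
    for sign in (1.0, -1.0):
        for exp in range(4):          # 2-bit exponent field
            for mant in range(2):     # 1-bit mantissa field
                if exp == 0:
                    v = mant * 0.5    # subnormal: 0 or 0.5
                else:
                    v = (1 + mant * 0.5) * 2 ** (exp - 1)
                vals.add(sign * v)
    return sorted(vals)

print(e2m1_values())
# 15 distinct values: 0 and +/-{0.5, 1, 1.5, 2, 3, 4, 6}
```

Each weight is stored as one of these 4-bit codes plus a shared scale per block of elements, which is why quantizing only the (large, unshared) expert layers recovers most of the memory savings with little accuracy loss.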
## Evaluation
The following package versions can be used to evaluate the quantized model.
- vllm: 0.15.1
- compressed-tensors: 0.13.0
- transformers: 5.1.0
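If needed, the pinned versions above can be installed with pip. This is a minimal sketch assuming a CUDA-ready environment; adjust for your platform.

```shell
# Pin the package versions listed above (assumes a CUDA-capable environment
# compatible with this vllm build).
pip install vllm==0.15.1 compressed-tensors==0.13.0 transformers==5.1.0
```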
```shell
lm_eval \
  --model vllm \
  --model_args pretrained="furiosa-ai/K-EXAONE-236B-A23B-NVFP4A16",dtype=auto,gpu_memory_utilization=0.9,tensor_parallel_size=8 \
  --tasks mmlu_pro \
  --num_fewshot 5 \
  --batch_size auto
```
## Accuracy Results
The evaluation was done on 8× H100 PCIe GPUs.
| Benchmark | furiosa-ai/K-EXAONE-236B-A23B-NVFP4A16 | LGAI-EXAONE/K-EXAONE-236B-A23B (reported) |
|---|---|---|
| GPQA | 71.06 | 70.60 |