Double the size of the original?

#3
by pathosethoslogos - opened

The original is 160 GB: https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash/tree/main

Why is this one 294 GB?

FP8 uses twice as many bits per weight as FP4. Figured it out yet?

Oh, I didn't realise the original was a 4-bit quant.
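
For anyone landing here later, a rough sketch of the arithmetic. The parameter count below is a made-up round number, not the model's real size. The last step assumes that only the FP4 tensors double in size and everything else is stored unchanged, which is a guess about why the ratio is a bit under 2x and not something confirmed in this thread.

```python
def size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate storage in GB (10^9 bytes) for n_params weights."""
    return n_params * bits_per_weight / 8 / 1e9


# Hypothetical parameter count, for illustration only.
N_PARAMS = 300e9

fp4_gb = size_gb(N_PARAMS, 4)   # 4 bits per weight
fp8_gb = size_gb(N_PARAMS, 8)   # 8 bits per weight
ideal_ratio = fp8_gb / fp4_gb   # exactly 2 when every weight is converted

# Sizes quoted in this thread.
ORIGINAL_GB = 160.0
THIS_REPO_GB = 294.0
observed_ratio = THIS_REPO_GB / ORIGINAL_GB


def doubled_fraction(original_gb: float, converted_gb: float) -> float:
    """Fraction of the original bytes that would have to double in size,
    ASSUMING the remaining bytes are stored unchanged (an assumption)."""
    return converted_gb / original_gb - 1.0


print(f"ideal FP8/FP4 ratio: {ideal_ratio:.2f}")
print(f"observed ratio:      {observed_ratio:.4f}")
print(f"implied FP4 share:   {doubled_fraction(ORIGINAL_GB, THIS_REPO_GB):.4f}")
```

So "roughly double" is what you would expect from going 4-bit to 8-bit. The gap between 294 GB and a full 320 GB would be consistent with some tensors in the original not being FP4 in the first place, though that is only an inference from the two sizes.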

https://docs.sglang.io/cookbook/autoregressive/DeepSeek/DeepSeek-V4

They recommend this FP8 version for H200 hardware. That is all.
