pastapaul Claude Opus 4.7 (1M context) committed on
Commit f3911aa · 1 Parent(s): 4c83e4b

Correct BF16 source size (V3 carryover 1.3 TB → V4 ~600 GB)

V4-Flash is ~284B params (~13B activated), not the 671B of V3. The 1.3 TB
BF16 source figure was V3-era math (671B × 2 bytes). V4-Flash with the
MTP block is ~600 GB BF16. The 172 GB measured artifact size is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
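
The size correction in the message above comes down to parameter count × 2 bytes per BF16 value. A minimal sketch of that arithmetic follows, assuming decimal units (1 GB = 1e9 bytes). The helper name `bf16_size_gb` is hypothetical, and the sketch does not model the MTP block, whose parameter count the commit does not state.

```python
# Rough BF16 footprint: every parameter is stored in 16 bits (2 bytes).
BYTES_PER_BF16_PARAM = 2


def bf16_size_gb(num_params: float) -> float:
    """Return the BF16 size in decimal gigabytes (1 GB = 1e9 bytes).

    Hypothetical helper for illustration; not part of the repository.
    """
    return num_params * BYTES_PER_BF16_PARAM / 1e9


# V3-era figure quoted in the commit: 671B parameters.
v3_gb = bf16_size_gb(671e9)  # about 1342 GB, i.e. the old "1.3 TB"

# V4-Flash figure quoted in the commit: ~284B parameters.
v4_gb = bf16_size_gb(284e9)  # about 568 GB before any MTP-block allowance

# Ratio of the commit's ~600 GB source figure to the 172 GB artifact.
compression_ratio = 600 / 172  # roughly 3.5x

print(f"V3 BF16:       {v3_gb:.0f} GB")
print(f"V4-Flash BF16: {v4_gb:.0f} GB")
print(f"Source / artifact: {compression_ratio:.2f}x")
```

The 284B estimate lands at about 568 GB, so the commit's "~600 GB" figure is a rounded value that, per the message, also accounts for the MTP block.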

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -19,7 +19,7 @@ A DeepSeek-V4-Flash NVFP4-FP8 quantization that retains the MTP (multi-token-pre
 
 ## What this is
 
-- 172 GB across 35 safetensors shards (vs 1.3 TB BF16 source).
+- 172 GB across 35 safetensors shards (vs ~600 GB BF16 source, MTP block included).
 - Same quantization scheme as `RedHatAI/DeepSeek-V4-Flash-NVFP4-FP8`: NVFP4 (group=16, FP8 e4m3 scales) on routed FFN experts, FP8_BLOCK 128×128 on attention.
 - MTP block (`mtp.0.*`, 799 tensors) kept at BF16 — not dropped at load time, not double-quantized when the MTP draft model is constructed.
 
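
As a rough illustration of why the quantized artifact is so much smaller than the BF16 source, the storage cost per weight of the two schemes named in the README can be sketched as follows. This is only a sketch: the helper name `effective_bits_per_weight` is hypothetical, the 32-bit scale for the FP8 128×128 blocks is an assumption (the README does not state the block-scale dtype), and any per-tensor global scales are ignored. It does not attempt to reproduce the measured 172 GB, since that depends on how many parameters fall under each scheme and on the tensors kept at BF16.

```python
def effective_bits_per_weight(value_bits: int, scale_bits: int, group_size: int) -> float:
    """Bits stored per weight when one scale is shared by `group_size` weights.

    Hypothetical helper for illustration; not part of the repository.
    """
    return value_bits + scale_bits / group_size


# NVFP4 as described in the README: 4-bit values, one FP8 (8-bit) e4m3 scale
# per group of 16 weights.
nvfp4_bits = effective_bits_per_weight(value_bits=4, scale_bits=8, group_size=16)

# FP8_BLOCK 128x128: 8-bit values, one scale per 128*128 block.
# ASSUMPTION: the block scale is 32 bits wide; the README does not say.
fp8_block_bits = effective_bits_per_weight(value_bits=8, scale_bits=32, group_size=128 * 128)

# BF16 baseline (used for the MTP block, which stays unquantized).
bf16_bits = 16

print(f"NVFP4:     {nvfp4_bits} bits/weight ({bf16_bits / nvfp4_bits:.2f}x smaller than BF16)")
print(f"FP8_BLOCK: {fp8_block_bits:.4f} bits/weight ({bf16_bits / fp8_block_bits:.2f}x smaller than BF16)")
```

Under these assumptions the routed experts cost 4.5 bits per weight and the attention weights just over 8.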