canada-quant/DeepSeek-V4-Flash-W4A16-FP8
Canada Quant Labs
Tags: Safetensors, English, Chinese, vllm, deepseek_v4, mixture-of-experts, compressed-tensors, w4a16, gptq, fp8-block, deepseek, deepseek-v4
License: mit
Branch: main
Size: 152 GB
2 contributors
History: 24 commits
Latest commit: 60d7028, 2 days ago, by pastapaul and Claude Opus 4.7 (1M context)
Correct param count (V3 carryover 671B → V4 ~284B); top-K 8→6
| File | Size | Flags | Last commit | Updated |
| --- | --- | --- | --- | --- |
| .gitattributes | 1.52 kB | Safe | initial commit | 20 days ago |
| README.md | 17.4 kB | - | Correct param count (V3 carryover 671B → V4 ~284B); top-K 8→6 | 2 days ago |
| config.json | 12.2 kB | - | Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts) | 20 days ago |
| generation_config.json | 174 Bytes | Safe | Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts) | 20 days ago |
| model-00001-of-00004.safetensors | 50 GB | Safe, xet | Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts) | 20 days ago |
| model-00002-of-00004.safetensors | 50 GB | Safe, xet | Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts) | 20 days ago |
| model-00003-of-00004.safetensors | 50 GB | Safe, xet | Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts) | 20 days ago |
| model-00004-of-00004.safetensors | 2.48 GB | Safe, xet | Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts) | 20 days ago |
| model.safetensors.index.json | 8.51 MB | - | Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts) | 20 days ago |
| recipe.yaml | 1.97 kB | Safe | Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts) | 20 days ago |
| tokenizer.json | 10.1 MB | Safe | Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts) | 20 days ago |
| tokenizer_config.json | 397 Bytes | Safe | Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts) | 20 days ago |