canada-quant/DeepSeek-V4-Flash-W4A16-FP8

mixture-of-experts

compressed-tensors

Commit History

Correct param count (V3 carryover 671B → V4 ~284B); top-K 8→6

60d7028

pastapaul and Claude Opus 4.7 (1M context) committed 2 days ago

License: correct frontmatter to MIT (was apache-2.0)

3899167

pastapaul and Claude Opus 4.7 (1M context) committed 2 days ago

docs: migrate stale refs from pasta-paul → canada-quant

eeedaed

pastapaul and Claude Opus 4.7 (1M context) committed 3 days ago

Sync to in-repo draft: add Spark + RTX PRO 6000 validation, vendored patch, sparse-MLA env var, current canonical recipe

7d6fdde
verified

pastapaul committed 7 days ago

Credits — note PR #40991 closed and replaced upstream by PR #41834

22da69f

pastapaul committed 17 days ago

Inference section — TL;DR bootstrap link + updated canonical flags

a7716e2

pastapaul committed 17 days ago

Phase 4e — Spark long-context table + 1M canonical + quickstart

fbef1f8

pastapaul committed 17 days ago

Add Phase 5 RTX PRO 6000 Blackwell sm_120 validation column + long-context table + FlashInfer note

d353009
verified

pastapaul committed 18 days ago

Phase 4d throughput table — long-context graphs-ON sweep on jasl@0789bc9

1e8d6fb

pastapaul committed 18 days ago

Phase 4c: 64K-context retest validates think-max — 9/10 PASS

8014ac2

pastapaul committed 18 days ago

docs: add DGX Spark TP=2 validation results, GSM8K 95.37%

93ff023
verified

pastapaul committed 19 days ago

Update README.md

ef56190
verified

pastapaul committed 19 days ago

Spark: workspace lock retest on jasl@77bbc16 — same crash, eager remains canonical

ebe6e0f
verified

pastapaul committed 19 days ago

Add DGX Spark TP=2 (SM 12.1a) deployment recipe + harness/B200 alignment results

c2a51b1
verified

pastapaul committed 19 days ago

Final benchmarks: GSM8K 92.87%, MMLU 87.27%, HumanEval 54.27% (chat-extract artifact); KV layout probe topology-classified to SM90

7db2f7b
verified

pastapaul committed 20 days ago

MMLU 5-shot: 87.27% ±0.27% — frontier-grade for a 4-bit weight quantization

c40c0c5
verified

pastapaul committed 20 days ago

Redact internal framework name from public model card

11cea63
verified

pastapaul committed 20 days ago

GSM8K 5-shot: 92.87% (flexible-extract) / 42.61% (strict-match) — running on H200 TP=2 vs the live model

640f316
verified

pastapaul committed 20 days ago

Update README.md

2489f71
verified

pastapaul committed 20 days ago

Add Phase 3b harness validation table — W4A16-FP8 matches native chat-smoke and beats it by +3 pts on toolcall15

8d95cbd
verified

pastapaul committed 20 days ago

Rename: AWQ-W4A16 -> W4A16-FP8 (recipe is GPTQ not AWQ; matches RedHat naming)

5069a06
verified

pastapaul committed 20 days ago

Add model card

29b695a
verified

pastapaul committed 20 days ago

Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts)

2e7ef6a
verified

pastapaul committed 20 days ago

initial commit

74b5c3f
verified

pastapaul committed 20 days ago