Commit History

Correct param count (V3 carryover 671B → V4 ~284B); top-K 8→6
60d7028

pastapaul Claude Opus 4.7 (1M context) committed on

License: correct frontmatter to MIT (was apache-2.0)
3899167

pastapaul Claude Opus 4.7 (1M context) committed on

docs: migrate stale refs from pasta-paul → canada-quant
eeedaed

pastapaul Claude Opus 4.7 (1M context) committed on

Sync to in-repo draft: add Spark + RTX PRO 6000 validation, vendored patch, sparse-MLA env var, current canonical recipe
7d6fdde
verified

pastapaul committed on

Credits — note PR #40991 closed and replaced upstream by PR #41834
22da69f

pastapaul committed on

Inference section — TL;DR bootstrap link + updated canonical flags
a7716e2

pastapaul committed on

Phase 4e — Spark long-context table + 1M canonical + quickstart
fbef1f8

pastapaul committed on

Add Phase 5 RTX PRO 6000 Blackwell sm_120 validation column + long-context table + FlashInfer note
d353009
verified

pastapaul committed on

Phase 4d throughput table — long-context graphs-ON sweep on jasl@0789bc9
1e8d6fb

pastapaul committed on

Phase 4c: 64K-context retest validates think-max — 9/10 PASS
8014ac2

pastapaul committed on

docs: add DGX Spark TP=2 validation results, GSM8K 95.37%
93ff023
verified

pastapaul committed on

Update README.md
ef56190
verified

pastapaul committed on

Spark: workspace lock retest on jasl@77bbc16 — same crash, eager remains canonical
ebe6e0f
verified

pastapaul committed on

Add DGX Spark TP=2 (SM 12.1a) deployment recipe + harness/B200 alignment results
c2a51b1
verified

pastapaul committed on

Final benchmarks: GSM8K 92.87%, MMLU 87.27%, HumanEval 54.27% (chat-extract artifact); KV layout probe topology-classified to SM90
7db2f7b
verified

pastapaul committed on

MMLU 5-shot: 87.27% ±0.27% — frontier-grade for a 4-bit weight quantization
c40c0c5
verified

pastapaul committed on

Redact internal framework name from public model card
11cea63
verified

pastapaul committed on

GSM8K 5-shot: 92.87% (flexible-extract) / 42.61% (strict-match) — running on H200 TP=2 vs the live model
640f316
verified

pastapaul committed on

Update README.md
2489f71
verified

pastapaul committed on

Add Phase 3b harness validation table — W4A16-FP8 matches native chat-smoke and beats it by +3 pts on toolcall15
8d95cbd
verified

pastapaul committed on

Rename: AWQ-W4A16 -> W4A16-FP8 (recipe is GPTQ not AWQ; matches RedHat naming)
5069a06
verified

pastapaul committed on

Add model card
29b695a
verified

pastapaul committed on

Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts)
2e7ef6a
verified

pastapaul committed on

initial commit
74b5c3f
verified

pastapaul committed on