mattbucci commited on
Commit
8166241
·
verified ·
1 Parent(s): e363ec5

Upload README.md with huggingface_hub

Browse files
Files changed (1) hide show
  1. README.md +2 -0
README.md CHANGED
@@ -18,6 +18,8 @@ license: apache-2.0
18
 
19
  # Qwen3-Coder-30B-A3B-REAP AWQ 4-bit
20
 
 
 
21
  AWQ 4-bit quantization of a self-built REAP-pruned variant of [Qwen3-Coder-30B-A3B-Instruct](https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct), calibrated with thinking + code data, optimized for AMD RDNA4 (gfx1201) inference with [SGLang](https://github.com/sgl-project/sglang).
22
 
23
  ## Model Details
 
18
 
19
  # Qwen3-Coder-30B-A3B-REAP AWQ 4-bit
20
 
21
+ > ⚠️ **VALIDATION FAILED 2026-04-29 — DO NOT USE.** Initial smoke test on the AWQ output produces gibberish (`def count_vowels(s):` → `sweat sweat aster aster…`, both `/v1/chat/completions` and `/v1/completions`). The end-to-end pipeline (REAP+REAM merge → AWQ calib → CT→native conversion → audit) all reported success at the file-format level, but the resulting weights are unusable for inference. Likely root cause: insufficient AWQ calibration coverage (256 samples × 1024 tokens) for a 96-expert MoE post-merge, or weight corruption introduced by the REAP+REAM merge step. **For a working REAP variant of this base, use [`mattbucci/Qwen3-Coder-REAP-25B-A3B-AWQ`](https://huggingface.co/mattbucci/Qwen3-Coder-REAP-25B-A3B-AWQ)** (Cerebras prune, validated, 88/300 on SWE-bench Lite). This repo will be either fixed or removed; tracking under task #52 in the [RDNA4 inference repo](https://github.com/mattbucci/2x-R9700-RDNA4-GFX1201-sglang-inference).
22
+
23
  AWQ 4-bit quantization of a self-built REAP-pruned variant of [Qwen3-Coder-30B-A3B-Instruct](https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct), calibrated with thinking + code data, optimized for AMD RDNA4 (gfx1201) inference with [SGLang](https://github.com/sgl-project/sglang).
24
 
25
  ## Model Details