sinimiini committed
Commit 143f0e6 · verified · 1 Parent(s): 7d2ba56

Remove Q4 references from README and delete Q4 report

Files changed (1)
  1. README.md +0 -5
README.md CHANGED
@@ -70,7 +70,6 @@ Only the normal causal generation path is implemented in the patched runtime. Pr
  | `reports/validation/q8_0_vs_bf16.json` | `Q8_0` vs BF16 runtime validation |
  | `reports/validation/q6_k_vs_bf16.json` | `Q6_K` vs BF16 runtime validation |
  | `reports/validation/q5_k_m_vs_bf16.json` | `Q5_K_M` vs BF16 runtime validation |
- | `reports/validation/q4_k_m_vs_bf16.json` | Failed `Q4_K_M` validation report; the `Q4_K_M` GGUF is not uploaded |
 
  ## Provenance
 
@@ -92,8 +91,6 @@ Only the normal causal generation path is implemented in the patched runtime. Pr
  | Q6_K | `HRM-Text-1B-Q6_K.gguf` | `972668704` | `24D93CA4EF4A02CFE415E3EA56A78AD65198A165A4157B928004B58DBDA2D93C` |
  | Q5_K_M | `HRM-Text-1B-Q5_K_M.gguf` | `851509024` | `F6CE71A076EC897174C555D810ED6E379767D52F9396D485B42E42BF8DB1D0B7` |
 
- `Q4_K_M` was generated and tested locally but is not uploaded. It introduced a new single-token repetition loop for one validation prompt, so it failed the release gate.
-
  ## Validation Summary
 
  Validation was performed from a clean source snapshot and a clean `llama.cpp` base checkout.
@@ -114,7 +111,6 @@ Quantized variants were validated against the BF16 GGUF:
  | Q8_0 | Pass | `4/4` | `9/10` | Pass | Pass |
  | Q6_K | Pass | `4/4` | `9/10` | Pass | Pass |
  | Q5_K_M | Pass | `4/4` | `9/10` | Pass | Pass |
- | Q4_K_M | Pass | `3/4` | `8/10` | Fail | Not uploaded |
 
  Full-vocab mean absolute logit error:
 
@@ -160,7 +156,6 @@ Depending on the generator binary and `llama.cpp` build type, the executable may
  - `hrm_text` is a custom GGUF architecture in this conversion.
  - Generic GGUF runners will not work until they implement the HRM runtime graph.
  - Prefix-LM bidirectional attention with `token_type_ids` is not implemented in the patched `llama.cpp` path.
- - `Q4_K_M` is intentionally not included because strict validation found a new single-token repetition loop.
 
  ## License
 
 
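Note on the artifact table kept by this commit: the two rows visible in the diff context list a file name, a size, and a 64-character hex digest for each uploaded GGUF. A minimal sketch of checking a downloaded file against those values follows. It assumes the size column is in bytes and the digest is SHA-256 (the column headers are not visible in this diff), and the helper names `sha256_of` and `verify_file` are hypothetical, not part of the repository.

```python
import hashlib
import os

# Values copied from the two table rows visible in the diff context.
# Assumption: sizes are byte counts and digests are SHA-256.
EXPECTED = {
    "HRM-Text-1B-Q6_K.gguf": (
        972668704,
        "24D93CA4EF4A02CFE415E3EA56A78AD65198A165A4157B928004B58DBDA2D93C",
    ),
    "HRM-Text-1B-Q5_K_M.gguf": (
        851509024,
        "F6CE71A076EC897174C555D810ED6E379767D52F9396D485B42E42BF8DB1D0B7",
    ),
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the uppercase hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def verify_file(path, expected_size, expected_sha256):
    """Return True only if both the byte size and the digest match."""
    if os.path.getsize(path) != expected_size:
        # Cheap check first: a wrong size means the hash cannot match.
        return False
    return sha256_of(path) == expected_sha256.upper()
```

For a file in the current directory, a call could look like `verify_file(name, *EXPECTED[name])` with `name = "HRM-Text-1B-Q6_K.gguf"`. Comparing the size first avoids hashing a roughly 1 GB file that is already known to be truncated.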