
HRM-Text-1B GGUF Quantization Report

Date: 2026-05-21

Source

  • Source GGUF: fresh\out\gguf\HRM-Text-1B-BF16.gguf
  • Source GGUF SHA256: 2DD5E2EF55E40C46DB0D0CB4CF1427A4E72DA34FEE36F0D2B73D081D0E1C2010
  • Quantization tool: fresh\third_party\llama.cpp\build-hrm-nmake\bin\llama-quantize.exe
  • llama.cpp base commit: 6a257d44633d4a752183ed778b88d2924d0a6b9d
  • Importance matrix: not used
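The SHA256 values in this report can be checked against a local copy of a file. The following is a minimal sketch using only the Python standard library; the helper names `sha256_hex` and `sha256_of_file` are hypothetical and are not part of the report's tooling. Hashes are returned in uppercase to match the notation used here.

```python
import hashlib
import io


def sha256_hex(stream, chunk_size=1024 * 1024):
    """Hash a binary stream in chunks and return the uppercase hex digest."""
    digest = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest().upper()


def sha256_of_file(path):
    """Hash a file on disk without loading it fully into memory."""
    with open(path, "rb") as handle:
        return sha256_hex(handle)


def matches_expected(actual_hex, expected_hex):
    """Compare two hex digests case-insensitively."""
    return actual_hex.strip().upper() == expected_hex.strip().upper()


# In-memory self-check with the well-known SHA256 test vector for b"abc".
ABC_DIGEST = "BA7816BF8F01CFEA414140DE5DAE2223B00361A396177A9CB410FF61F20015AD"
assert matches_expected(sha256_hex(io.BytesIO(b"abc")), ABC_DIGEST)
```

To check the source file, one would call `sha256_of_file` on the local path of `HRM-Text-1B-BF16.gguf` and pass the result to `matches_expected` together with the hash listed above.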

Generated Quantizations

| Variant | File | Size (bytes) | SHA256 | Validation | Upload |
|---|---|---|---|---|---|
| BF16 | HRM-Text-1B-BF16.gguf | 2367995648 | 2DD5E2EF55E40C46DB0D0CB4CF1427A4E72DA34FEE36F0D2B73D081D0E1C2010 | Source of truth | Yes |
| Q8_0 | HRM-Text-1B-Q8_0.gguf | 1259126560 | C0729C267C3421E1F6DE0488AC5448E98EA30E56514DAF210596B70AC3F9786D | Pass | Yes |
| Q6_K | HRM-Text-1B-Q6_K.gguf | 972668704 | 24D93CA4EF4A02CFE415E3EA56A78AD65198A165A4157B928004B58DBDA2D93C | Pass | Yes |
| Q5_K_M | HRM-Text-1B-Q5_K_M.gguf | 851509024 | F6CE71A076EC897174C555D810ED6E379767D52F9396D485B42E42BF8DB1D0B7 | Pass | Yes |
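The byte sizes above can be turned into size ratios relative to the BF16 file. The sketch below does that arithmetic; the bits-per-weight figure is only a rough estimate that assumes the whole BF16 file consists of 16-bit weights and ignores metadata and any tensors stored at a different precision. The function name `size_ratios` is hypothetical.

```python
# File sizes in bytes, copied from the report's table of generated quantizations.
SIZES_BYTES = {
    "BF16": 2367995648,
    "Q8_0": 1259126560,
    "Q6_K": 972668704,
    "Q5_K_M": 851509024,
}


def size_ratios(sizes, baseline="BF16"):
    """Return each file's size as a fraction of the baseline file's size."""
    base = sizes[baseline]
    return {name: size / base for name, size in sizes.items()}


ratios = size_ratios(SIZES_BYTES)
for name, ratio in ratios.items():
    # 16 * ratio is a crude bits-per-weight estimate (assumption, see lead-in).
    print(f"{name:7s} {ratio:6.1%} of BF16, roughly {16 * ratio:.1f} bits/weight")
```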

Validation Summary

Each quantized GGUF was compared against the BF16 GGUF with the patched HRM-enabled runtime on four fixed prompts.

| Variant | Token IDs | Top-1 matches | Min top-10 overlap | hrm_text metadata | New loop check | Result |
|---|---|---|---|---|---|---|
| Q8_0 | Pass | 4/4 | 9/10 | Pass | Pass | Pass |
| Q6_K | Pass | 4/4 | 9/10 | Pass | Pass | Pass |
| Q5_K_M | Pass | 4/4 | 9/10 | Pass | Pass | Pass |
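The validation script is not reproduced in this report, so the following is only a sketch of how the "Top-1 matches" and "Min top-10 overlap" columns could be computed. It assumes that next-token logits for each prompt are available as plain lists of floats, one from the BF16 run and one from the quantized run. All function names are hypothetical.

```python
def top_k_ids(logits, k):
    """Return the indices of the k largest logits, highest first."""
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return order[:k]


def compare_prompt(ref_logits, quant_logits, k=10):
    """Compare one prompt: does the top-1 token agree, and how many
    of the reference top-k token ids also appear in the quantized top-k?"""
    ref_top = top_k_ids(ref_logits, k)
    quant_top = top_k_ids(quant_logits, k)
    top1_match = ref_top[0] == quant_top[0]
    overlap = len(set(ref_top) & set(quant_top))
    return top1_match, overlap


def summarize(pairs, k=10):
    """Aggregate over prompts: (top-1 matches, prompt count, minimum overlap)."""
    results = [compare_prompt(ref, quant, k) for ref, quant in pairs]
    top1_matches = sum(1 for match, _ in results if match)
    min_overlap = min(overlap for _, overlap in results)
    return top1_matches, len(results), min_overlap
```

Under this reading, a row showing 4/4 and 9/10 would mean that the most likely token agreed on all four prompts and that, on the worst prompt, nine of the ten highest-ranked tokens were shared.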

Commands

fresh\third_party\llama.cpp\build-hrm-nmake\bin\llama-quantize.exe fresh\out\gguf\HRM-Text-1B-BF16.gguf fresh\out\gguf\HRM-Text-1B-Q8_0.gguf Q8_0 4
fresh\third_party\llama.cpp\build-hrm-nmake\bin\llama-quantize.exe fresh\out\gguf\HRM-Text-1B-BF16.gguf fresh\out\gguf\HRM-Text-1B-Q6_K.gguf Q6_K 4
fresh\third_party\llama.cpp\build-hrm-nmake\bin\llama-quantize.exe fresh\out\gguf\HRM-Text-1B-BF16.gguf fresh\out\gguf\HRM-Text-1B-Q5_K_M.gguf Q5_K_M 4
python fresh\tools\validate_quant_runtime.py --bf16 fresh\out\gguf\HRM-Text-1B-BF16.gguf --quant fresh\out\gguf\HRM-Text-1B-Q8_0.gguf --quant-name Q8_0 --llama-results fresh\third_party\llama.cpp\build-hrm-nmake\bin\llama-results.exe --llama-completion fresh\third_party\llama.cpp\build-hrm-nmake\bin\llama-completion.exe --out fresh\reports\validation\q8_0_vs_bf16.json --work-dir fresh\reports\validation\runtime_tmp\q8_0 --n-generate 32
python fresh\tools\validate_quant_runtime.py --bf16 fresh\out\gguf\HRM-Text-1B-BF16.gguf --quant fresh\out\gguf\HRM-Text-1B-Q6_K.gguf --quant-name Q6_K --llama-results fresh\third_party\llama.cpp\build-hrm-nmake\bin\llama-results.exe --llama-completion fresh\third_party\llama.cpp\build-hrm-nmake\bin\llama-completion.exe --out fresh\reports\validation\q6_k_vs_bf16.json --work-dir fresh\reports\validation\runtime_tmp\q6_k --n-generate 32
python fresh\tools\validate_quant_runtime.py --bf16 fresh\out\gguf\HRM-Text-1B-BF16.gguf --quant fresh\out\gguf\HRM-Text-1B-Q5_K_M.gguf --quant-name Q5_K_M --llama-results fresh\third_party\llama.cpp\build-hrm-nmake\bin\llama-results.exe --llama-completion fresh\third_party\llama.cpp\build-hrm-nmake\bin\llama-completion.exe --out fresh\reports\validation\q5_k_m_vs_bf16.json --work-dir fresh\reports\validation\runtime_tmp\q5_k_m --n-generate 32
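The three quantization commands differ only in the variant name, so they can be generated from a list. The sketch below builds the argument lists without running anything; the paths are the ones shown in the commands above, the trailing `4` is passed as the thread-count argument of `llama-quantize`, and the function name `build_quantize_command` is hypothetical.

```python
from pathlib import PureWindowsPath

# Paths taken from the commands in this report (Windows-style, relative).
QUANTIZE_EXE = PureWindowsPath(
    r"fresh\third_party\llama.cpp\build-hrm-nmake\bin\llama-quantize.exe"
)
GGUF_DIR = PureWindowsPath(r"fresh\out\gguf")
VARIANTS = ["Q8_0", "Q6_K", "Q5_K_M"]


def build_quantize_command(variant, threads=4):
    """Build the llama-quantize argument list for one variant."""
    source = GGUF_DIR / "HRM-Text-1B-BF16.gguf"
    target = GGUF_DIR / f"HRM-Text-1B-{variant}.gguf"
    return [str(QUANTIZE_EXE), str(source), str(target), variant, str(threads)]


commands = [build_quantize_command(variant) for variant in VARIANTS]
for command in commands:
    # Printing only; each list could be handed to subprocess.run(command, check=True).
    print(" ".join(command))
```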