feat: add BF16 + Q8_0 gguf to complete the quant set for paired evals
Both were converted from model.safetensors via llama.cpp's convert_hf_to_gguf.py
(the BF16 intermediate was reused for Q8_0 quantization via llama-quantize). Three
gguf variants are now available in this repo:
lemer-q4_k_m.gguf ~3.2G (recommended, already present)
lemer-q8_0.gguf ~4.6G (higher precision, this commit)
lemer-bf16.gguf ~8.7G (full precision reference, this commit)
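The conversion pipeline above can be sketched as follows. Paths and the output
filenames are assumptions for illustration; adjust to your llama.cpp checkout
and local model directory. The flags shown (`--outtype`, `--outfile`) and the
`llama-quantize <in> <out> <type>` calling convention are standard llama.cpp
usage, not commands taken from this commit.

```shell
# Sketch of the conversion described above (paths are assumptions).
# 1. Convert the safetensors checkpoint to a BF16 gguf.
python llama.cpp/convert_hf_to_gguf.py ./lemer \
    --outtype bf16 --outfile lemer-bf16.gguf

# 2. Reuse the BF16 intermediate to produce the Q8_0 quant.
llama.cpp/build/bin/llama-quantize lemer-bf16.gguf lemer-q8_0.gguf Q8_0
```

Quantizing from the BF16 intermediate rather than reconverting from
safetensors keeps the quant lineage consistent across the set.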
This matches the three quant slots in LEM-Eval's targets.yaml for gguf paired
A/B evals against lthn/lemer's own multi-quant gguf set: the base and LEK sides
now have the same three precision levels, so cross-quant comparisons are
apples-to-apples.
Ollama pull paths:
ollama pull hf.co/LetheanNetwork/lemer:Q4_K_M
ollama pull hf.co/LetheanNetwork/lemer:Q8_0
ollama pull hf.co/LetheanNetwork/lemer:BF16
Co-Authored-By: Virgil <virgil@lethean.io>
- lemer-bf16.gguf +3 -0
- lemer-q8_0.gguf +3 -0
lemer-bf16.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3995b3581a4b2ce3e4139971e8087f69f380dfdab3fe946f69474625fced7bc2
+size 9311298528
lemer-q8_0.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a10021e990768c4437c75b6d74b163e03691be110d585cbb7f52657a5cb52b30
+size 4967490528
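As a quick sanity check, the byte counts in the two LFS pointers line up with
the rounded sizes quoted in the commit message when read as GiB. A minimal
check, assuming the sizes from the pointer files above:

```python
# Byte counts from the LFS pointer files in this commit.
sizes = {
    "lemer-bf16.gguf": 9311298528,  # quoted as ~8.7G
    "lemer-q8_0.gguf": 4967490528,  # quoted as ~4.6G
}

# Convert to GiB (2**30 bytes) and round to one decimal place.
for name, nbytes in sizes.items():
    print(f"{name}: {nbytes / 2**30:.1f} GiB")
```

Both values round to the figures in the variant list, confirming the "~8.7G"
and "~4.6G" labels are GiB, not decimal GB.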