Snider Virgil committed
Commit 25c459c · 1 Parent(s): 385d6cb

feat: add BF16 + Q8_0 gguf to complete the quant set for paired evals


Both converted from model.safetensors via llama.cpp's convert_hf_to_gguf.py
(the bf16 intermediate is reused for Q8_0 quantization via llama-quantize).
The three gguf variants now available in this repo:

lemer-q4_k_m.gguf ~3.2G (recommended, already present)
lemer-q8_0.gguf ~4.6G (higher precision, this commit)
lemer-bf16.gguf ~8.7G (full precision reference, this commit)
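The conversion path described above can be sketched as shell commands. The checkpoint directory name and the DRY_RUN helper are assumptions for illustration; the convert_hf_to_gguf.py → llama-quantize flow itself is from the commit message:

```shell
# Sketch of the conversion pipeline from the commit message.
# Assumed: ./lemer-hf holds model.safetensors, and llama.cpp's
# convert_hf_to_gguf.py and llama-quantize are reachable from here.
DRY_RUN=${DRY_RUN:-1}                                # default: print commands only
run() { if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi; }

# safetensors -> bf16 gguf (full-precision reference)
run python convert_hf_to_gguf.py ./lemer-hf --outtype bf16 --outfile lemer-bf16.gguf

# reuse the bf16 intermediate -> Q8_0 gguf
run llama-quantize lemer-bf16.gguf lemer-q8_0.gguf Q8_0
```

Set DRY_RUN=0 to actually execute; by default the script just prints the two commands so the pipeline can be inspected first.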

Matches the three quant slots in LEM-Eval's targets.yaml for gguf paired
A/B testing against lthn/lemer's own multi-quant gguf set. The base and
LEK sides now both have the same three precision levels, so cross-quant
comparisons are apples-to-apples.
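As a rough illustration, the paired targets might be laid out like this; the file structure, keys, and tag names below are hypothetical (LEM-Eval's actual targets.yaml schema is not shown in this commit), and only the three quant levels per side come from the message above:

```yaml
# Hypothetical sketch of paired quant slots; real schema may differ.
targets:
  base:                                   # lthn/lemer side
    - hf.co/lthn/lemer:Q4_K_M
    - hf.co/lthn/lemer:Q8_0
    - hf.co/lthn/lemer:BF16
  lek:                                    # LetheanNetwork/lemer side (this repo)
    - hf.co/LetheanNetwork/lemer:Q4_K_M
    - hf.co/LetheanNetwork/lemer:Q8_0
    - hf.co/LetheanNetwork/lemer:BF16
```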

Ollama pull paths:
ollama pull hf.co/LetheanNetwork/lemer:Q4_K_M
ollama pull hf.co/LetheanNetwork/lemer:Q8_0
ollama pull hf.co/LetheanNetwork/lemer:BF16

Co-Authored-By: Virgil <virgil@lethean.io>

Files changed (2)
  1. lemer-bf16.gguf +3 -0
  2. lemer-q8_0.gguf +3 -0
lemer-bf16.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3995b3581a4b2ce3e4139971e8087f69f380dfdab3fe946f69474625fced7bc2
+ size 9311298528
lemer-q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a10021e990768c4437c75b6d74b163e03691be110d585cbb7f52657a5cb52b30
+ size 4967490528
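The diffs above add git-lfs pointer files, each recording the blob's sha256 oid and byte size. After downloading, a blob can be checked against its pointer. A minimal sketch, demonstrated on a tiny stand-in blob since the real files are multi-gigabyte; swap in lemer-bf16.gguf and its pointer for actual use:

```shell
# Build a tiny stand-in blob plus a matching LFS-style pointer for the demo.
printf 'demo-bytes' > blob.bin
printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%s\nsize %s\n' \
  "$(sha256sum blob.bin | cut -d' ' -f1)" "$(wc -c < blob.bin)" > blob.ptr

# Verify: recompute oid and size, compare against the pointer file.
want_oid=$(awk '/^oid/  { sub("sha256:", "", $2); print $2 }' blob.ptr)
want_size=$(awk '/^size/ { print $2 }' blob.ptr)
have_oid=$(sha256sum blob.bin | cut -d' ' -f1)
have_size=$(wc -c < blob.bin)
[ "$have_oid" = "$want_oid" ] && [ "$have_size" = "$want_size" ] && echo OK
```

The same two comparisons against the oid and size lines in the real pointers confirm that a pulled gguf arrived intact.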