Snider Virgil committed
Commit 57b5002 · 1 Parent(s): 451455b

feat: add Q4_K_M + Q8_0 + BF16 gguf via LFS (xet-accelerated)

Converted from model.safetensors via llama.cpp convert_hf_to_gguf.py
(bf16 intermediate) and llama-quantize (Q4_K_M + Q8_0). Same gguf set
as the lemer/lemmy/lemrd family members.
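The conversion pipeline described above can be sketched roughly as follows (the checkout location, model directory, and build path are assumptions, not taken from this commit — adjust to your local llama.cpp layout):

```shell
# 1. Convert the HF safetensors checkpoint to a BF16 GGUF intermediate.
#    ./lemma is the assumed local model directory containing model.safetensors.
python llama.cpp/convert_hf_to_gguf.py ./lemma \
    --outtype bf16 --outfile lemma-bf16.gguf

# 2. Quantize the BF16 intermediate down to the two smaller variants.
llama.cpp/build/bin/llama-quantize lemma-bf16.gguf lemma-q4_k_m.gguf Q4_K_M
llama.cpp/build/bin/llama-quantize lemma-bf16.gguf lemma-q8_0.gguf Q8_0
```

Quantizing from the BF16 intermediate (rather than re-converting per variant) keeps all three GGUFs derived from the same tensor data.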

.gitattributes now tracks *.gguf under the LFS filter, so the push goes
through HF's Xet content-defined dedup instead of a raw blob upload
at <1MB/s.
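For reference, `git lfs track "*.gguf"` appends exactly the attribute line this commit adds; the key point is that .gitattributes must be committed before the .gguf files are staged, or the weights go into history as raw git blobs:

```shell
# Equivalent of `git lfs track "*.gguf"`: append the LFS attribute line.
echo '*.gguf filter=lfs diff=lfs merge=lfs -text' >> .gitattributes

# Confirm the pattern is tracked before adding any .gguf file.
grep 'gguf' .gitattributes
```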

Ollama pull paths:
ollama pull hf.co/LetheanNetwork/lemma:Q4_K_M
ollama pull hf.co/LetheanNetwork/lemma:Q8_0
ollama pull hf.co/LetheanNetwork/lemma:BF16

Co-Authored-By: Virgil <virgil@lethean.io>

Files changed (4)
  1. .gitattributes +1 -0
  2. lemma-bf16.gguf +3 -0
  3. lemma-q4_k_m.gguf +3 -0
  4. lemma-q8_0.gguf +3 -0
.gitattributes CHANGED
@@ -34,3 +34,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 tokenizer.json filter=lfs diff=lfs merge=lfs -text
+*.gguf filter=lfs diff=lfs merge=lfs -text
lemma-bf16.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:966017977865dc4437769fb5275ea5c0a40ed9c25a3984a254c2507d3273d174
+size 15053090880
lemma-q4_k_m.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ec9c82cf82b8fb23fab5191338dc8126c068fe8e29cc7aa3e97b82bf98efaef6
+size 5335285824
lemma-q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7f22360a35824f991a4d3df519f94275aa9fc519e1c2611018ee118fd418933b
+size 8031236160