Static quantization of Sapphira-L3.3-70b-0.1

File Notes
- PART 1
- PART 2
- Q6_K with token embedding, output, and some other tensors quantized to Q8_0
- 6.70 bpw
- ~2.1% increase in size relative to Q6_K
- Quantized from BF16
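The size figures above can be cross-checked with a little arithmetic. A minimal sketch, assuming the card's stated 6.70 bits per weight and 71B parameters, and ignoring GGUF header/metadata overhead; the ~6.56 bpw baseline for plain Q6_K is the usual llama.cpp figure and is an assumption, not stated in this card:

```python
def gguf_size_gb(n_params: float, bpw: float) -> float:
    """Approximate GGUF file size in gigabytes from parameter count
    and bits per weight (ignores header/metadata overhead)."""
    return n_params * bpw / 8 / 1e9

# 71B parameters at 6.70 bpw (this Q6_K_L quant)
q6_k_l = gguf_size_gb(71e9, 6.70)   # roughly 59.5 GB
# Plain Q6_K at ~6.56 bpw (assumed llama.cpp reference value)
q6_k = gguf_size_gb(71e9, 6.56)     # roughly 58.2 GB

print(f"Q6_K_L: {q6_k_l:.1f} GB, Q6_K: {q6_k:.1f} GB, "
      f"overhead: {100 * (q6_k_l / q6_k - 1):.1f}%")
```

The ~2.1% overhead recovered here matches the figure in the notes, which is consistent with the extra tensors being stored at Q8_0.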
Format: GGUF
Model size: 71B params
Architecture: llama

Model tree for Valeciela/Sapphira-L3.3-70b-0.1-Q6_K_L-GGUF

Quantized (this model is one of 8 quantizations of the base model)