Static quantization of Sapphira-L3.3-70b-0.1

File Notes
- PART 1
- PART 2
- Q6_K with token embedding, output, and some other tensors quantized to Q8_0
- 6.70 bpw
- ~2.1% increase in size relative to Q6_K
- Quantized from BF16
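The size figures above can be cross-checked with a little arithmetic. A minimal sketch, assuming the card's stated 6.70 bits per weight and 71B parameters, and ignoring GGUF header/metadata overhead; the ~6.56 bpw baseline for plain Q6_K is the usual llama.cpp figure and is an assumption, not stated in this card:

```python
def gguf_size_gb(n_params: float, bpw: float) -> float:
    """Approximate GGUF file size in gigabytes from parameter count
    and bits per weight (ignores header/metadata overhead)."""
    return n_params * bpw / 8 / 1e9

# 71B parameters at 6.70 bpw (this Q6_K_L quant)
q6_k_l = gguf_size_gb(71e9, 6.70)   # roughly 59.5 GB
# Plain Q6_K at ~6.56 bpw (assumed llama.cpp reference value)
q6_k = gguf_size_gb(71e9, 6.56)     # roughly 58.2 GB

print(f"Q6_K_L: {q6_k_l:.1f} GB, Q6_K: {q6_k:.1f} GB, "
      f"overhead: {100 * (q6_k_l / q6_k - 1):.1f}%")
```

The ~2.1% overhead recovered here matches the figure in the notes, which is consistent with the extra tensors being stored at Q8_0.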
Format: GGUF
Model size: 71B params
Architecture: llama

Model tree for Valeciela/Sapphira-L3.3-70b-0.1-Q6_K_L-GGUF

Quantized (this model is one of 8 quantizations of the base model)