GGUF-format files for Meta AI's Llama-2-13b-chat model.

Quantization:

Currently only two quantizations are available in this repository:

| filename | quantization | size |
|---|---|---|
| ggml-llama-2-13b-chat-q4_k_m.gguf | Q4_K_M | 7.8 GB |
| ggml-llama-2-13b-chat-f16.gguf | f16 | 26 GB |
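
The file sizes follow directly from the bits stored per weight. A minimal sketch, assuming roughly 13.0B parameters and an average of about 4.85 bits per weight for Q4_K_M (K-quants mix block formats, so that figure is an approximation, not a spec value; f16 stores exactly 16 bits per weight):

```python
# Rough size estimate for a quantized 13B model.
# Assumptions: ~13.0e9 parameters; ~4.85 average bits/weight for Q4_K_M.

def estimated_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Return the approximate file size in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

q4_k_m = estimated_size_gb(13.0e9, 4.85)  # ~7.9 GB, close to the 7.8 GB above
f16 = estimated_size_gb(13.0e9, 16.0)     # 26.0 GB, matching the table
print(f"Q4_K_M ~ {q4_k_m:.1f} GB, f16 ~ {f16:.1f} GB")
```

(The small gap between the estimate and the actual Q4_K_M file comes from GGUF metadata and the mix of block types inside the quant.)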

The license is subject to Meta's original license agreement.
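
After downloading, you can sanity-check that a file really is GGUF: the format begins with the 4-byte magic `GGUF`, followed by a little-endian uint32 format version. A minimal sketch (the `check_gguf` helper is illustrative, not part of any library):

```python
import struct

def check_gguf(path: str) -> int:
    """Verify the GGUF magic bytes and return the format version.

    Raises ValueError if the file does not start with b'GGUF'.
    """
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file: magic={magic!r}")
        # The 4 bytes after the magic hold the little-endian uint32 version.
        (version,) = struct.unpack("<I", f.read(4))
        return version
```

For example, `check_gguf("ggml-llama-2-13b-chat-q4_k_m.gguf")` should return the format version of the downloaded file.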

Model size: 13B params
Architecture: llama
