Model Summary

This repository hosts quantized versions of the TinyLlama Chat v1.0 model.

Format: GGUF
Converter: llama.cpp d39e26741f9f02340651dbc640c9776e1a1128ef
Quantizer: LM-Kit.NET 2024.9.3

For more detailed information, please visit the base model's page.

Model size: 1B params
Architecture: llama

Available quantization levels: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, 16-bit
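As a rough guide to choosing among these levels, the file size scales with bits per weight times parameter count. The sketch below estimates this for each listed level; it assumes uniform quantization and ignores metadata and mixed-precision tensors (real GGUF "K-quant" files differ somewhat), so treat the figures as ballpark only.

```python
# Rough GGUF file-size estimate: bits_per_weight * n_params / 8 bytes.
# Assumption: uniform quantization across all tensors; actual GGUF files
# keep some tensors at higher precision and add metadata overhead.

N_PARAMS = 1.1e9  # TinyLlama has ~1.1B parameters (rounded to "1B" above)

def approx_size_gib(bits_per_weight: float, n_params: float = N_PARAMS) -> float:
    """Approximate model file size in GiB for a uniform quantization level."""
    return bits_per_weight * n_params / 8 / 2**30

for bits in (2, 3, 4, 5, 6, 8, 16):
    print(f"{bits:>2}-bit: ~{approx_size_gib(bits):.2f} GiB")
```

For example, the 16-bit variant works out to roughly 2 GiB, while the 4-bit variant is around a quarter of that, which is why low-bit quantizations are popular for CPU and edge inference.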
