tsunemoto/TinyLlama-1.1B-Chat-v0.6-x8-MoE-GGUF

Tsunemoto GGUFs of TinyLlama-1.1B-Chat-v0.6-x8-MoE

This is a GGUF quantization of TinyLlama-1.1B-Chat-v0.6-x8-MoE.

Original Repo Link:

Original Repository

Original Model Card:

x8 MoE of https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v0.6

Format: GGUF

Model size: 6B params

Architecture: llama

Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit
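
As a rough guide to what each quantization level means for download size and memory, the sketch below multiplies the listed parameter count (6B, a rounded figure) by the nominal bit width. The function name `estimate_gguf_size_gib` is hypothetical and introduced only for illustration. Treat the results as a lower bound under these assumptions: real GGUF quantization types typically store somewhat more than their nominal bits per weight, and the file also carries metadata and the tokenizer, so actual files are usually larger.

```python
def estimate_gguf_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Rough lower-bound size in GiB: parameters * bits per weight / 8 bytes."""
    return n_params * bits_per_weight / 8 / 2**30


# Assumption: "6B params" from the model card, which is a rounded figure.
N_PARAMS = 6e9

# Nominal bit widths listed on the model card.
NOMINAL_BITS = [2, 3, 4, 5, 6, 8]

if __name__ == "__main__":
    for bits in NOMINAL_BITS:
        size = estimate_gguf_size_gib(N_PARAMS, bits)
        print(f"{bits}-bit: at least ~{size:.2f} GiB")
```

Comparing these estimates against available RAM or VRAM gives a first indication of which quantization is worth downloading; the file sizes shown in the repository's file listing remain the authoritative numbers.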
