# Llama 3.2 3B Quantized (q8_0, GGUF)
This repository provides an 8-bit quantized version of Meta's Llama 3.2 3B for efficient deployment in resource-constrained environments (CPU and small GPUs). The GGUF file uses q8_0 quantization (8-bit), a good tradeoff between size and quality for small models. Please refer to the original model card for full details on its capabilities and limitations.
- **Base model:** Llama 3.2 3B (Meta AI)
- **Quantization:** 8-bit post-training quantization (q8_0), GGUF
- **Format:** GGUF (compatible with llama.cpp, GPT4All, Ollama)
- **Model file:** `llama_3.2_3b_q8_k_m.gguf`
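To give a sense of what q8_0 stores, here is an illustrative, self-contained sketch of the scheme: each block of 32 float values is reduced to one float scale plus 32 signed 8-bit integers. The function names are hypothetical; the real implementation lives in ggml (`quantize_row_q8_0`).

```python
# Illustrative sketch of q8_0 quantization (names are hypothetical;
# see ggml's quantize_row_q8_0 for the real implementation).
BLOCK_SIZE = 32

def quantize_q8_0(block):
    """Quantize one block of 32 floats to (scale, list of int8 values)."""
    amax = max(abs(v) for v in block)
    scale = amax / 127.0 if amax else 0.0
    inv = 1.0 / scale if scale else 0.0
    qs = [max(-127, min(127, round(v * inv))) for v in block]
    return scale, qs

def dequantize_q8_0(scale, qs):
    """Reconstruct approximate floats from the quantized block."""
    return [scale * q for q in qs]

block = [0.5, -1.0, 0.25, 0.75] + [0.0] * 28  # one 32-value block
scale, qs = quantize_q8_0(block)
restored = dequantize_q8_0(scale, qs)
# Reconstruction error is bounded by half a quantization step (scale / 2).
assert all(abs(a - b) <= scale / 2 + 1e-9 for a, b in zip(block, restored))
```

At 8 bits per weight plus one scale per 32-weight block, this keeps reconstruction error small, which is why q8_0 is often the highest-quality GGUF quantization short of full precision.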
## Usage (llama.cpp)
```bash
# Build llama.cpp
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make
# Then run the model (recent llama.cpp releases name this binary llama-cli):
./main -m ./llama_3.2_3b_q8_k_m.gguf -p "Hello, how are you?"
```
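If you prefer calling the model from Python, a minimal sketch using the llama-cpp-python bindings (an assumption; the package is not shipped with this repo and must be installed separately via `pip install llama-cpp-python`) could look like this. The model path is an assumption matching the file listed above.

```python
# Hypothetical sketch: run this GGUF file via llama-cpp-python.
# Assumes the model file has been downloaded into the working directory.
from pathlib import Path

MODEL_PATH = Path("llama_3.2_3b_q8_k_m.gguf")  # assumed local path

if MODEL_PATH.exists():
    from llama_cpp import Llama

    llm = Llama(model_path=str(MODEL_PATH), n_ctx=2048)
    out = llm("Hello, how are you?", max_tokens=64)
    print(out["choices"][0]["text"])
else:
    print(f"Model file not found: {MODEL_PATH}")
```

The guard on `MODEL_PATH.exists()` keeps the script from crashing when the (multi-gigabyte) model has not been downloaded yet.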
## Download
You can download this model directly via:
```bash
git lfs install
git clone https://huggingface.co/navyaparesh/llama-3.2-3b-q8-k-m
```
Or programmatically:
```python
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="navyaparesh/llama-3.2-3b-q8-k-m",  # matches the repo cloned above
    local_dir="models/llama3-quantized",
)
```
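After downloading, you can sanity-check that the file really is a GGUF container: GGUF files begin with the 4-byte magic `GGUF` followed by a little-endian uint32 format version. The helper below is a hypothetical sketch (not part of any library), demonstrated on a synthetic header rather than the real multi-gigabyte file.

```python
# Hypothetical helper: check the GGUF magic bytes and read the format version.
import struct
import tempfile

def is_gguf(path):
    """Return (is_valid, version) for a file claiming to be GGUF."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != b"GGUF":
        return False, 0
    return True, struct.unpack("<I", header[4:8])[0]

# Demo on a synthetic 8-byte header (magic + version 3):
with tempfile.NamedTemporaryFile(suffix=".gguf", delete=False) as tmp:
    tmp.write(b"GGUF" + struct.pack("<I", 3))
    demo_path = tmp.name

ok, version = is_gguf(demo_path)
print(ok, version)  # True 3
```

This catches truncated or mislabeled downloads before handing the file to a loader.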
## Model tree

**Base model:** meta-llama/Llama-3.2-3B-Instruct