# Llama 3.2 3B Quantized (q8_0, GGUF)
This repository provides an 8-bit quantized version of Meta's Llama 3.2 3B for efficient deployment in resource-constrained environments (CPU and small GPUs). The GGUF file uses q8_0 quantization (8-bit), a good tradeoff between size and quality for small models. Please refer to the original model card for full details on its capabilities and limitations.
- **Base model:** Llama 3.2 3B (Meta AI)
- **Quantization:** 8-bit post-training quantization (q8_0), GGUF
- **Format:** GGUF (compatible with llama.cpp, GPT4All, Ollama)
- **Model file:** `llama_3.2_3b_q8_k_m.gguf`
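To give a sense of what q8_0 stores, here is an illustrative, self-contained sketch of the scheme: each block of 32 float values is reduced to one float scale plus 32 signed 8-bit integers. The function names are hypothetical; the real implementation lives in ggml (`quantize_row_q8_0`).

```python
# Illustrative sketch of q8_0 quantization (names are hypothetical;
# see ggml's quantize_row_q8_0 for the real implementation).
BLOCK_SIZE = 32

def quantize_q8_0(block):
    """Quantize one block of 32 floats to (scale, list of int8 values)."""
    amax = max(abs(v) for v in block)
    scale = amax / 127.0 if amax else 0.0
    inv = 1.0 / scale if scale else 0.0
    qs = [max(-127, min(127, round(v * inv))) for v in block]
    return scale, qs

def dequantize_q8_0(scale, qs):
    """Reconstruct approximate floats from the quantized block."""
    return [scale * q for q in qs]

block = [0.5, -1.0, 0.25, 0.75] + [0.0] * 28  # one 32-value block
scale, qs = quantize_q8_0(block)
restored = dequantize_q8_0(scale, qs)
# Reconstruction error is bounded by half a quantization step (scale / 2).
assert all(abs(a - b) <= scale / 2 + 1e-9 for a, b in zip(block, restored))
```

At 8 bits per weight plus one scale per 32-weight block, this keeps reconstruction error small, which is why q8_0 is often the highest-quality GGUF quantization short of full precision.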
## Usage (llama.cpp)
```bash
# Build llama.cpp
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make
# Then run the model (recent llama.cpp releases name this binary llama-cli):
./main -m ./llama_3.2_3b_q8_k_m.gguf -p "Hello, how are you?"
```
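If you prefer calling the model from Python, a minimal sketch using the llama-cpp-python bindings (an assumption; the package is not shipped with this repo and must be installed separately via `pip install llama-cpp-python`) could look like this. The model path is an assumption matching the file listed above.

```python
# Hypothetical sketch: run this GGUF file via llama-cpp-python.
# Assumes the model file has been downloaded into the working directory.
from pathlib import Path

MODEL_PATH = Path("llama_3.2_3b_q8_k_m.gguf")  # assumed local path

if MODEL_PATH.exists():
    from llama_cpp import Llama

    llm = Llama(model_path=str(MODEL_PATH), n_ctx=2048)
    out = llm("Hello, how are you?", max_tokens=64)
    print(out["choices"][0]["text"])
else:
    print(f"Model file not found: {MODEL_PATH}")
```

The guard on `MODEL_PATH.exists()` keeps the script from crashing when the (multi-gigabyte) model has not been downloaded yet.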
## Download
You can download this model directly via:
```bash
git lfs install
git clone https://huggingface.co/navyaparesh/llama-3.2-3b-q8-k-m
```
Or programmatically:
```python
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="navyaparesh/llama-3.2-3b-q8-k-m",  # matches the repo cloned above
    local_dir="models/llama3-quantized",
)
```
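After downloading, you can sanity-check that the file really is a GGUF container: GGUF files begin with the 4-byte magic `GGUF` followed by a little-endian uint32 format version. The helper below is a hypothetical sketch (not part of any library), demonstrated on a synthetic header rather than the real multi-gigabyte file.

```python
# Hypothetical helper: check the GGUF magic bytes and read the format version.
import struct
import tempfile

def is_gguf(path):
    """Return (is_valid, version) for a file claiming to be GGUF."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != b"GGUF":
        return False, 0
    return True, struct.unpack("<I", header[4:8])[0]

# Demo on a synthetic 8-byte header (magic + version 3):
with tempfile.NamedTemporaryFile(suffix=".gguf", delete=False) as tmp:
    tmp.write(b"GGUF" + struct.pack("<I", 3))
    demo_path = tmp.name

ok, version = is_gguf(demo_path)
print(ok, version)  # True 3
```

This catches truncated or mislabeled downloads before handing the file to a loader.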
## Model tree

**Base model:** meta-llama/Llama-3.2-3B-Instruct