# DeepSeek-Coder-V2-Lite-Instruct-GGUF

GGUF quantizations of DeepSeek-Coder-V2-Lite-Instruct with imatrix calibration for improved accuracy at lower bit depths.

## Available Quants

| Quant | Size | Use Case |
|--------|--------|---------------------------|
| Q6_K | ~13 GB | Best quality, high VRAM |
| Q5_K_M | ~11 GB | Recommended balance |
| Q4_K_M | ~9 GB | Low VRAM / fast inference |

All quants were generated with imatrix calibration data, which gives lower perplexity than standard quants at the same bit depth.

## Usage

Load with llama.cpp, Ollama, LM Studio, or any other GGUF-compatible runtime. A minimal Python example follows.
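
A minimal sketch using the llama-cpp-python bindings (`pip install llama-cpp-python`). The repo id and quant filename pattern are assumptions based on the quants listed above; adjust them to the actual file names in this repository.

```python
from llama_cpp import Llama

# Download a quant from the Hub and load it.
# repo_id and filename pattern are assumed -- check the repo's file list.
llm = Llama.from_pretrained(
    repo_id="mad-lab-ai/deepseek-coder-v2-lite-instruct-GGUF",
    filename="*Q4_K_M.gguf",   # pick the quant that fits your VRAM
    n_ctx=4096,                # context window; raise if you have the memory
    n_gpu_layers=-1,           # offload all layers to GPU; set to 0 for CPU-only
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a Python function that checks if a number is prime."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```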

## Original Model

[deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct)
