Ministral-3-3B-Reasoning-2512-heretic-GGUF

GGUF quantizations of coder3101/Ministral-3-3B-Reasoning-2512-heretic for use with llama.cpp and compatible tools.

Model Description

This is a fine-tuned version of Mistral's Ministral-3-3B-Reasoning-2512 vision-language model. It supports:

  • Text generation with reasoning capabilities (uses [THINK] tokens)
  • Vision/image understanding (requires the mmproj file)
  • Tool/Function calling

Available Quantizations

Quantization   Size     Description
BF16           6.4 GB   Full precision (bfloat16)
Q8_0           3.4 GB   8-bit quantization
Q5_K_M         2.3 GB   5-bit K-quant (medium)
Q4_K_M         2.0 GB   4-bit K-quant (medium) - Recommended

Vision Support

For vision/image understanding, you need to download the mmproj (multimodal projector) file:

  • Ministral-3-3B-Reasoning-2512-heretic-mmproj-bf16.gguf (811 MB)

Chat Template

The model includes a custom chat template with reasoning support. The format uses:

  • [SYSTEM_PROMPT]...[/SYSTEM_PROMPT] - System message
  • [INST]...[/INST] - User messages
  • [THINK]...[/THINK] - Model's reasoning/thinking process
  • [IMG] - Image placeholder for vision inputs
  • [TOOL_CALLS] and [TOOL_RESULTS] - For function calling

Example conversation:

[SYSTEM_PROMPT]You are a helpful assistant.[/SYSTEM_PROMPT][INST]What is 2+2?[/INST][THINK]The user is asking for a simple arithmetic calculation. 2+2=4.[/THINK]The answer is 4.
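The example above can be assembled programmatically. Here is a minimal sketch of a prompt builder for the format described (the function name and the turn structure are illustrative, not part of the model card); note that llama.cpp applies the model's embedded chat template automatically, so hand-building the string like this is only needed for raw-prompt (`-p`) usage:

```python
def format_prompt(system: str, turns: list) -> str:
    """Build a raw prompt in the [SYSTEM_PROMPT]/[INST] format.

    turns: list of (user_message, assistant_reply_or_None) pairs;
    the final pair should have None so the model generates the reply.
    """
    out = f"[SYSTEM_PROMPT]{system}[/SYSTEM_PROMPT]"
    for user, assistant in turns:
        out += f"[INST]{user}[/INST]"
        if assistant is not None:
            # Completed assistant turns (including any [THINK] block)
            # are appended verbatim after the closing [/INST].
            out += assistant
    return out

prompt = format_prompt("You are a helpful assistant.",
                       [("What is 2+2?", None)])
print(prompt)
```

The `[THINK]...[/THINK]` reasoning block is produced by the model itself at generation time, so it is not something you include when prompting.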

Usage

Text-only (CLI)

llama-cli -m Ministral-3-3B-Reasoning-2512-heretic-Q4_K_M.gguf \
  -p "[INST]What is the capital of France?[/INST]" \
  -n 256

With Vision Support

llama-mtmd-cli \
  -m Ministral-3-3B-Reasoning-2512-heretic-Q4_K_M.gguf \
  --mmproj Ministral-3-3B-Reasoning-2512-heretic-mmproj-bf16.gguf \
  -p "Describe this image in detail." \
  --image /path/to/image.jpg

With llama-server (OpenAI-compatible API)

llama-server \
  -m Ministral-3-3B-Reasoning-2512-heretic-Q4_K_M.gguf \
  --mmproj Ministral-3-3B-Reasoning-2512-heretic-mmproj-bf16.gguf \
  --port 8080

Then query the API:

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "ministral", "messages": [{"role": "user", "content": "What is 2+2?"}]}'

Original Model

This GGUF is based on coder3101/Ministral-3-3B-Reasoning-2512-heretic.

License

Apache 2.0

Model Details

  • Format: GGUF
  • Model size: 3B params
  • Architecture: mistral3