
------------------------------------------------
- Model Details and Specifications: -
------------------------------------------------
Ministral-3 14B Reasoning 2512 (GGUF)

--------------------

NOTICE:

After post-upload testing, I noticed that the chat template does not engage the thinking tags correctly when the model is pulled from HuggingFace. I will be correcting this today, or tomorrow at the latest!

--------------------

This release contains:
GGUF-converted and quantized model files, compatible with both Llama.cpp and Ollama.

Quantized GGUF version of:

  • Ministral-3-14B-Reasoning-2512-BF16
    (by MistralAI)

Original Model Link:

Description:
This release includes GGUF model files (compatible with both Ollama and Llama.cpp) and two working multi-modal projector (mmproj) files for the vision projector, offering the model's full capabilities in Ollama or Llama.cpp.

What is the "Custom Tokenizer Chat Template?"
As opposed to the standard chat template made available by MistralAI, this release of GGUF-converted and quantized files offers a fully custom tokenizer chat template, designed to provide smoother, faster, more efficient, and more reliable interaction/inference with the model. The template sheds the "fluff" (non-primary logic) from the stock chat/tokenizer template, letting anyone who inferences with the model enjoy an improvement in speed, quality, and context adherence without sacrificing any aspect of MistralAI's initial release.

For reference, here is the new tokenizer chat template (written in Go template syntax, as used by Ollama):
(This template features a sliding context window of forty-six (46) messages. To adjust it, change the number forty-seven (47) on the fourth line of the template: the window always keeps one fewer message than that number, so raise or lower it to grow or shrink the sliding context window.)

{{- $remMessage := false }}
[SYSTEM_PROMPT]{{- "🟦 Follow instructions that the user provides. Think and respond to the user in the language they use or request. Next sections describes the capabilities that you have. \n\n🟦 [Reasoning Instructions]\nYou have the ability to think before responding to the user. Always start your response by thinking, using an internal monologue. Always use this template when you respond: <think> thoughts and internal monologue </think> then respond directly to the user.\n\n🟦 [Multi-Modal Instructions]\nYou have the ability to read images." }}[/SYSTEM_PROMPT]
{{- range $index, $_ := .Messages }}
{{- if lt (len (slice $.Messages $index)) 47 }}
{{- $remMessage = true }}
{{- end }}
{{- if $remMessage }}{{- if eq .Role "user" }}
[INST]{{ .Content }}[/INST]
{{- else if eq .Role "assistant" }}
{{ .Content }}{{- end }}
{{- end }}
{{- end }}
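For anyone who wants to verify the window size before editing that number, the template's slice logic can be sketched in plain Python (the message list here is hypothetical; only the arithmetic mirrors the template):

```python
# Sketch of the template's sliding-window logic in plain Python.
# WINDOW_GUARD mirrors the "47" on the template's fourth line: a message
# is kept when fewer than 47 messages remain from its index onward,
# i.e. only the last 46 messages survive.
WINDOW_GUARD = 47

def windowed(messages):
    """Return the trailing messages the template would render."""
    kept = []
    for index in range(len(messages)):
        remaining = len(messages) - index  # len (slice $.Messages $index)
        if remaining < WINDOW_GUARD:
            kept.append(messages[index])
    return kept

# Hypothetical 100-message history: only the last 46 are rendered.
history = [f"msg-{i}" for i in range(100)]
print(len(windowed(history)))  # 46
print(windowed(history)[0])    # msg-54
```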

No modifications, edits, or configurations are required to use this model with Ollama or Llama.cpp; it works natively! Both vision and text work with Ollama as well. (^.^)

Coming Soon!!!
Check back occasionally, as an automated installer/configurator Python-3 script is making its way to all of my releases! It gives anyone interested in these models a hassle-free, stress-free experience: the script takes care of setting up the model for Ollama (specifically Ollama; optimizations for other software are coming later). Until then, it is highly recommended to use the Ollama "create" command along with the supplied ".modelfile" to ensure proper configuration and get the most out of this release. The Python-3 installer/configurator will handle these steps for you if you choose to use it.
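As a sketch of that "create" workflow (the file names below are hypothetical placeholders; substitute the actual GGUF and .modelfile names from this repository):

```shell
# A minimal Modelfile points Ollama at the quantized weights, e.g.:
#   FROM ./Ministral-3-14B-Reasoning-2512-Q4_K_M.gguf
#
# Register the model with Ollama using the supplied .modelfile,
# then run it:
ollama create ministral-3-14b-reasoning -f ./Ministral-3-14B-Reasoning-2512.modelfile
ollama run ministral-3-14b-reasoning
```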

Happy Inferencing!
-- Jon Z (EnlistedGhost)


Model Updates (As of: March 26, 2026)

  • Updated: Uploaded/added all GGUF conversion(s) and non-i-matrix quantized model file(s).
    The final quantized and full-F16 model files are uploaded! Check back for i-matrix quant model files if you do not see your desired edition (they are being uploaded, thank you for your patience!)

-------------------------------------------------------------
- GGUF Conversion and Quantization Details: -
-------------------------------------------------------------

Software used to convert Safetensors to GGUF:

Software used to create Quantized GGUF Files:

Specific GitHub Commit Point:

Converted to GGUF and Quantized by:


--------------------------
---- Original Info ----
--------------------------

(Crossposted from the link in the above section: "Model Details"):






Ministral 3 14B Reasoning 2512 BF16

The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to its larger Mistral Small 3.2 24B counterpart. A powerful and efficient language model with vision capabilities.

This model is the reasoning post-trained version, trained for reasoning tasks, making it ideal for math, coding, and STEM-related use cases.

The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware. Ministral 3 14B can even be deployed locally, capable of fitting in 32GB of VRAM in BF16, and less than 24GB of RAM/VRAM when quantized.
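As a rough sanity check on those figures, the weight memory can be estimated from the parameter counts given under Key Features (a back-of-envelope sketch; the ~4.5 bits/parameter figure for a 4-bit K-quant is an assumption, and activation/KV-cache overhead is ignored):

```python
# Back-of-envelope memory estimate for the weights alone
# (activations and KV cache need extra headroom on top).
params = 13.5e9 + 0.4e9   # language model + vision encoder, per the model card

bf16_gb = params * 2 / 1024**3        # 2 bytes per parameter
q4_gb = params * 4.5 / 8 / 1024**3    # assumed ~4.5 bits/param for a 4-bit quant

print(f"BF16: ~{bf16_gb:.1f} GB")  # ~25.9 GB -> fits in 32 GB of VRAM
print(f"Q4:   ~{q4_gb:.1f} GB")    # ~7.3 GB -> well under 24 GB
```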

Learn more in our blog post and paper.

Key Features

Ministral 3 14B consists of two main architectural components:

  • 13.5B Language Model
  • 0.4B Vision Encoder

The Ministral 3 14B Reasoning model offers the following capabilities:

  • Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text.
  • Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic.
  • System Prompt: Maintains strong adherence and support for system prompts.
  • Agentic: Offers best-in-class agentic capabilities with native function calling and JSON outputting.
  • Reasoning: Excels at complex, multi-step reasoning and dynamic problem-solving.
  • Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere.
  • Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
  • Large Context Window: Supports a 256k context window.

Use Cases

Private AI deployments where advanced capabilities meet practical hardware constraints:

  • Private/custom chat and AI assistant deployments in constrained environments
  • Advanced local agentic use cases
  • Fine-tuning and specialization
  • And more...

Bringing advanced AI capabilities to most environments.

Recommended Settings

We recommend deploying with the following best practices:

  • System Prompt: Use our provided system prompt, and append it to your custom system prompt to define a clear environment and use case, including guidance on how to effectively leverage tools in agentic systems.
  • Multi-turn Traces: We highly recommend keeping the reasoning traces in context.
  • Sampling Parameters: Use a temperature of 1 for most environments; different temperatures may be explored for different use cases, and developers are encouraged to experiment with alternative settings.
  • Tools: Keep the set of tools well-defined and limit their number to the minimum required for the use case, avoiding overloading the model with an excessive number of tools.
  • Vision: When deploying with vision capabilities, we recommend maintaining an aspect ratio close to 1:1 (width-to-height) for images. Avoid overly thin or wide images; crop them as needed to ensure optimal performance.
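The cropping advice above can be sketched as a small helper that computes a centered crop box capping the long-side/short-side ratio (a hypothetical helper, not part of any release; the 2:1 cap is an arbitrary example value):

```python
def center_crop_box(width, height, max_ratio=2.0):
    """Compute a centered crop box (left, top, right, bottom) that caps the
    long-side/short-side ratio at max_ratio, nudging images toward ~1:1."""
    if width >= height:
        new_w = min(width, int(height * max_ratio))
        left = (width - new_w) // 2
        return (left, 0, left + new_w, height)
    new_h = min(height, int(width * max_ratio))
    top = (height - new_h) // 2
    return (0, top, width, top + new_h)

# A very wide 4000x500 image gets center-cropped to 1000x500 (2:1 ratio).
print(center_crop_box(4000, 500))  # (1500, 0, 2500, 500)
```

The resulting box can be passed to any imaging library's crop routine before sending the image to the model.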

Ministral 3 Family

Model Name                     | Type                  | Precision | Link
Ministral 3 3B Base 2512       | Base pre-trained      | BF16      | Hugging Face
Ministral 3 3B Instruct 2512   | Instruct post-trained | FP8       | Hugging Face
Ministral 3 3B Reasoning 2512  | Reasoning capable     | BF16      | Hugging Face
Ministral 3 8B Base 2512       | Base pre-trained      | BF16      | Hugging Face
Ministral 3 8B Instruct 2512   | Instruct post-trained | FP8       | Hugging Face
Ministral 3 8B Reasoning 2512  | Reasoning capable     | BF16      | Hugging Face
Ministral 3 14B Base 2512      | Base pre-trained      | BF16      | Hugging Face
Ministral 3 14B Instruct 2512  | Instruct post-trained | FP8       | Hugging Face
Ministral 3 14B Reasoning 2512 | Reasoning capable     | BF16      | Hugging Face

Other formats available here.

Benchmark Results

We compare Ministral 3 to similar sized models.

Reasoning

Model                 | AIME25 | AIME24 | GPQA Diamond | LiveCodeBench
Ministral 3 14B       | 0.850  | 0.898  | 0.712        | 0.646
Qwen3-14B (Thinking)  | 0.737  | 0.837  | 0.663        | 0.593
Ministral 3 8B        | 0.787  | 0.860  | 0.668        | 0.616
Qwen3-VL-8B-Thinking  | 0.798  | 0.860  | 0.671        | 0.580
Ministral 3 3B        | 0.721  | 0.775  | 0.534        | 0.548
Qwen3-VL-4B-Thinking  | 0.697  | 0.729  | 0.601        | 0.513

Instruct

Model                    | Arena Hard | WildBench | MATH Maj@1 | MM MTBench
Ministral 3 14B          | 0.551      | 68.5      | 0.904      | 8.49
Qwen3 14B (Non-Thinking) | 0.427      | 65.1      | 0.870      | not multimodal
Gemma3-12B-Instruct      | 0.436      | 63.2      | 0.854      | 6.70
Ministral 3 8B           | 0.509      | 66.8      | 0.876      | 8.08
Qwen3-VL-8B-Instruct     | 0.528      | 66.3      | 0.946      | 8.00
Ministral 3 3B           | 0.305      | 56.8      | 0.830      | 7.83
Qwen3-VL-4B-Instruct     | 0.438      | 56.8      | 0.900      | 8.01
Qwen3-VL-2B-Instruct     | 0.163      | 42.2      | 0.786      | 6.36
Gemma3-4B-Instruct       | 0.318      | 49.1      | 0.759      | 5.23

Base

Model             | Multilingual MMLU | MATH CoT 2-Shot | AGIEval 5-shot | MMLU Redux 5-shot | MMLU 5-shot | TriviaQA 5-shot
Ministral 3 14B   | 0.742             | 0.676           | 0.648          | 0.820             | 0.794       | 0.749
Qwen3 14B Base    | 0.754             | 0.620           | 0.661          | 0.837             | 0.804       | 0.703
Gemma 3 12B Base  | 0.690             | 0.487           | 0.587          | 0.766             | 0.745       | 0.788
Ministral 3 8B    | 0.706             | 0.626           | 0.591          | 0.793             | 0.761       | 0.681
Qwen 3 8B Base    | 0.700             | 0.576           | 0.596          | 0.794             | 0.760       | 0.639
Ministral 3 3B    | 0.652             | 0.601           | 0.511          | 0.735             | 0.707       | 0.592
Qwen 3 4B Base    | 0.677             | 0.405           | 0.570          | 0.759             | 0.713       | 0.530
Gemma 3 4B Base   | 0.516             | 0.294           | 0.430          | 0.626             | 0.589       | 0.640

License

This model is licensed under the Apache 2.0 License.

You must not use this model in a manner that infringes, misappropriates, or otherwise violates any third party’s rights, including intellectual property rights.

Format: GGUF
Model size: 14B params
Architecture: mistral3
Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, 16-bit

