This is a Ministral-3-3B-Instruct-2512 fine-tune, produced through a slightly modified version of P-E-W's Heretic (v1.1.0) abliteration engine merged with the Magnitude-Preserving Orthogonal Ablation PR.

Note: This heretic model is highly uncensored; thus use it with extreme caution and care. Requires Transformers v5 to interface.

Heretication Results

Score Metric	Value	Parameter	Value
Refusals	2/100	direction_index	14.81
KL Divergence	0.0284	attn.o_proj.max_weight	0.87
		attn.o_proj.max_weight_position	19.93
		attn.o_proj.min_weight	0.60
		attn.o_proj.min_weight_distance	14.43
		mlp.down_proj.max_weight	1.44
		mlp.down_proj.max_weight_position	19.00
		mlp.down_proj.min_weight	0.93
		mlp.down_proj.min_weight_distance	13.80

Degree of Heretication

The Heresy Index weighs the resulting model's corruption by the process (KL Divergence) and its abolition of doctrine (Refusals) for a final verdict in classification.

Index Entry	Classification	Analysis
	Absolute Heresy	Less than 10/100 Refusals and 0.10 KL Divergence
	Tainted Heresy	Around 25-11/100 Refusals and/or -0.20-0.11 KL Divergence
	Impotent Heresy	Anything above 25/100 Refusals and 0.21 KL Divergence

Note: This is an arbitrary classification inspired by Warhammer 40K, having no tangible indication towards the model's performance.

Ministral 3 3B Instruct 2512 BF16

The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision capabilities.

This model is the instruct post-trained version, fine-tuned for instruction tasks, making it ideal for chat and instruction based use cases.

The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware. Ministral 3 3B can even be deployed locally, capable of fitting in 16GB of VRAM in BF16, and less than 8GB of RAM/VRAM when quantized.

We provide a no-loss FP8 version here, you can find other formats and quantizations in the Ministral 3 - Additional Checkpoints collection.

Learn more in our blog post and paper.

Key Features

Ministral 3 3B consists of two main architectural components:

3.4B Language Model
0.4B Vision Encoder

The Ministral 3 3B Instruct model offers the following capabilities:

Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text.
Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic.
System Prompt: Maintains strong adherence and support for system prompts.
Agentic: Offers best-in-class agentic capabilities with native function calling and JSON outputting.
Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere.
Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
Large Context Window: Supports a 256k context window.

Use Cases

Ideal for lightweight, real-time applications on edge or low-resource devices, such as:

Image captioning
Text classification
Real-time efficient translation
Data extraction
Short content generation
Fine-tuning and specialization
And more...

Bringing advanced AI capabilities to edge and distributed environments for embedded systems.

Ministral 3 Family

Model Name	Type	Precision	Link
Ministral 3 3B Base 2512	Base pre-trained	BF16	Hugging Face
Ministral 3 3B Instruct 2512	Instruct post-trained	BF16	Hugging Face
Ministral 3 3B Reasoning 2512	Reasoning capable	BF16	Hugging Face
Ministral 3 8B Base 2512	Base pre-trained	BF16	Hugging Face
Ministral 3 8B Instruct 2512	Instruct post-trained	BF16	Hugging Face
Ministral 3 8B Reasoning 2512	Reasoning capable	BF16	Hugging Face
Ministral 3 14B Base 2512	Base pre-trained	BF16	Hugging Face
Ministral 3 14B Instruct 2512	Instruct post-trained	BF16	Hugging Face
Ministral 3 14B Reasoning 2512	Reasoning capable	BF16	Hugging Face

Other formats available here.

Benchmark Results

We compare Ministral 3 to similar sized models.

Reasoning

Model	AIME25	AIME24	GPQA Diamond	LiveCodeBench
Ministral 3 14B	0.850	0.898	0.712	0.646
Qwen3-14B (Thinking)	0.737	0.837	0.663	0.593

Ministral 3 8B	0.787	0.860	0.668	0.616
Qwen3-VL-8B-Thinking	0.798	0.860	0.671	0.580

Ministral 3 3B	0.721	0.775	0.534	0.548
Qwen3-VL-4B-Thinking	0.697	0.729	0.601	0.513

Instruct

Model	Arena Hard	WildBench	MATH Maj@1	MM MTBench
Ministral 3 14B	0.551	68.5	0.904	8.49
Qwen3 14B (Non-Thinking)	0.427	65.1	0.870	NOT MULTIMODAL
Gemma3-12B-Instruct	0.436	63.2	0.854	6.70

Ministral 3 8B	0.509	66.8	0.876	8.08
Qwen3-VL-8B-Instruct	0.528	66.3	0.946	8.00

Ministral 3 3B	0.305	56.8	0.830	7.83
Qwen3-VL-4B-Instruct	0.438	56.8	0.900	8.01
Qwen3-VL-2B-Instruct	0.163	42.2	0.786	6.36
Gemma3-4B-Instruct	0.318	49.1	0.759	5.23

Base

Model	Multilingual MMLU	MATH CoT 2-Shot	AGIEval 5-shot	MMLU Redux 5-shot	MMLU 5-shot	TriviaQA 5-shot
Ministral 3 14B	0.742	0.676	0.648	0.820	0.794	0.749
Qwen3 14B Base	0.754	0.620	0.661	0.837	0.804	0.703
Gemma 3 12B Base	0.690	0.487	0.587	0.766	0.745	0.788

Ministral 3 8B	0.706	0.626	0.591	0.793	0.761	0.681
Qwen 3 8B Base	0.700	0.576	0.596	0.794	0.760	0.639

Ministral 3 3B	0.652	0.601	0.511	0.735	0.707	0.592
Qwen 3 4B Base	0.677	0.405	0.570	0.759	0.713	0.530
Gemma 3 4B Base	0.516	0.294	0.430	0.626	0.589	0.640

License

This model is licensed under the Apache 2.0 License.

You must not use this model in a manner that infringes, misappropriates, or otherwise violates any third party’s rights, including intellectual property rights.