Mistral-7B-Instruct-v0.3

Mistral-7B-Instruct-v0.3 is an open-source, instruction-tuned large language model developed by Mistral AI, designed for high-quality conversational and instruction-following tasks. It builds on the base Mistral-7B-v0.3 model and introduces improved instruction performance, extended vocabulary support, and function-calling capabilities. For efficient local deployment, the model is also available in GGUF quantized variants (such as Q4_K_M and Q5_K_M), which significantly reduce memory usage and improve inference speed on CPUs and consumer-grade GPUs while largely preserving response quality, reasoning ability, and coding performance.


Overview

  • Model name: Mistral-7B-Instruct-v0.3
  • Base architecture: Transformer-based, decoder-only
  • Developer: Mistral AI
  • License: Apache-2.0
  • Quantized versions:
    • Q4_K_M (4-bit quantization)
    • Q5_K_M (5-bit quantization)
  • Parameter count: ~7 billion
  • Tokenization: v3 tokenizer with extended vocabulary (32,768 tokens)
  • Type: Instruction-tuned LLM

Quantization Details

Q4_K_M

  • ~70% size reduction vs. FP16
  • Lower memory footprint (~4.07 GB)
  • Optimized for low-resource environments
  • Faster inference on CPU
  • Minor degradation in complex reasoning tasks

Q5_K_M

  • ~65% size reduction vs. FP16
  • Larger file (~4.78 GB) with closer fidelity to the original FP16 model
  • Improved coherence and reasoning
  • Recommended when VRAM allows
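The quoted file sizes can be sanity-checked with simple arithmetic. A minimal sketch, assuming roughly 7.25 billion weights (the "7B" class sits slightly above 7B); the numbers are back-of-envelope estimates, not official figures:

```python
# Rough check of the quantized file sizes quoted above.
# Assumption: ~7.25e9 weights for a "7B"-class model.
PARAMS = 7.25e9

def file_size_gb(bits_per_weight: float, params: float = PARAMS) -> float:
    """Approximate GGUF file size in GB for a given average bits/weight."""
    return params * bits_per_weight / 8 / 1e9

def effective_bits(size_gb: float, params: float = PARAMS) -> float:
    """Average bits per weight implied by an observed file size."""
    return size_gb * 1e9 * 8 / params

# K-quants mix precisions, so the average sits above the nominal bit width:
print(round(effective_bits(4.07), 2))  # 4.49 bits/weight for Q4_K_M
print(round(effective_bits(4.78), 2))  # 5.27 bits/weight for Q5_K_M
```

This also explains why a "4-bit" file is larger than 7e9 × 4 bits: the K_M schemes keep some tensors at higher precision.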

Key Features

  • Instruction-tuned model
    Fine-tuned to follow natural language instructions with high quality.

  • Multi-turn dialogue
    Maintains conversational context across multiple interactions.

  • Coding and debugging assistance
    Generates, explains, and corrects code across popular languages.

  • Logical reasoning
    Handles structured reasoning and problem-solving tasks.

  • Extended token support
    Uses a large 32,768 token vocabulary for broad language coverage.

  • Function calling support
    Can emit structured calls that match user-specified function or API schemas.
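The function-calling flow can be sketched as a prompt/response exchange. This is a minimal illustration assuming the [AVAILABLE_TOOLS] / [TOOL_CALLS] control tokens introduced with the v3 tokenizer; the get_weather tool schema is a made-up example, not part of the model:

```python
import json

# Hypothetical tool schema advertised to the model (illustrative only).
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Tools are serialized into the prompt ahead of the user instruction.
prompt = (
    f"[AVAILABLE_TOOLS] {json.dumps(tools)} [/AVAILABLE_TOOLS]"
    "[INST] What's the weather in Paris? [/INST]"
)

# A well-formed tool-call reply would look roughly like this:
reply = '[TOOL_CALLS] [{"name": "get_weather", "arguments": {"city": "Paris"}}]'

# The caller strips the control token and parses the JSON payload.
calls = json.loads(reply.removeprefix("[TOOL_CALLS] "))
print(calls[0]["name"])       # get_weather
print(calls[0]["arguments"])  # {'city': 'Paris'}
```

The application is responsible for executing the named function and feeding the result back to the model in a follow-up turn.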


Training Details

Mistral-7B-Instruct-v0.3 is derived from the base Mistral-7B model and is further fine-tuned for instruction following. It benefits from an extended vocabulary and tokenizer improvements introduced in the v0.3 base checkpoint.

  • Base training: The base Mistral-7B model is trained with autoregressive language modeling on large, diverse text corpora.
  • Instruction tuning: This variant is fine-tuned on instruction datasets to improve chat and task completion quality.
  • Tokenizer: v3 tokenizer with larger vocabulary (32,768 tokens) improves text understanding.
  • Supports function calling: Enables structured actions as part of responses.
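Instruction tuning expects prompts in the Mistral instruct format: the user turn is wrapped in [INST] ... [/INST] tags. A minimal sketch (note that llama.cpp prepends the BOS token automatically, so the literal <s> here is for illustration only):

```python
# Minimal sketch of the single-turn Mistral instruct prompt format.
def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the [INST] template with a BOS marker."""
    return f"<s>[INST] {instruction} [/INST]"

print(build_prompt("Explain transformers in simple terms."))
# <s>[INST] Explain transformers in simple terms. [/INST]
```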

Usage

llama.cpp

./llama-cli \
  -m SandLogicTechnologies/Mistral-7B-Instruct-v0.3_Q4_k_m.gguf \
  -p "Explain transformers in simple terms."
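The command above covers a single turn. For multi-turn chat, the same template is extended: each completed assistant answer is closed with </s>, and every user turn gets a fresh [INST] block. A sketch of that folding (the helper name is illustrative, not a library API):

```python
# Sketch: fold a multi-turn history into one Mistral prompt string.
# Each (user, assistant) pair becomes "[INST] user [/INST] assistant</s>";
# the final, unanswered user message ends the prompt with "[/INST]".
def build_chat_prompt(history, user_message):
    parts = ["<s>"]
    for user, assistant in history:
        parts.append(f"[INST] {user} [/INST] {assistant}</s>")
    parts.append(f"[INST] {user_message} [/INST]")
    return "".join(parts)

prompt = build_chat_prompt(
    [("What is GGUF?", "A binary format for quantized LLM weights.")],
    "Which quantization should I pick?",
)
print(prompt)
```

In practice, llama.cpp's conversation mode applies the model's chat template for you; the sketch only shows what the resulting string looks like.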

Recommended Use Cases

  • Local AI assistants
    Offline conversational agents with instruction following.

  • Coding help and debugging
    Autosuggest and correct code interactively.

  • Research and reasoning tasks
    Assist with structured technical and analytical questions.

  • Function calling workflows
    Integrate model responses with structured actions.

  • General text generation & summarization
    High-quality completion of broad tasks.

Acknowledgments

These quantized models are based on the original work of the Mistral AI development team.

Special thanks to:

  • The mistralai team for developing and releasing the Mistral-7B-Instruct-v0.3 model.

  • Georgi Gerganov and the entire llama.cpp open-source community for enabling efficient model quantization and inference via the GGUF format.


Contact

For any inquiries or support, please contact us at support@sandlogic.com or visit our website.
