Mistral-7B-Instruct-v0.3

Mistral-7B-Instruct-v0.3 is an open-source, instruction-tuned large language model developed by Mistral AI, designed for high-quality conversational and instruction-following tasks. It builds on the base Mistral-7B-v0.3 model and introduces improved instruction performance, extended vocabulary support, and function-calling capabilities. For efficient local deployment, the model is also available in GGUF quantized variants (such as Q4_K_M and Q5_K_M), which significantly reduce memory usage and improve inference speed on CPUs and consumer-grade GPUs while largely preserving response quality, reasoning ability, and coding performance.


Overview

  • Model name: Mistral-7B-Instruct-v0.3
  • Base architecture: Transformer-based, decoder-only
  • Developer: Mistral AI
  • License: Apache-2.0
  • Quantized versions:
    • Q4_K_M (4-bit quantization)
    • Q5_K_M (5-bit quantization)
  • Parameter count: ~7 billion
  • Tokenization: v3 tokenizer with extended vocabulary (32,768 tokens)
  • Type: Instruction-tuned LLM

Quantization Details

Q4_K_M

  • ~70% size reduction vs. FP16
  • Lower memory footprint (~4.07 GB)
  • Optimized for low-resource environments
  • Faster inference on CPU
  • Minor degradation in complex reasoning tasks

Q5_K_M

  • ~65% size reduction vs. FP16
  • Larger file (~4.78 GB) with closer fidelity to the original FP16 model
  • Improved coherence and reasoning
  • Recommended when VRAM allows
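The quoted file sizes can be sanity-checked with simple arithmetic. A minimal sketch, assuming roughly 7.25 billion weights (the "7B" class sits slightly above 7B); the numbers are back-of-envelope estimates, not official figures:

```python
# Rough check of the quantized file sizes quoted above.
# Assumption: ~7.25e9 weights for a "7B"-class model.
PARAMS = 7.25e9

def file_size_gb(bits_per_weight: float, params: float = PARAMS) -> float:
    """Approximate GGUF file size in GB for a given average bits/weight."""
    return params * bits_per_weight / 8 / 1e9

def effective_bits(size_gb: float, params: float = PARAMS) -> float:
    """Average bits per weight implied by an observed file size."""
    return size_gb * 1e9 * 8 / params

# K-quants mix precisions, so the average sits above the nominal bit width:
print(round(effective_bits(4.07), 2))  # 4.49 bits/weight for Q4_K_M
print(round(effective_bits(4.78), 2))  # 5.27 bits/weight for Q5_K_M
```

This also explains why a "4-bit" file is larger than 7e9 × 4 bits: the K_M schemes keep some tensors at higher precision.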

Key Features

  • Instruction-tuned model
    Fine-tuned to follow natural language instructions with high quality.

  • Multi-turn dialogue
    Maintains conversational context across multiple interactions.

  • Coding and debugging assistance
    Generates, explains, and corrects code across popular languages.

  • Logical reasoning
    Handles structured reasoning and problem-solving tasks.

  • Extended token support
    Uses a large 32,768 token vocabulary for broad language coverage.

  • Function calling support
    Can emit structured calls that match user-specified function or API schemas.
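The function-calling flow can be sketched as a prompt/response exchange. This is a minimal illustration assuming the [AVAILABLE_TOOLS] / [TOOL_CALLS] control tokens introduced with the v3 tokenizer; the get_weather tool schema is a made-up example, not part of the model:

```python
import json

# Hypothetical tool schema advertised to the model (illustrative only).
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Tools are serialized into the prompt ahead of the user instruction.
prompt = (
    f"[AVAILABLE_TOOLS] {json.dumps(tools)} [/AVAILABLE_TOOLS]"
    "[INST] What's the weather in Paris? [/INST]"
)

# A well-formed tool-call reply would look roughly like this:
reply = '[TOOL_CALLS] [{"name": "get_weather", "arguments": {"city": "Paris"}}]'

# The caller strips the control token and parses the JSON payload.
calls = json.loads(reply.removeprefix("[TOOL_CALLS] "))
print(calls[0]["name"])       # get_weather
print(calls[0]["arguments"])  # {'city': 'Paris'}
```

The application is responsible for executing the named function and feeding the result back to the model in a follow-up turn.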


Training Details

Mistral-7B-Instruct-v0.3 is derived from the base Mistral-7B model and is further fine-tuned for instruction following. It benefits from an extended vocabulary and tokenizer improvements introduced in the v0.3 base checkpoint.

  • Base training: The base Mistral-7B model is trained with autoregressive language modeling on large, diverse text corpora.
  • Instruction tuning: This variant is fine-tuned on instruction datasets to improve chat and task completion quality.
  • Tokenizer: v3 tokenizer with larger vocabulary (32,768 tokens) improves text understanding.
  • Supports function calling: Enables structured actions as part of responses.
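Instruction tuning expects prompts in the Mistral instruct format: the user turn is wrapped in [INST] ... [/INST] tags. A minimal sketch (note that llama.cpp prepends the BOS token automatically, so the literal <s> here is for illustration only):

```python
# Minimal sketch of the single-turn Mistral instruct prompt format.
def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the [INST] template with a BOS marker."""
    return f"<s>[INST] {instruction} [/INST]"

print(build_prompt("Explain transformers in simple terms."))
# <s>[INST] Explain transformers in simple terms. [/INST]
```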

Usage

llama.cpp

./llama-cli \
  -m SandLogicTechnologies/Mistral-7B-Instruct-v0.3_Q4_k_m.gguf \
  -p "Explain transformers in simple terms."
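The command above covers a single turn. For multi-turn chat, the same template is extended: each completed assistant answer is closed with </s>, and every user turn gets a fresh [INST] block. A sketch of that folding (the helper name is illustrative, not a library API):

```python
# Sketch: fold a multi-turn history into one Mistral prompt string.
# Each (user, assistant) pair becomes "[INST] user [/INST] assistant</s>";
# the final, unanswered user message ends the prompt with "[/INST]".
def build_chat_prompt(history, user_message):
    parts = ["<s>"]
    for user, assistant in history:
        parts.append(f"[INST] {user} [/INST] {assistant}</s>")
    parts.append(f"[INST] {user_message} [/INST]")
    return "".join(parts)

prompt = build_chat_prompt(
    [("What is GGUF?", "A binary format for quantized LLM weights.")],
    "Which quantization should I pick?",
)
print(prompt)
```

In practice, llama.cpp's conversation mode applies the model's chat template for you; the sketch only shows what the resulting string looks like.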

Recommended Use Cases

  • Local AI assistants
    Offline conversational agents with instruction following.

  • Coding help and debugging
    Autosuggest and correct code interactively.

  • Research and reasoning tasks
    Assist with structured technical and analytical questions.

  • Function calling workflows
    Integrate model responses with structured actions.

  • General text generation & summarization
    High-quality completion of broad tasks.

Acknowledgments

These quantized models are based on the original work of the Mistral AI development team.

Special thanks to:

  • The mistralai team for developing and releasing the Mistral-7B-Instruct-v0.3 model.

  • Georgi Gerganov and the entire llama.cpp open-source community for enabling efficient model quantization and inference via the GGUF format.


Contact

For any inquiries or support, please contact us at support@sandlogic.com or visit our website.
