Voxtral Mini 4B Realtime — MLX 4-bit

A 4-bit quantized MLX build of Voxtral Mini 4B Realtime for Apple Silicon, produced with voxmlx.

Bits per weight: 4.52
Total parameters: 4.4B
Model size: ~2.3 GB
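
The three figures above can be cross-checked with a little arithmetic. The sketch below assumes that each quantization group carries one 16-bit scale and one 16-bit bias alongside its packed weights. That storage layout is an assumption about MLX-style affine quantization, not something this card states. The function names are illustrative only.

```python
def quantized_bits_per_weight(bits: int, group_size: int,
                              scale_bits: int = 16, bias_bits: int = 16) -> float:
    """Effective bits per weight for a group-quantized tensor.

    Assumes one scale and one bias per group (16 bits each by default).
    """
    return bits + (scale_bits + bias_bits) / group_size


def model_size_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in bytes."""
    return n_params * bits_per_weight / 8


# 4-bit weights in groups of 64: 4 + 32/64 = 4.5 bits per weight.
ideal_bpw = quantized_bits_per_weight(bits=4, group_size=64)

# Figures quoted in this card: 4.4B parameters at 4.52 bits per weight.
size_bytes = model_size_bytes(4.4e9, 4.52)
size_gb = size_bytes / 1e9        # decimal gigabytes
size_gib = size_bytes / 2**30     # binary gibibytes

print(f"ideal bits/weight: {ideal_bpw}")
print(f"size: {size_gb:.2f} GB ({size_gib:.2f} GiB)")
```

Under these assumptions the estimate is about 2.49 GB, or about 2.32 GiB, so the quoted "~2.3 GB" appears to be in binary units. The small gap between 4.5 and the reported 4.52 bits per weight would be explained if a few tensors were kept at higher precision, though the card does not say which.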

Conversion

Converted from the original Mistral weights using:

voxmlx-convert -q --bits 4 --group-size 64 --mlx-path voxtral-mlx-4bit
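
To illustrate what the `--bits 4 --group-size 64` settings mean, here is a minimal pure-Python sketch of group-wise affine quantization. It is a teaching example under stated assumptions, not the code that voxmlx or MLX actually runs: the helper names are hypothetical, and real implementations pack the codes into integers and run on the GPU.

```python
import random


def quantize_group(weights, bits=4):
    """Map one group of floats to integer codes in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1              # 15 for 4-bit
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels or 1.0     # fall back to 1.0 for a constant group
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo               # lo acts as the per-group bias


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate floats from codes, scale and bias."""
    return [c * scale + bias for c in codes]


def quantize(weights, bits=4, group_size=64):
    """Quantize a flat list of weights one group at a time."""
    return [
        quantize_group(weights[start:start + group_size], bits)
        for start in range(0, len(weights), group_size)
    ]


random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(256)]   # synthetic data
groups = quantize(weights, bits=4, group_size=64)
restored = [v for codes, scale, bias in groups
            for v in dequantize_group(codes, scale, bias)]
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(f"{len(groups)} groups, max reconstruction error {max_error:.6f}")
```

Each group stores its own scale and bias, so a smaller group size tracks the local weight range more closely at the cost of more metadata per weight.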

Usage

pip install voxmlx

# Transcribe a file
voxmlx --model T0mSIlver/Voxtral-Mini-4B-Realtime-2602-MLX-4bit --audio recording.wav

# Stream from the microphone (omit --audio)
voxmlx --model T0mSIlver/Voxtral-Mini-4B-Realtime-2602-MLX-4bit

Python API

from voxmlx import transcribe

text = transcribe("audio.flac", model_path="T0mSIlver/Voxtral-Mini-4B-Realtime-2602-MLX-4bit")
print(text)
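
For more than one file, the same call can be wrapped in a small loop. The sketch below is a hypothetical helper, not part of voxmlx. It takes the transcription function as a parameter so that it relies only on the `transcribe(path, model_path=...)` call shown above, and it assumes the audio files use the `.wav` or `.flac` extensions that appear in this card.

```python
from pathlib import Path
from typing import Callable, Dict

# Assumption: only the extensions used in this card's examples.
AUDIO_SUFFIXES = {".wav", ".flac"}


def transcribe_folder(folder: str,
                      transcribe_fn: Callable[[str], str]) -> Dict[str, str]:
    """Run transcribe_fn on every audio file in folder.

    Returns a dict mapping file name to transcribed text.
    """
    results = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in AUDIO_SUFFIXES:
            results[path.name] = transcribe_fn(str(path))
    return results


# With voxmlx installed, the callable can wrap the call shown above:
#
#   from functools import partial
#   from voxmlx import transcribe
#
#   fn = partial(
#       transcribe,
#       model_path="T0mSIlver/Voxtral-Mini-4B-Realtime-2602-MLX-4bit",
#   )
#   for name, text in transcribe_folder("recordings", fn).items():
#       print(name, text)
```

Whether the model is reloaded on each call depends on voxmlx internals that this card does not describe, so it may be worth timing the loop on a few files before processing a large folder.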

Model tree

Base model: mistralai/Ministral-3-3B-Base-2512
Finetuned: mistralai/Voxtral-Mini-4B-Realtime-2602
Quantized: this model (T0mSIlver/Voxtral-Mini-4B-Realtime-2602-MLX-4bit)