# Mistral-Small-3.2-24B-Instruct-2506 (NVFP4)
This repository contains an NVFP4 quantization of the following base model:

- Base model: mistralai/Mistral-Small-3.2-24B-Instruct-2506
- Quantized model: yepthatsjason/Mistral-Small-3.2-24B-Instruct-2506-nvfp4
- Quantization format: NVFP4
- Quantized with: llmcompressor
## What is this?

This is a quantized version of the base model intended to reduce memory usage and improve inference efficiency, while keeping behavior close to the original.
## Usage

Add your exact loading snippet here (it depends on how llmcompressor exported the artifacts and which runtime you're using).
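As a starting point, here is a minimal loading sketch using vLLM. This assumes the checkpoint was exported in a format vLLM can load directly and that your hardware supports the NVFP4 path; adjust for your own runtime.

```python
# Minimal sketch: serve the quantized checkpoint with vLLM.
# Assumes a recent vLLM that understands the exported NVFP4 checkpoint
# format; native FP4 acceleration additionally requires supported GPUs.
from vllm import LLM, SamplingParams

llm = LLM(model="yepthatsjason/Mistral-Small-3.2-24B-Instruct-2506-nvfp4")
params = SamplingParams(max_tokens=64, temperature=0.0)

outputs = llm.generate(
    ["Explain NVFP4 quantization in one sentence."], params
)
print(outputs[0].outputs[0].text)
```

If vLLM rejects the checkpoint, fall back to whichever runtime your llmcompressor export targets and compare its documentation for the expected loading call.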
## Quantization details

- Format: NVFP4
- Tooling: llmcompressor
- Notes: (add any relevant settings, e.g. target hardware, calibration details, etc.)
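For reference, a checkpoint like this can be produced with an llmcompressor one-shot recipe along the following lines. This is a sketch only: it assumes a recent llmcompressor release that exposes the NVFP4 scheme, and the exact recipe, targets, and calibration settings used for this repository may differ.

```python
# Illustrative sketch, not the exact recipe used for this repo.
# Assumes llmcompressor with NVFP4 scheme support.
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

recipe = QuantizationModifier(
    targets="Linear",    # quantize the Linear layers...
    scheme="NVFP4",      # ...to the NVFP4 format
    ignore=["lm_head"],  # keep the output head in higher precision
)

oneshot(
    model="mistralai/Mistral-Small-3.2-24B-Instruct-2506",
    recipe=recipe,
    output_dir="Mistral-Small-3.2-24B-Instruct-2506-nvfp4",
)
```

Whatever recipe was actually used, recording it (plus any calibration dataset) in the Notes above makes the quantization reproducible.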
## Limitations / caveats

Quantized models can differ from the base model in edge cases. If you observe regressions, please compare against the base model and share a minimal repro.