This model is an FP8 static quantization of INSAIT-Institute/MamayLM-Gemma-3-12B-IT-v1.0, calibrated on HuggingFaceH4/ultrachat_200k using an llmcompressor/AutoFP8-style flow.
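As a rough illustration of what "FP8 static" means here, the sketch below simulates per-tensor static quantization in NumPy: a single scale is fixed offline from the calibration data's absolute maximum, then activations are clamped to the FP8 E4M3 dynamic range (max finite magnitude 448 per the OCP FP8 spec). This is a simplified model, not the llmcompressor implementation, and it only models range clamping, not 3-bit mantissa rounding.

```python
import numpy as np

# Largest finite magnitude representable in FP8 E4M3 (OCP FP8 spec).
FP8_E4M3_MAX = 448.0

def compute_static_scale(calibration_batches):
    """Static quantization: the scale is fixed offline from the max
    absolute value observed over the calibration set, so no per-request
    scale computation is needed at inference time."""
    amax = max(np.abs(batch).max() for batch in calibration_batches)
    return amax / FP8_E4M3_MAX

def quantize_dequantize(x, scale):
    """Round-trip through the FP8 dynamic range, simulated in fp32:
    scale, clamp to the E4M3 range, rescale. Real FP8 kernels also
    round mantissas to 3 bits; this sketch only models the clamping."""
    q = np.clip(x / scale, -FP8_E4M3_MAX, FP8_E4M3_MAX)
    return q * scale

# Toy "calibration" activations standing in for ultrachat_200k batches.
rng = np.random.default_rng(0)
calib = [rng.normal(scale=3.0, size=1024) for _ in range(8)]
scale = compute_static_scale(calib)

# Values beyond the calibrated range get clipped; in-range values pass through.
x = rng.normal(scale=3.0, size=1024)
err = np.abs(x - quantize_dequantize(x, scale)).max()
print(f"scale={scale:.6f}, max round-trip error={err:.6f}")
```

The trade-off this illustrates: a static scale avoids runtime calibration overhead, but outliers larger than anything seen during calibration are clipped.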
## Usage
Designed for vLLM inference on H100-class (FP8-capable) GPUs. Load with `trust_remote_code=True`.
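A minimal deployment sketch, assuming vLLM is installed and an FP8-capable GPU such as an H100 is available; vLLM reads the quantization config from the checkpoint itself:

```shell
# Serve the FP8 checkpoint with vLLM's OpenAI-compatible server.
vllm serve oshkorinova/MamayLM-Gemma-3-12B-IT-v1.0-FP8-Static-Ultrachat-200k \
    --trust-remote-code
```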
Format: Safetensors · Model size: 12B params · Tensor types: BF16, F8_E4M3