This model was converted to MLX format from ai-sage/GigaChat3.1-10B-A1.8B using oMLX v0.2.24 with oQ quantization.

For the conversion, Multi-Token Prediction (MTP) had to be disabled ("num_nextn_predict_layers": 0) and the related layers (model.layers.26.*) removed.
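
The two MTP adjustments can be illustrated with a minimal Python sketch. It is not the actual conversion procedure, which is not documented here: the helper names `disable_mtp` and `drop_mtp_tensors` are hypothetical, and the config values and tensor names in the toy data are made up for illustration. Only the "num_nextn_predict_layers" key and the model.layers.26.* pattern come from the note above. A real checkpoint would first be loaded into a name-to-tensor mapping with a safetensors-capable library.

```python
import json

# Assumption: the MTP block sits at layer index 26, matching the
# "model.layers.26.*" pattern mentioned above.
MTP_LAYER_PREFIX = "model.layers.26."


def disable_mtp(config):
    """Return a copy of the config dict with MTP switched off."""
    patched = dict(config)
    patched["num_nextn_predict_layers"] = 0
    return patched


def drop_mtp_tensors(tensors):
    """Return a copy of a name -> tensor mapping without the MTP layer entries."""
    return {
        name: tensor
        for name, tensor in tensors.items()
        if not name.startswith(MTP_LAYER_PREFIX)
    }


# Toy stand-ins for the real config and checkpoint (all values are made up).
config = {"num_hidden_layers": 26, "num_nextn_predict_layers": 1}
tensors = {
    "model.layers.25.self_attn.q_proj.weight": "kept",
    "model.layers.26.eh_proj.weight": "dropped",
    "model.norm.weight": "kept",
}

patched_config = disable_mtp(config)
kept_tensors = drop_mtp_tensors(tensors)

print(json.dumps(patched_config, indent=2))
print(sorted(kept_tensors))
```

The prefix ends with a dot so that only layer 26 matches, not a hypothetical layer 260 or similar. Both helpers return copies, leaving the original config and tensor mapping untouched.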

Settings:
