Add EXL2, INT8, and/or INT4 version of the model, PLEASE!
#21
by Abdelhak - opened
The model is too large to run for people with less than 24 GB of VRAM. Please make a quantized version of it.
It is taking 60 GB of RAM for me and around 15 minutes to process each prompt, running on CPU. We really need a quantized version.
There is an nf4 version here:
https://huggingface.co/mistralai/Pixtral-12B-2409/discussions/21#66f347780dc1833d4e484073
FWIW, exllamav2 doesn't support vision models, so an EXL2 quant isn't currently possible.