2.4-2.5 bpw request?

#1
by deleted - opened

Heya, any chance we can get a 2.4/2.5 bpw quant? That size places the model just beyond the steep inflection point in KL divergence (at least for Devstral: https://huggingface.co/turboderp/Devstral-2-123B-Instruct-2512-exl3):
[image: KL divergence vs. bits-per-weight plot]
It's also just about the right size for 80 GB of VRAM.
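The "right size for 80 GB" claim is easy to sanity-check with back-of-the-envelope arithmetic: quantized weight memory is roughly parameter count × bits-per-weight / 8. A minimal sketch, using the 123B parameter count from the linked Devstral model purely as an illustration (the actual model size here may differ, and real quant files keep some layers at higher precision, so this is a lower bound on weight memory):

```python
def weight_gb(params: float, bpw: float) -> float:
    """Approximate quantized weight footprint in gigabytes (1 GB = 1e9 bytes).

    Ignores KV cache, activations, and higher-precision embedding/output
    layers, so treat the result as a rough lower bound.
    """
    return params * bpw / 8 / 1e9

# Illustrative figures for a 123B-parameter model:
for bpw in (2.4, 2.5, 3.0):
    print(f"{bpw} bpw -> {weight_gb(123e9, bpw):.1f} GB")
```

At 2.4-2.5 bpw the weights alone come to roughly 37-38 GB, leaving the remainder of an 80 GB card for KV cache and long context.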

Owner

I'm going to do some optimized versions shortly.

Owner

Optimized quants have been added. Let me know if there are any other sizes you'd like to see.

Thanks for putting this together, @MikeRoz :)
