Quantization request.
#2
by daibuzizai - opened
Can ArliAI/GLM-4.6-Derestricted be quantized? The original V1 version was very efficient.
I can try.
A caveat is that I don’t have ik_llama.cpp imatrix data for 4.6 derestricted. I could use a mainline imatrix file, but (last I checked) the conversion to ik_llama.cpp's format is quite lossy, so quality will take a hit.
I could use mainline llama.cpp quant formats instead, but without IQ3_KT, quality will take a huge hit.
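(For context, IQ3_KT is one of the quant types that only exists in ik_llama.cpp. A quant using it would be made roughly like the sketch below; file names and the imatrix path are placeholders, and I'm assuming ik_llama.cpp's `llama-quantize` takes the same arguments as mainline's.)

```bash
# Hypothetical sketch: quantize to IQ3_KT with an importance matrix.
# Paths and thread count are placeholders.
./llama-quantize \
  --imatrix glm-4.6-derestricted.imatrix \
  GLM-4.6-Derestricted-BF16.gguf \
  GLM-4.6-Derestricted-IQ3_KT.gguf \
  IQ3_KT 16
```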
I can try to make an ik_llama.cpp-native imatrix, but I'm not sure I can do it on 128GB of RAM. I will investigate this.
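Roughly, the run would look something like this sketch (calibration file, chunk count, and paths are placeholders, and I'm assuming ik_llama.cpp's `llama-imatrix` takes the same flags as mainline's):

```bash
# Hypothetical sketch: collect an imatrix on CPU with limited RAM.
# -ngl 0 keeps all layers on the CPU; because the GGUF is mmap'ed by default,
# a model larger than 128GB should page in and out rather than OOM,
# at the cost of a very slow run.
./llama-imatrix \
  -m GLM-4.6-Derestricted-Q8_0.gguf \
  -f calibration_data.txt \
  -o glm-4.6-derestricted.imatrix \
  -ngl 0 \
  -c 512 \
  --chunks 200
```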
Thank you very much for your work—it's a great help for people with limited hardware resources.