quant_method Question?

by glrra30 - opened 13 days ago

Why is "quant_method": "compressed-tensors",

instead of:

"quant_method": "awq",

? Throws errors for me with vLLM ( --quantization awq )

phaedawg

12 days ago

Because Cyanwiki quantized this model with LLM Compressor (llm_compressor). LLM Compressor saves its output in the format of a library it maintains, Compressed-Tensors. So the quantization itself is AWQ, but the checkpoint is labeled "compressed-tensors" and needs the Compressed-Tensors library to load properly.
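
If it helps, here is a minimal sketch for checking which method a checkpoint declares before choosing a vLLM flag. It assumes the usual config.json layout with a "quantization_config" object holding "quant_method"; the helper names and the example config string are hypothetical, not taken from this repo or from vLLM.

```python
import json

# Illustrative config.json fragment (assumption: the real file has many more keys).
EXAMPLE_CONFIG = '{"quantization_config": {"quant_method": "compressed-tensors"}}'


def read_quant_method(config_text):
    """Return quantization_config.quant_method from a config.json string, or None."""
    config = json.loads(config_text)
    # The key may be missing or null on unquantized checkpoints.
    quant_config = config.get("quantization_config") or {}
    return quant_config.get("quant_method")


def flag_mismatch(config_text, requested):
    """Describe a conflict between a requested --quantization value and the
    method the checkpoint declares; return None when there is no conflict."""
    declared = read_quant_method(config_text)
    if requested is None or declared is None or requested == declared:
        return None
    return (
        f"checkpoint declares quant_method={declared!r}, "
        f"but --quantization {requested} was requested"
    )


if __name__ == "__main__":
    print(read_quant_method(EXAMPLE_CONFIG))
    print(flag_mismatch(EXAMPLE_CONFIG, "awq"))
```

As far as I understand vLLM's behavior, the practical fix is to drop --quantization awq and let vLLM read the method from config.json, or to pass --quantization compressed-tensors instead. Please check that against the vLLM version you are running.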
