Does this work with vllm?
I'm trying to find a quant — GPTQ or AWQ (preferably GPTQ; if that doesn't work, then AWQ) — of this model: Qwen3-30B-A3B-Instruct-2507
Does this quant work in vLLM?
This quant does work with vLLM (checked with v0.10.0)
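For anyone landing here later, a minimal sketch of serving this quant with vLLM's OpenAI-compatible server (assuming vLLM >= v0.10.0 is installed; substitute the actual repo id or local path of this quant for the placeholder):

```shell
# Serve the quant; vLLM auto-detects the quantization method from the
# model's config, so no --quantization flag is strictly required.
# <this-quant-repo-or-path> is a placeholder, not a real repo id.
vllm serve <this-quant-repo-or-path> \
  --max-model-len 32768 \
  --gpu-memory-utilization 0.9
```

The server then exposes `/v1/chat/completions` on port 8000 by default.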
@Ainonake At least based on my test/environment, it works. If it doesn’t work in your case, please open a discussion! I’m more than happy to help.
If you prefer to message, please shoot me an email at cpatonn@gmail.com!
Thanks, I've tried it and it works
Are there any other quantization methods that would work around this problem?
ValidationError: 1 validation error for VllmConfig
Value error, The quantization method compressed-tensors is not supported for the current GPU. Minimum capability: 70. Current capability: 60. [type=value_error, input_value=ArgsKwargs((), {'model_co...additional_config': {}}), input_type=ArgsKwargs]
For further information visit https://errors.pydantic.dev/2.11/v/value_error
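The error above is vLLM's pre-flight hardware check: the compressed-tensors kernels require CUDA compute capability 7.0 (Volta) or newer, and "Current capability: 60" means a 6.0 (Pascal) GPU such as the GTX 10-series or P100, so loading fails before any weights are touched. A minimal sketch of the comparison being made (the helper name is mine, not vLLM's; on a live system `torch.cuda.get_device_capability()` returns the tuple to check):

```python
def meets_min_capability(current: tuple[int, int], minimum: tuple[int, int]) -> bool:
    """Return True if the GPU's compute capability meets a quant method's minimum.

    Tuples compare lexicographically, so (6, 0) < (7, 0) but (7, 5) >= (7, 0).
    """
    return current >= minimum

# Pascal (6.0) vs. the 7.0 minimum for compressed-tensors -> not supported
print(meets_min_capability((6, 0), (7, 0)))  # False
# Volta (7.0) and newer pass the check
print(meets_min_capability((7, 0), (7, 0)))  # True
```

On capability-6.0 hardware the practical options are a different quant format whose kernels support Pascal, or newer hardware.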
As a heavy-duty developer running these models on an RTX 5090, I find CaptainN's quant significantly outperforms the QuantRIO AWQ version.