Fix chat template to avoid empty historical `<think>` blocks
#11 opened 14 days ago by latent-variable · 1 comment
Why is 35B-INT4 smaller than 27B-INT4?
#10 opened about 1 month ago by andynoodles · 3 comments
Qwen3.5-35B-A3B-Base model quants
#9 opened about 1 month ago by Maksim1000
Smaller model quants
#8 opened about 2 months ago by swtb
vLLM did not recognise the model
#7 opened about 2 months ago by anura2026
SGLang config request
#6 opened about 2 months ago by cse2011
vLLM (SM70) V100 support
#5 opened about 2 months ago by FayeQuant · 2 comments
What impact has quantization had on model performance/ability?
#4 opened about 2 months ago by spanspek · 1 comment
Working vLLM setup on RTX 5090: 194-197 tok/s with image/video
#3 opened about 2 months ago by 8055izham · 5 comments
Why are GPTQ scales stored as float16 while other weights are bfloat16?
#2 opened about 2 months ago by mylfm · 1 comment
Speculative Config - MTP crash related to quantized expert names
#1 opened about 2 months ago by seanthomaswilliams · 2 comments