Hugging Face's logo Hugging Face
  • Models
  • Datasets
  • Spaces
  • Buckets new
  • Docs
  • Enterprise
  • Pricing

  • Log In
  • Sign Up

Qwen
/
Qwen3.5-35B-A3B-GPTQ-Int4

Image-Text-to-Text
Transformers
Safetensors
qwen3_5_moe
conversational
4-bit precision
gptq
Model card Files Files and versions
xet
Community
11
New discussion
Resources
  • PR & discussions documentation
  • Code of Conduct
  • Hub documentation

fix chat template to avoid empty historical `<think>` blocks

1
#11 opened 14 days ago by
latent-variable

Why 35B-INT4 smaller than 27B-INT4

3
#10 opened about 1 month ago by
andynoodles

Qwen3.5-35B-A3B-Base model quants

#9 opened about 1 month ago by
Maksim1000

Smaller model quants

#8 opened about 2 months ago by
swtb

Vllm did not recognise the model

#7 opened about 2 months ago by
anura2026

Sglang config request

#6 opened about 2 months ago by
cse2011

vllm (SM70) V100 support

2
#5 opened about 2 months ago by
FayeQuant

What impact has quantization had on model performance / ability?

1
#4 opened about 2 months ago by
spanspek

Working vLLM setup on RTX 5090 β€” 194-197 tok/s with image/video

πŸš€πŸ‘ 2
5
#3 opened about 2 months ago by
8055izham

Why are GPTQ scales stored as float16 while other weights are bfloat16?

πŸ‘€ 1
#2 opened about 2 months ago by
mylfm

Speculative Config - MTP Crash related to quantized expert names

βž• 8
2
#1 opened about 2 months ago by
seanthomaswilliams
Company
TOS Privacy About Careers
Website
Models Datasets Spaces Pricing Docs