Qwen3-14B-GPTQ-Int8

Model Details

This model is an int8 GPTQ-quantized version (group_size 128) of Qwen/Qwen3-14B, generated by the vastai modelzoo. Please follow the license of the original model.

vllm Inference

  • vllm >= v0.11.0
  • VVI >= 26.02
vllm serve vastai-ais/Qwen3-14B-GPTQ-Int8 --reasoning-parser qwen3 --served_model_name qwen
curl http://localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d ' {
    "model": "qwen",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Who are you?"}
    ],
    "temperature": 1,
    "max_tokens": 512
  } '
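The same request can be issued programmatically. Below is a minimal sketch using only the Python standard library, assuming the `vllm serve` command above is running on localhost:8000; the helper names `build_chat_request` and `send_chat` are illustrative, not part of vLLM:

```python
import json
import urllib.request

def build_chat_request(user_message: str) -> dict:
    # Mirror the curl example above; "qwen" matches the
    # --served_model_name flag passed to `vllm serve`.
    return {
        "model": "qwen",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "temperature": 1,
        "max_tokens": 512,
    }

def send_chat(user_message: str, base_url: str = "http://localhost:8000") -> str:
    # POST to the OpenAI-compatible endpoint exposed by vLLM and
    # return the assistant's reply text.
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_request(user_message)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# With the server running:
#   print(send_chat("Who are you?"))
```

Any OpenAI-compatible client (for example the official `openai` Python package pointed at `http://localhost:8000/v1`) works the same way, since vLLM serves the standard chat-completions API.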
Format: Safetensors
Model size: 15B params
Tensor types: I32, BF16
Model tree for vastai-ais/Qwen3-14B-GPTQ-Int8

Base model: Qwen/Qwen3-14B (this model is a quantized variant)