Qwen3-8B-GPTQ-Int8
Model Details
This is an int8 GPTQ-quantized model (group_size 128) of Qwen/Qwen3-8B, generated by the vastai modelzoo. Please follow the license of the original model.
vllm Inference
- vllm >= v0.11.0
- VVI >= 26.02
vllm serve vastai-ais/Qwen3-8B-GPTQ-Int8 --reasoning-parser qwen3 --served-model-name qwen
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Who are you?"}
    ],
    "temperature": 1,
    "max_tokens": 512
  }'
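The same request can also be sent from Python. The sketch below rebuilds the curl request body and, in a separate helper, posts it to the server; it assumes the vllm server from the command above is running on localhost:8000 and was started with `--served-model-name qwen` (the function and variable names here are illustrative, not part of any API).

```python
import json
import urllib.request

# Assumption: local vllm endpoint from the serve command above.
BASE_URL = "http://localhost:8000/v1/chat/completions"


def build_payload(user_message: str) -> dict:
    """Build the same chat-completion request body as the curl example."""
    return {
        "model": "qwen",  # matches --served-model-name qwen
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "temperature": 1,
        "max_tokens": 512,
    }


def query(user_message: str) -> str:
    """Send the request and return the assistant's reply.

    Requires the vllm server to be running; not invoked at import time.
    """
    req = urllib.request.Request(
        BASE_URL,
        data=json.dumps(build_payload(user_message)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]


payload = build_payload("Who are you?")
```

Calling `query("Who are you?")` against a running server returns the assistant's text from the first choice, the same field the curl response carries.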