Qwen3-8B-GPTQ-Int8
Model Details
This is an int8 GPTQ-quantized model (group_size 128) of Qwen/Qwen3-8B, generated by the vastai modelzoo. Please follow the license of the original model.
vllm Inference
- vllm >= v0.11.0
- VVI >= 26.02
vllm serve vastai-ais/Qwen3-8B-GPTQ-Int8 --reasoning-parser qwen3 --served-model-name qwen
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Who are you?"}
    ],
    "temperature": 1,
    "max_tokens": 512
  }'
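The same request can also be sent from Python. The sketch below rebuilds the curl request body and, in a separate helper, posts it to the server; it assumes the vllm server from the command above is running on localhost:8000 and was started with `--served-model-name qwen` (the function and variable names here are illustrative, not part of any API).

```python
import json
import urllib.request

# Assumption: local vllm endpoint from the serve command above.
BASE_URL = "http://localhost:8000/v1/chat/completions"


def build_payload(user_message: str) -> dict:
    """Build the same chat-completion request body as the curl example."""
    return {
        "model": "qwen",  # matches --served-model-name qwen
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "temperature": 1,
        "max_tokens": 512,
    }


def query(user_message: str) -> str:
    """Send the request and return the assistant's reply.

    Requires the vllm server to be running; not invoked at import time.
    """
    req = urllib.request.Request(
        BASE_URL,
        data=json.dumps(build_payload(user_message)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]


payload = build_payload("Who are you?")
```

Calling `query("Who are you?")` against a running server returns the assistant's text from the first choice, the same field the curl response carries.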