Qwen3-4B-Thinking-2507-GPTQ-INT4
Model Details
This model is an INT4 GPTQ quantization (group_size 128) of Qwen/Qwen3-4B-Thinking-2507, generated by vastai modelzoo. Please follow the license of the original model.
vLLM Inference
- vllm >= v0.11.0
- VVI >= 26.02
vllm serve vastai-ais/Qwen3-4B-Thinking-2507-GPTQ-INT4 --reasoning-parser qwen3 --served_model_name qwen
curl http://localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d ' {
"model": "qwen",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Who are you?"}
],
"temperature": 1,
"max_tokens": 512
} '
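The same request can be sent from Python. The following is a minimal sketch using only the standard library, assuming the server was started with the `vllm serve` command above and listens on `localhost:8000`. The helper names (`build_payload`, `split_reply`, `chat`) are illustrative and not part of vLLM. The name of the reasoning field in the response is also an assumption: with `--reasoning-parser qwen3` the thinking text is typically returned separately from the final answer, as `reasoning_content` or `reasoning` depending on the vLLM version, so the sketch checks both.

```python
import json
import urllib.request

# Assumption: endpoint and served model name match the `vllm serve` command above.
BASE_URL = "http://localhost:8000/v1/chat/completions"


def build_payload(prompt, model="qwen", temperature=1, max_tokens=512):
    """Build the same JSON body as the curl example."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


def split_reply(response):
    """Return (reasoning, answer) from a chat completion response dict.

    The reasoning field name is an assumption; it may be absent entirely,
    in which case reasoning is None.
    """
    message = response["choices"][0]["message"]
    reasoning = message.get("reasoning_content") or message.get("reasoning")
    return reasoning, message.get("content")


def chat(prompt, url=BASE_URL):
    """POST the prompt to the server and return (reasoning, answer)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return split_reply(json.load(reply))
```

With the server running, `reasoning, answer = chat("Who are you?")` sends the request and separates the two parts of the reply. If the model spends the whole `max_tokens` budget on reasoning, the answer may come back empty, so a larger limit may be needed for longer prompts.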