need INT8-INT4 version
2
#3 opened 9 days ago by clayboby
High First Token Latency Issue with AWQ-4bit Model Deployment Using vLLM
👍 2
3
#2 opened 14 days ago by Jeanxx
AWQ version of gemma-4-26B-A4B-it
🚀 3
5
#1 opened 19 days ago by ankandrew