fix chat template to avoid empty historical `<think>` blocks
1
#5 opened 13 days ago by latent-variable
How much worse is the GPTQ-Int4 model's capability compared to FP8?
#4 opened about 1 month ago by Artom
Poor performance in vLLM
4
#3 opened about 1 month ago by sinebubble
GPTQ vs Q4 GGUF
👀 3
1
#2 opened about 2 months ago by ciprianv
Benchmark numbers of this quant version
👀 6
#1 opened about 2 months ago by celsowm