Update README.md

πŸ‘ 2
#34 opened 5 days ago by
dyoung

IQ3XXS giving garbage output

2
#33 opened 5 days ago by
mbhagya

inference using vllm

#31 opened 6 days ago by
kuopching

the trade off is not good

3
#28 opened 6 days ago by
rosspanda0

FAST!!!! 39tps!

🤗 1
#16 opened 10 days ago by
mazuj2

--n-gpu-layers or -ngl

#12 opened 12 days ago by
owao

Does it work for gemma4?

#9 opened 12 days ago by
koyukira

presence-penalty

4
#8 opened 12 days ago by
owao