Hi, thanks for bringing this model some more reasoning.
Testing on a 5090 at q8_0. Sampling: temp 0, top-k 20, repeat penalty 1, top-p 0.95, min-p 0.
This is the most coherent 27B model, even better than some 200B models now.
Thank you for your support!
How many tokens/s do you get on your hardware?