The new upload with the llama.cpp 8665 backend still gets only a 50% score on the winogrande test

#2
by qdwang - opened

BTW, is it possible to add another quant with a size around 17.5 GB? It's the sweet spot for Mac devices with 24 GB of RAM.

These devices can only use around 20-22 GB of RAM for running an LLM, so a 20 GB model is too large once you allow for a useful context size, while 15.5 GB wastes some of the available headroom.
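The budgeting above can be sketched as a quick check: model weights plus context/KV-cache overhead must fit in the usable RAM. The overhead and usable-RAM numbers below are illustrative assumptions, not measured values.

```python
# Rough memory-budget check for picking a quant size on a 24 GB Mac.
# Assumes ~21 GB usable for the LLM and ~3 GB of KV-cache/runtime
# overhead for a useful context -- both numbers are guesses.

def fits(model_gb: float, ctx_overhead_gb: float, usable_ram_gb: float = 21.0) -> bool:
    """True if weights plus context overhead fit in the RAM budget."""
    return model_gb + ctx_overhead_gb <= usable_ram_gb

for size in (15.5, 17.5, 20.0):
    print(f"{size} GB quant fits: {fits(size, ctx_overhead_gb=3.0)}")
```

With these assumptions, the 20 GB quant overflows the budget while a 17.5 GB quant just fits, which matches the "sweet spot" argument.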
