Lower quants
#1
by Desm0nt - opened
Hello.
Is there any chance of a 2.25 bpw quant? 2.4 is too large for rope scaling on a single 24 GB GPU, even with cache_4bit.