dflash with quantized models

#5
by Shimon324 - opened

Does dflash support quantized versions of the model, and not only the base one?

Per an earlier thread, the official FP8 quant works transparently and is supported.

In my own testing, the int4-AutoRound quant also works.

Other quants are the wild west.
