Does dflash support different quantized models, and not only the base one?
Per an earlier thread, the official FP8 is transparent and supported.
In my own testing, the int4-AutoRound quant also works.
Other quants are the wild west.