Hey there,
Could you please explain how the quantization was done (is it ModelOpt?) and how this was measured: "KLD reduced by ~10%"? Also, is your calibration dataset on Hugging Face?
Thanks!