Was Q8_0 quantized from original model or is it a copy of -FP8 variant?

#3
by lostmsu - opened

I would strongly prefer the latter, as a re-release of the official variant.

It's from the original model. Using FP8 wouldn't work nicely, because we'd have to cast it back up to BF16 before quantizing anyway.

llama.cpp doesn't have an FP8 quant type, and FP8 doesn't map 1:1 to Q8_0.
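To illustrate why the mapping isn't 1:1: Q8_0 groups weights into 32-element blocks, each with its own fp16 scale and int8 codes, which is a different representation from a per-tensor FP8 format. Below is a rough NumPy sketch of the Q8_0 scheme (scale d = max|x|/127 per block), not llama.cpp's actual implementation:

```python
import numpy as np

def quantize_q8_0(x: np.ndarray):
    """Q8_0-style quantization: blocks of 32 values, each with one
    fp16 scale d = max|x|/127 and int8 codes q = round(x/d)."""
    blocks = x.reshape(-1, 32)
    d = np.abs(blocks).max(axis=1, keepdims=True) / 127.0
    d[d == 0] = 1.0  # avoid division by zero for all-zero blocks
    q = np.clip(np.round(blocks / d), -127, 127).astype(np.int8)
    return q, d.astype(np.float16)

def dequantize_q8_0(q: np.ndarray, d: np.ndarray) -> np.ndarray:
    return (q.astype(np.float32) * d.astype(np.float32)).ravel()

# Quantizing directly from full-precision weights incurs one rounding
# step; routing through FP8 first would stack a second rounding error
# on top before Q8_0 even sees the values.
rng = np.random.default_rng(0)
w = rng.standard_normal(64).astype(np.float32)
q, d = quantize_q8_0(w)
w_hat = dequantize_q8_0(q, d)
print(np.abs(w - w_hat).max())
```

The per-element error is bounded by half the block scale, which is why quantizing from the original BF16 weights is preferable to quantizing an already-lossy FP8 copy.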
