Dan, please address the intermediate FP16 clipping of your BF16-based model quants

#5
by RyanoSaurus-Wrex - opened

Hey Dan. Since I can't reopen the thread you closed before I even woke up this morning, I'm starting a new one.

That is false.

https://huggingface.co/bartowski/gemma-2-27b-it-GGUF/discussions/7

Here's an example. That's Gemma 2 27B, and that's Bartowski looking into this very thing a year and a half ago. BF16 is supported on everything 30-series and newer, I believe, and it's been five years since those cards came out. Maybe there are AMD or Intel issues, but we're looking at 90-plus percent of the market here, and among the people deep enough into this to run local quants, I'd bet only a tiny minority lacks BF16 support. Not converting these BF16 models to BF16 in the intermediate step before quanting them, and going through FP16 instead, is clipping them at a massive scale, and that does serious damage to these AI models. You did not address the issue, and closing it a few hours later shows you don't want people to know. So I will reopen the issue, and I will make a new one. Bartowski converts to BF16 before he quants, as does Mraderacher (I probably just butchered his name, but you'll know who I mean). I asked him last night, and Bartowski had already confirmed it a while ago. So again, why are you doing this?
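To make the clipping claim concrete, here's a minimal sketch (my own illustration, not anyone's actual conversion pipeline) using NumPy, with FP32 standing in for BF16 since NumPy has no native bfloat16 dtype. BF16 shares FP32's 8-bit exponent, so it represents magnitudes far outside FP16's range; an FP16 intermediate destroys those values:

```python
import numpy as np

# A large magnitude that BF16 can hold (BF16 shares FP32's exponent range).
# FP32 stands in for BF16 here, since NumPy has no bfloat16 dtype.
big = np.float32(1.0e6)

# FP16's largest finite value is 65504, so casting overflows to infinity.
fp16_big = np.float16(big)
print(fp16_big)  # inf -- the value is destroyed

# Small magnitudes suffer too: FP16's tiniest subnormal is ~6e-8,
# while BF16 reaches down to ~1e-38 like FP32.
tiny = np.float32(1.0e-9)
fp16_tiny = np.float16(tiny)
print(fp16_tiny)  # 0.0 -- flushed to zero
```

Both formats are 16 bits, but they trade differently: BF16 keeps FP32's 8-bit exponent and gives up mantissa precision, while FP16 keeps a wider mantissa but only a 5-bit exponent. That's why a BF16 → FP32/BF16 → quant pipeline preserves the value range and an FP16 intermediate does not.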
