Support for quantizing Whisper models?

#93
by lesserfield - opened

I was wondering if it would be possible to add support for Whisper.cpp? Thanks!

Hey! @lesserfield - That's a good question - I think we can perhaps pre-quantise all of the checkpoints and put them in the organisation somewhere. What do you think?

That would be helpful, I'm looking forward to it.

ggml-org org

FYI, whisper.cpp does not support the GGUF format atm, so this may require more work.

Ref: https://github.com/ggerganov/whisper.cpp/blob/bf4cb4abad4e35c74b387df034cc4ac7b22e5fe6/whisper.cpp#L1332
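To illustrate the format mismatch, here is a minimal sketch of telling the two file types apart by their first four bytes. It assumes whisper.cpp's loader checks for the legacy ggml magic number `0x67676d6c` read as a little-endian uint32, and that GGUF files begin with the ASCII bytes `GGUF`. The helper names `detect_format` and `detect_file` are hypothetical and not part of either project.

```python
import struct

# Assumption: GGUF files start with the four ASCII bytes "GGUF".
GGUF_MAGIC = b"GGUF"
# Assumption: whisper.cpp model files start with this uint32 ("ggml" in hex).
GGML_MAGIC = 0x67676D6C


def detect_format(header: bytes) -> str:
    """Classify a model file from its first four bytes (hypothetical helper)."""
    if len(header) < 4:
        return "unknown"
    if header[:4] == GGUF_MAGIC:
        return "gguf"
    # Interpret the same four bytes as a little-endian unsigned 32-bit integer.
    (magic,) = struct.unpack("<I", header[:4])
    if magic == GGML_MAGIC:
        return "ggml"
    return "unknown"


def detect_file(path: str) -> str:
    """Read the first four bytes of a file and classify it."""
    with open(path, "rb") as f:
        return detect_format(f.read(4))
```

A file that reports `gguf` here would presumably be rejected by a loader that only accepts the ggml magic, which is why a plain GGUF conversion would not be enough.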

Sorry, it has been a busy couple of days; getting back to this now:

@ngxson - I agree, and I think that's why it makes sense to have it separately as a repo in the GGML org. My plan is to pre-quantize and upload all the major whisper.cpp quants:
https://github.com/ggerganov/whisper.cpp?tab=readme-ov-file#quantization

WDYT?

There are already quite a lot of pre-quantised models here btw: https://huggingface.co/ggerganov/whisper.cpp/tree/main

lesserfield changed discussion status to closed
