Ming-flash-omni-2.0 GGUF or llama.cpp support?

#10
by Rebis

Hi,
Would it be possible to publish a GGUF version of this model, or to add llama.cpp support so it can be quantized in the future?
Thank you in advance.
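
For context, the usual route to a GGUF build is llama.cpp's converter followed by its quantizer, and it only works once llama.cpp supports the model's architecture (which it currently does not for this model). A minimal sketch of that flow, with local paths and output filenames that are hypothetical:

```python
# Sketch of the standard llama.cpp conversion flow, assuming llama.cpp first
# gains architecture support for Ming-flash-omni-2.0 (it does not today).
# Run from a llama.cpp checkout; MODEL_DIR is a hypothetical local HF clone.
import subprocess

MODEL_DIR = "Ming-flash-omni-2.0"
F16_GGUF = "ming-flash-omni-2.0-f16.gguf"

# Step 1: convert the Hugging Face checkpoint to an f16 GGUF.
subprocess.run(
    ["python", "convert_hf_to_gguf.py", MODEL_DIR,
     "--outfile", F16_GGUF, "--outtype", "f16"],
    check=True,
)

# Step 2: quantize the f16 GGUF, e.g. to Q4_K_M, with llama-quantize.
subprocess.run(
    ["./llama-quantize", F16_GGUF,
     "ming-flash-omni-2.0-Q4_K_M.gguf", "Q4_K_M"],
    check=True,
)
```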

Thank you for your suggestion.
We will work on adding support for GGUF and llama.cpp. On our roadmap, we plan to prioritize Int8/Int4 quantization together with vLLM serving first.
Please stay tuned.
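
Until then, here is a rough sketch of what Int8/Int4 serving through vLLM could look like once a quantized checkpoint is published. The repo id and the quantization method below are assumptions for illustration, not a released artifact:

```python
from vllm import LLM, SamplingParams

# Hypothetical quantized checkpoint id; nothing with this name exists yet.
llm = LLM(
    model="inclusionAI/Ming-flash-omni-2.0-GPTQ-Int4",
    quantization="gptq",     # assumed Int4 method; AWQ etc. would also fit
    trust_remote_code=True,  # Ming models ship custom modeling code
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Describe the Ming-flash-omni model in one sentence."], params)
print(outputs[0].outputs[0].text)
```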
