Ming-flash-omni-2.0 GGUF version or llama.cpp support?
#10
by Rebis - opened
Hi,
Would it be possible to provide a GGUF version of this model, or to add llama.cpp support so it can be quantized in the future?
Thank you in advance.
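For context, once llama.cpp supports a model's architecture, the usual conversion and quantization flow looks roughly like the sketch below. This is hypothetical for this model, since llama.cpp does not yet recognize its architecture, and the local directory and output file names are placeholders:

```shell
# Hypothetical workflow, assuming llama.cpp gains support for this architecture.
# 1. Convert the Hugging Face checkpoint (in a local directory) to a GGUF file
#    using llama.cpp's converter script:
python convert_hf_to_gguf.py ./Ming-flash-omni-2.0 \
    --outfile ming-flash-omni-f16.gguf

# 2. Quantize the full-precision GGUF, e.g. to the common Q4_K_M format:
./llama-quantize ming-flash-omni-f16.gguf ming-flash-omni-Q4_K_M.gguf Q4_K_M
```

The converter only works for architectures it has an explicit mapping for, which is why upstream support in llama.cpp is the prerequisite here.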
Thank you for your suggestion.
We will work on adding GGUF and llama.cpp support. Per our roadmap, we will first prioritize Int8/Int4 quantization together with vLLM support.
Please stay tuned.