GGUF/llama.cpp support
#1
by tcpmux - opened
Would be awesome!
It may already be supported since it's just the Llama architecture. There are GGUFs of the base model uploaded. As long as it doesn't mirror/echo the prompt in the absence of the instruction tuning, it should be a good one.
"It may already be supported since it's just llama architecture"
Sadly, it's not. The model can be converted and quantized, but llama.cpp does not accept the resulting file: it crashes with errors at load time.
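For reference, this is roughly the workflow that fails at the last step. A sketch only: the paths are placeholders, and the script/binary names (`convert_hf_to_gguf.py`, `llama-quantize`, `llama-cli`) assume a recent llama.cpp checkout; older versions used `convert.py` and `quantize`.

```shell
# Convert the Hugging Face checkpoint to GGUF (runs without error).
# /path/to/model is a placeholder for the local model directory.
python convert_hf_to_gguf.py /path/to/model --outfile model-f16.gguf

# Quantize, e.g. to Q4_K_M (also succeeds).
./llama-quantize model-f16.gguf model-q4_k_m.gguf Q4_K_M

# Loading is where it breaks: llama.cpp rejects the file with errors
# here rather than at conversion or quantization time.
./llama-cli -m model-q4_k_m.gguf -p "Hello" -n 16
```

Conversion succeeding while loading fails usually points at an architecture or tensor-layout mismatch rather than a corrupt file, i.e. llama.cpp would need explicit support for this model's variant of the architecture.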