GGUF version please

#1
by ryg81 - opened

Please add a GGUF version (e.g., Q4 or Q5) for consumer GPUs.

Thanks a lot for the feedback and for trying out the model. I agree that the ~60 GB VRAM peak is a significant bottleneck for consumer hardware. While GGUF quantization is technically feasible and highly effective at reducing VRAM, my focus has shifted to other projects, so I don't have the bandwidth to maintain GGUF quants right now.
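As a rough back-of-envelope illustration (assuming the ~60 GB peak is dominated by fp16 weights, and ignoring activations and runtime overhead), the weight footprint shrinks roughly in proportion to bits per weight; the effective-bit figures below are approximate and include block-scale overhead:

```python
# Rough estimate of weight memory after GGUF-style quantization.
# Assumption: the ~60 GB fp16 peak is mostly weights; activations and
# runtime buffers are ignored, so treat these as lower bounds.
FP16_GB = 60.0
BITS_FP16 = 16

def quantized_gb(bits_per_weight: float, fp16_gb: float = FP16_GB) -> float:
    """Scale the fp16 weight footprint by the new bits-per-weight ratio."""
    return fp16_gb * bits_per_weight / BITS_FP16

# Approximate effective bits/weight for common GGUF quant types.
for name, bits in [("Q8_0", 8.5), ("Q5_K_M", 5.5), ("Q4_K_M", 4.5)]:
    print(f"{name}: ~{quantized_gb(bits):.1f} GB")
# Q4/Q5 land in the ~17-21 GB range -> within reach of 24 GB consumer GPUs.
```

Actual peak usage depends on the runtime and on which layers stay unquantized, so real numbers will be somewhat higher.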

Since the model weights are fully open source, I highly encourage and welcome anyone in the community to quantize it. If you or anyone else manages to create a GGUF version (e.g., using city96's conversion tools), please let me know; I would be more than happy to feature it and link to your repo on our page.