gguf
A GGUF version would help distribution as well. I tried with the GGUF-my-repo tool, but it throws an exception. I thought it would just work, since it's a Llama architecture.
Since this fine-tune is a classic one (no new added layers, etc.), it's 1:1 with the original except for the weights, tokenizer content, and embeddings. I need to look at how Unsloth did it, or maybe we can ask them politely to convert this fine-tune. I also have a new RAG-fine-tuned version of this model; I'll upload it when I have spare time. It's much better at following instructions.
I doubt they will add this to their catalog, but they can assist with converting.
I'm still waiting for something smaller; running a 30B dense model on consumer hardware is not an easy task. An MoE, sure.
I run this model in a workflow/agentic environment where humans are waiting for output, on a vLLM 4xA40 GPU node. So that's not consumer hardware.
Yes, pushing consumer-hardware limits with these models is quite an adventure, alas.
I have found that conversion to GGUF using llama.cpp's convert_hf_to_gguf.py fails due to an unrecognized 'no' in README.md:
language: [en, de, fr, es, it, pt, nl, pl, lv, et, lt, cs, sk, ro, bg, sl, hr, sv, da, fi, hu, uk, ru, zh, hi, ja, ko, el, no]
Probably should be:
language: [en, de, fr, es, it, pt, nl, pl, lv, et, lt, cs, sk, ro, bg, sl, hr, sv, da, fi, hu, uk, ru, zh, hi, ja, ko, el, nb]
Thank you! Easy fix!
Cool. That made it work, I guess. Now downloading a GGUF:
https://huggingface.co/KnutJaegersberg/tildeopen-30b-mu-instruct-Q8_0-GGUF
https://huggingface.co/KnutJaegersberg/tildeopen-30b-mu-instruct-Q4_K_M-GGUF
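For anyone repeating this, a rough sketch of the llama.cpp conversion and quantization flow that would produce quants like these (model paths are placeholders, not the actual commands used):

```shell
# Sketch, assuming a current llama.cpp checkout; paths are placeholders.
# 1) Convert the HF checkpoint to an f16 GGUF
python convert_hf_to_gguf.py ./tildeopen-30b-mu-instruct \
    --outfile tildeopen-30b-f16.gguf --outtype f16

# 2) Quantize to the uploaded formats
./llama-quantize tildeopen-30b-f16.gguf tildeopen-30b-Q8_0.gguf Q8_0
./llama-quantize tildeopen-30b-f16.gguf tildeopen-30b-Q4_K_M.gguf Q4_K_M
```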
I will perhaps check how the tokenizer behaves in this conversion (whether it uses the slow one or something else), since even if it's broken, it still predicts somewhat plausible but degraded output tokens.