inference
#1
pinned
by CemalSahin - opened
Thanks for quantizing this model!
I made a simple script to use these GGUF models easily: https://github.com/cmlshn/PromptEnhancer-GGUF
Just run `python inference/prompt_enhancer_gguf.py`. Works great on an H100, getting ~54 tok/s.
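For anyone trying this from scratch, a minimal sketch of the setup steps, assuming the repo layout from the link above and llama-cpp-python as the GGUF runtime (the actual dependencies may differ; check the repo's README):

```shell
# Clone the linked inference-script repo (path layout assumed from the post)
git clone https://github.com/cmlshn/PromptEnhancer-GGUF
cd PromptEnhancer-GGUF

# llama-cpp-python is a common runtime for GGUF models; assumed dependency here
pip install llama-cpp-python

# Run the inference helper mentioned in the post
python inference/prompt_enhancer_gguf.py
```

The GGUF files themselves would still need to be downloaded from this repo first if the script doesn't fetch them automatically.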
mradermacher pinned discussion