This is Eric Hartford's dolphin-2.2.1-mistral-7b, converted to GGUF. No other changes were made.
Two files are available here:
- dolphin-2.2.1-mistral-7b-fp16.gguf: the original model converted to GGUF without quantization
- dolphin-2.2.1-mistral-7b-q8_0-LOT.gguf: the original model converted to GGUF with q8_0 quantization using the --leave-output-tensor command-line option
From llama.cpp/quantize --help:
--leave-output-tensor: Will leave output.weight un(re)quantized. Increases model size but may also increase quality, especially when requantizing
The model was converted using convert.py from Georgi Gerganov's llama.cpp repo, at commit a6fc554.
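For reference, a conversion like this can be reproduced roughly as follows. This is a sketch, not the exact commands used: it assumes a local checkout of llama.cpp built at a commit of that era, the original Hugging Face model downloaded to a local directory, and the paths shown are illustrative. Script names and flags may differ between llama.cpp revisions.

```shell
# Convert the original Hugging Face model to an unquantized fp16 GGUF file.
# The model directory path here is a placeholder.
python convert.py /path/to/dolphin-2.2.1-mistral-7b \
  --outtype f16 \
  --outfile dolphin-2.2.1-mistral-7b-fp16.gguf

# Quantize the fp16 GGUF to q8_0, leaving output.weight unquantized
# (larger file, potentially better quality per the quantize --help text).
./quantize --leave-output-tensor \
  dolphin-2.2.1-mistral-7b-fp16.gguf \
  dolphin-2.2.1-mistral-7b-q8_0-LOT.gguf \
  q8_0
```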
All credit belongs to Eric Hartford for fine-tuning and releasing this model. Thank you!