From the **Quantized Models (GGUF, IQ, Imatrix)** collection: various GGUF quantizations of small models. Models marked with a checkmark are personal favorites; an orange arrow means the model is still being uploaded.
GGUF quants for Nitral-AI/CaptainErisNebula-12B-Chimera-v1.1's recipe.
Author-recommended initial SillyTavern presets:
This is an improvement on the previous experimental version.
- Not "chaotic", and at a size most people can run locally with good inference speeds.
- The model does not show excessive alignment, so it should be good for most scenarios/writing situations.
- Feel free to use some light system prompting to nudge it out of a blocker if needed.
- It adheres well to characters and instructions.
Thank you so much, "crazy chef" and "mad scientist", Nitral!
```
# Using the latest llama.cpp release version at the time: b6258.
# Imatrix was based on the full FP16-precision GGUF.
```
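As a sketch, the two full-precision conversions can be produced with llama.cpp's converter script; the local directory and output file names below are illustrative assumptions, not the author's exact paths:

```shell
# Convert the BF16 HuggingFace checkpoint to two full-precision GGUFs.
# (Illustrative paths; adjust to your local setup.)

# FP16 GGUF, used only to compute the calibration imatrix:
python convert_hf_to_gguf.py ./CaptainErisNebula-12B-Chimera-v1.1 \
  --outtype f16 \
  --outfile CaptainErisNebula-12B-Chimera-v1.1.F16.gguf

# BF16 GGUF, used as the source for the final quantizations:
python convert_hf_to_gguf.py ./CaptainErisNebula-12B-Chimera-v1.1 \
  --outtype bf16 \
  --outfile CaptainErisNebula-12B-Chimera-v1.1.BF16.gguf
```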
```
START: BF16 HuggingFace model
  ↓
(1) Convert to full-precision GGUF
  ├─ FP16 GGUF (for the calibration imatrix)
  └─ BF16 GGUF (for quantization)
  ↓
(2) Generate imatrix (from the FP16 GGUF)
  ↓
imatrix.fp16.gguf
  ↓
(3) Quantize with imatrix (using the BF16 GGUF)
  ↓
Final quantized GGUF models
  ↓
END
```
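Steps (2) and (3) of the pipeline above can be sketched with llama.cpp's `llama-imatrix` and `llama-quantize` tools. The calibration text file name and the Q4_K_M target are assumptions for illustration; the actual calibration data and quantization types used by the author are not specified here:

```shell
# (2) Generate the importance matrix from the FP16 GGUF,
#     using a calibration text file (assumed name):
./llama-imatrix \
  -m CaptainErisNebula-12B-Chimera-v1.1.F16.gguf \
  -f calibration.txt \
  -o imatrix.fp16.gguf

# (3) Quantize the BF16 GGUF with the imatrix
#     (Q4_K_M shown as one example target type):
./llama-quantize \
  --imatrix imatrix.fp16.gguf \
  CaptainErisNebula-12B-Chimera-v1.1.BF16.gguf \
  CaptainErisNebula-12B-Chimera-v1.1.Q4_K_M.gguf \
  Q4_K_M
```

Repeating step (3) with other type names (e.g. IQ variants) produces the rest of the quantization set from the same BF16 source and imatrix.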
Base model: Nitral-AI/CaptainErisNebula-12B-Chimera-v1.1