# Sarvam-30b-uncensored
This is a GGUF-quantized version of aoxo/sarvam-30b-uncensored, a finetune of the base model sarvamai/sarvam-30b.
## Why use this?
The original finetune is distributed in bf16, which requires significantly more memory than average consumer-grade hardware provides. To solve this problem, I have GGUF-quantized the finetuned uncensored model so that it can comfortably run on consumer hardware.
## Quantization details
- Original model size: ~60 GB
- Quantization method: GGUF Q4_K_M, GGUF Q6_K
- Post-quantization size: ~19 GB (Q4_K_M), ~26 GB (Q6_K)
Tested on: NVIDIA L4 GPU (24 GB)
NOTE: I might release other quants like Q8_0 if there's demand.
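A minimal quick-start sketch for running the Q4_K_M quant with llama.cpp. The exact `.gguf` filename inside the repo is an assumption here — check the repository's file listing before downloading — and `llama-cli` must already be built from llama.cpp:

```shell
# Download the Q4_K_M quant from the Hub
# (the .gguf filename below is an assumption; verify it in the repo's file list).
huggingface-cli download ThatCultivator/sarvam-30b-Uncensored-gguf \
  sarvam-30b-uncensored-Q4_K_M.gguf --local-dir ./models

# Run inference with llama.cpp, offloading all layers to the GPU (-ngl 99)
# with a 4096-token context window (-c 4096).
./llama-cli -m ./models/sarvam-30b-uncensored-Q4_K_M.gguf \
  -ngl 99 -c 4096 -p "Hello"
```

On a 24 GB card like the tested L4, the ~19 GB Q4_K_M file should fit fully in VRAM; the ~26 GB Q6_K quant will need partial CPU offload (lower `-ngl`).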
## Model tree for ThatCultivator/sarvam-30b-Uncensored-gguf
- Base model: aoxo/sarvam-30b-uncensored