8bit version of the model
#8
by varun500 - opened
No description provided.
varun500 changed pull request title from *bit version of the model to 8bit version of the model
An 8bit version of the model that can be loaded in 16GB of GPU VRAM would be helpful.
This is a 4bit GPTQ model. I could make an 8bit GPTQ, but there's no point because HF models can already be loaded in 8bit using bitsandbytes. If you want 8bit, please use https://huggingface.co/TheBloke/stable-vicuna-13B-HF and specify load_in_8bit=True, like I told you on GitHub.
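A minimal sketch of that loading pattern, assuming transformers, accelerate and bitsandbytes are installed and a CUDA GPU with roughly 15GB of free VRAM is available (the function and constant names here are illustrative, not from the thread):

```python
# Model referenced in the reply above.
MODEL_ID = "TheBloke/stable-vicuna-13B-HF"

# Keyword arguments that trigger bitsandbytes 8bit quantisation at load time.
LOAD_KWARGS = {
    "load_in_8bit": True,   # quantise weights to int8 via bitsandbytes
    "device_map": "auto",   # let accelerate place layers on available GPUs
}

def load_8bit(model_id: str = MODEL_ID):
    """Load model and tokenizer with 8bit weights (requires a CUDA GPU)."""
    # Lazy import so the module can be inspected without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, **LOAD_KWARGS)
    return model, tokenizer
```

Note this quantises the fp16 HF checkpoint on the fly at load time, which is why a separate 8bit GPTQ upload adds little value.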
TheBloke changed pull request status to closed
Sure, will do that.