Convert checkpoint files to float16
#6
by mkardas - opened
No description provided.
mkardas changed pull request status to open
How can I implement this?
What are you trying to achieve?
The 1.3b model uses most of my 8 GB of VRAM, so large requests push it over pretty quickly. I was hoping this would cut memory use down.
You can load your model with:
```python
import torch
from transformers import OPTForCausalLM

model = OPTForCausalLM.from_pretrained(
    "facebook/galactica-1.3b",
    torch_dtype=torch.float16,
    device_map="auto",
)
```
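For a back-of-the-envelope check on why half precision helps: each float32 weight takes 4 bytes and each float16 weight takes 2, so loading the checkpoint in fp16 roughly halves the weight memory. A quick sketch (using NumPy dtypes as a stand-in, and 1.3B as a rough parameter count):

```python
import numpy as np

# float32 stores each value in 4 bytes; float16 in 2 bytes,
# so loading the weights in half precision roughly halves memory use.
n_params = 1_300_000_000  # rough parameter count of the 1.3b model

fp32_bytes = n_params * np.dtype(np.float32).itemsize
fp16_bytes = n_params * np.dtype(np.float16).itemsize

print(f"float32: {fp32_bytes / 1e9:.1f} GB")  # ~5.2 GB
print(f"float16: {fp16_bytes / 1e9:.1f} GB")  # ~2.6 GB
```

Note this only covers the weights themselves; activations and the KV cache during generation add on top, which is why long requests can still run out of VRAM.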
mkardas changed pull request status to merged