I cannot run this model in Colab, please help

#8
by Vuthuy6586 - opened

Thank you very much for this model.
I tried it in Colab with an L4 GPU.
But when I use the quick-start code from your README.md (with transformers and trl already installed via !pip install transformers==4.56.2 trl==0.22.2 bitsandbytes), after about 12 minutes the output shows an OutOfMemory error like the image below. Please help me figure out what's wrong, because I have already run OSS-20B successfully on an L4 GPU.
Thank you very much

[image: screenshot of the OutOfMemory error]

Thanks for your feedback. Can you try running the model with the offload options?

from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Assuming a 4-bit bitsandbytes config here; use the exact one from the README
quantization_config = BitsAndBytesConfig(load_in_4bit=True)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quantization_config,
    device_map="auto",           # split layers across GPU and CPU automatically
    offload_folder="offload",    # spill weights that don't fit to disk
    offload_state_dict=True,     # offload the state dict during loading to save RAM
)
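If offloading alone still runs out of memory, Accelerate's `max_memory` argument lets you cap each device's budget so more layers spill over to CPU or disk. The budgets below are assumptions sized for an L4's 24 GB of VRAM and typical Colab RAM; tune them for your runtime:

```python
# Optional: cap per-device memory so device_map="auto" offloads more aggressively.
# Keys are GPU indices plus "cpu"; values are assumed budgets, not measured ones.
max_memory = {0: "20GiB", "cpu": "28GiB"}

# Then pass it alongside the other arguments:
# model = AutoModelForCausalLM.from_pretrained(
#     model_id,
#     quantization_config=quantization_config,
#     device_map="auto",
#     max_memory=max_memory,
#     offload_folder="offload",
# )
```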

Thanks for your answer, but I still get the same OutOfMemory error. Can you show me an instance where you deployed it successfully? This is my Colab notebook: https://colab.research.google.com/drive/1HrmnxU1BUddA5Z4rAdQ5-EKwsDSt2HZl?usp=sharing
I appreciate your help, thanks again.

Thanks for the feedback! I updated the repo and resolved the dependency issues. You can now use the base openai/gpt-oss-20b model and add the adapter layer from my repository on top. With this setup, the model should load and run without the OutOfMemory errors you experienced.
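For reference, a minimal sketch of that base-model-plus-adapter loading pattern, assuming the adapter is published in PEFT (LoRA) format; the adapter repo id below is a placeholder, not the real name:

```python
def load_model(base_id="openai/gpt-oss-20b",
               adapter_id="your-username/your-adapter"):  # placeholder repo id
    """Load the base model, then attach the fine-tuned adapter on top."""
    # Deferred imports: the heavy dependencies load only when this is called.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
    model = PeftModel.from_pretrained(base, adapter_id)  # adapter weights on top
    return tokenizer, model
```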

Thanks for helping me. Now I can run your model, and it works great. Thanks again!

Vuthuy6586 changed discussion status to closed
