How do I deploy this locally? Also, is there any way to use it with Ollama?
#1
by dazz01 - opened
How do I deploy this locally? Also, is there any way to use it with Ollama?
For local deployment:
Load the base model Qwen/Qwen3-0.6B-Base and apply this LoRA adapter using PeftModel.from_pretrained(). You can then run inference in Python with model.generate() or wrap it in a simple Gradio/FastAPI app for local serving.
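The loading steps above can be sketched as follows. This is a minimal example, not an official recipe; the adapter repo id below is a placeholder for this repo's actual id, and the prompt is arbitrary:

```python
# Sketch: load the base model, attach the LoRA adapter, run generate().
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen3-0.6B-Base"
adapter_id = "<user>/<this-adapter-repo>"  # placeholder: use this repo's id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # applies the LoRA weights
model.eval()

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

For local serving, you can wrap `model.generate()` in a Gradio `Interface` or a FastAPI endpoint in the same process.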
For Ollama:
Not directly. Ollama cannot load this repo because it is a LoRA adapter, not a full model. You must first merge the LoRA into the base model, convert the merged model to GGUF, and then create an Ollama Modelfile from that GGUF file.
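The merge step can be sketched like this (paths and the adapter repo id are placeholders):

```python
# Sketch: merge the LoRA into the base weights and save a full HF model
# that llama.cpp can then convert to GGUF.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen3-0.6B-Base"
adapter_id = "<user>/<this-adapter-repo>"  # placeholder: use this repo's id

base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto")
merged = PeftModel.from_pretrained(base, adapter_id).merge_and_unload()
merged.save_pretrained("./merged-model")
AutoTokenizer.from_pretrained(base_id).save_pretrained("./merged-model")
```

From there you would convert with llama.cpp (e.g. `python convert_hf_to_gguf.py ./merged-model --outfile merged.gguf`), write a Modelfile containing `FROM ./merged.gguf`, and register it with `ollama create my-model -f Modelfile`. Exact script names can vary between llama.cpp versions, so check the version you have installed.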