How do I deploy locally? Also, is there any way to use it with Ollama?

#1
by dazz01 - opened

How do I deploy locally? Also, is there any way to use it with Ollama?


For local deployment:
Load the base model Qwen/Qwen3-0.6B-Base and apply this LoRA adapter using PeftModel.from_pretrained(). You can then run inference in Python with model.generate() or wrap it in a simple Gradio/FastAPI app for local serving.

For Ollama:
Not directly. Ollama cannot load this repo because it is a LoRA adapter, not a full model. You must first merge the LoRA into the base model, convert the merged model to GGUF, and then create an Ollama Modelfile from that GGUF file.
