how?

#1
by servic - opened

how do i run this model and pair it with home assistant?

distil labs org

Hey! A few options for running this model:

Running the model locally

Since it's only 350M parameters, you can run it with most inference runtimes. The model card mentions support for Ollama, vLLM, llama.cpp, or anything that loads Safetensors. For example, with Ollama you'd create a Modelfile pointing at the weights and then run it with `ollama run`. There's also an ONNX version of the base model if you want to run it on an NPU or an embedded device.
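A minimal sketch of the Ollama route, assuming the weights have been converted to a GGUF file (the filename and parameter below are placeholders, not from the model card):

```
# Modelfile - points Ollama at a local GGUF conversion of the weights
FROM ./model-350m.gguf
PARAMETER temperature 0.2
```

Then register and run it with `ollama create my-350m -f Modelfile` followed by `ollama run my-350m`.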

Pairing with Home Assistant

Could you clarify which home assistant you're looking to pair it with - do you mean Home Assistant (the open-source platform), or something else like Alexa/Google Home? The setup path differs quite a bit depending on the platform.

In the meantime, the model card's Deployment section covers the inference runtimes you can use to run the model locally (Ollama, vLLM, llama.cpp, ONNX, etc.).
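If you do mean the open-source Home Assistant platform, one common path is to serve the model with Ollama and point Home Assistant's Ollama integration (usable as a conversation agent) at it. As a rough sketch, here's how you'd talk to Ollama's HTTP API directly, assuming a server on the default port; the model name `my-350m` is a placeholder:

```python
import json

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint


def build_request(model: str, prompt: str) -> dict:
    """Build the JSON payload for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}


if __name__ == "__main__":
    # Requires a running Ollama server with the model loaded.
    import urllib.request

    payload = json.dumps(build_request("my-350m", "Turn off the lights")).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])
```

Home Assistant's integration handles this plumbing for you once you point it at the Ollama server's URL, so the snippet is mainly useful for verifying the server works before wiring it in.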

More information

The blog post might have more details on the end-to-end setup.
