# Distil-Home-Assistant-Qwen3 (GGUF)

This is a GGUF version of distil-labs/distil-home-assistant-qwen3 for deployment with llama.cpp, Ollama, and other GGUF-compatible runtimes.

A fine-tuned Qwen3-0.6B model for multi-turn intent classification and slot extraction in an on-device smart home controller. Trained using knowledge distillation from a 120B teacher model, this 0.6B model delivers 96.7% tool call accuracy (exceeding the teacher) for private, low-latency smart home control.
## Results
| Model | Parameters | Tool Call Accuracy | ROUGE |
|---|---|---|---|
| GPT-oss-120B (teacher) | 120B | 94.1% | 98.2% |
| This model (tuned) | 0.6B | 96.7% | 99.2% |
The fine-tuned model exceeds the 120B teacher on tool call accuracy while being 200x smaller.
## Quick Start

### Using llama.cpp

```bash
huggingface-cli download distil-labs/distil-home-assistant-qwen3-gguf \
  --local-dir distil-model

llama-server \
  --model distil-model/distil-home-assistant-qwen3.gguf \
  --port 8000 \
  --jinja
```
Then query via the OpenAI-compatible API:
```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "model",
    "messages": [
      {"role": "system", "content": "You are a tool-calling model working on:\n<task_description>You are an on-device smart home controller. Given a natural language command from the user, call the appropriate smart home function. If the user does not specify a required value, omit that parameter from the function call. Maintain context across conversation turns to resolve pronouns and sequential commands.</task_description>\n\nRespond to the conversation history by generating an appropriate tool call that satisfies the user request. Generate only the tool call according to the provided tool schema, do not generate anything else. Always respond with a tool call.\n\n"},
      {"role": "user", "content": "Turn off the living room lights"}
    ],
    "tools": [
      {"type": "function", "function": {"name": "toggle_lights", "description": "Turn lights on or off in a specified room", "parameters": {"type": "object", "properties": {"room": {"type": "string", "enum": ["living_room", "bedroom", "kitchen", "bathroom", "office", "hallway"]}, "state": {"type": "string", "enum": ["on", "off"]}}, "required": []}}}
    ],
    "tool_choice": "required"
  }'
```
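The same request can be issued from Python. The sketch below builds the chat-completions payload and parses the tool call out of the response; the `build_payload`/`extract_tool_call` helper names and the sample response are illustrative, but the field layout follows the OpenAI-compatible schema llama-server emits. The system prompt placeholder should be replaced with the full prompt from the curl example above.

```python
import json

# Placeholder: paste the full tool-calling system prompt from the curl example above.
SYSTEM_PROMPT = "You are a tool-calling model working on: ..."

# Tool schema copied from the curl example.
TOGGLE_LIGHTS_TOOL = {
    "type": "function",
    "function": {
        "name": "toggle_lights",
        "description": "Turn lights on or off in a specified room",
        "parameters": {
            "type": "object",
            "properties": {
                "room": {"type": "string", "enum": ["living_room", "bedroom", "kitchen",
                                                    "bathroom", "office", "hallway"]},
                "state": {"type": "string", "enum": ["on", "off"]},
            },
            "required": [],
        },
    },
}


def build_payload(user_message: str) -> dict:
    """Request body for POST /v1/chat/completions on the llama-server endpoint."""
    return {
        "model": "model",
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
        "tools": [TOGGLE_LIGHTS_TOOL],
        "tool_choice": "required",
    }


def extract_tool_call(response: dict) -> tuple[str, dict]:
    """Pull (function_name, arguments) out of an OpenAI-style tool-call response."""
    call = response["choices"][0]["message"]["tool_calls"][0]["function"]
    return call["name"], json.loads(call["arguments"])


# Illustrative response, shaped like what the server returns for the curl example.
sample_response = {
    "choices": [{
        "message": {
            "tool_calls": [{
                "function": {
                    "name": "toggle_lights",
                    "arguments": '{"room": "living_room", "state": "off"}',
                }
            }]
        }
    }]
}

name, args = extract_tool_call(sample_response)
print(name, args)  # toggle_lights {'room': 'living_room', 'state': 'off'}
```

In a live setup the payload would be sent with any HTTP client, e.g. `requests.post("http://localhost:8000/v1/chat/completions", json=build_payload("Turn off the living room lights"))`.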
### Using Ollama

```bash
huggingface-cli download distil-labs/distil-home-assistant-qwen3-gguf --local-dir distil-model
cd distil-model
ollama create distil-home-assistant -f Modelfile
ollama run distil-home-assistant
```
### Using with the Demo App

This model powers the Smart Home Controller demo: a text-based orchestrator that pairs an SLM with deterministic dialogue management for smart home control.
## Model Details
| Property | Value |
|---|---|
| Base Model | Qwen/Qwen3-0.6B |
| Parameters | 0.6 billion |
| Format | GGUF (F16) |
| File Size | ~2.4 GB |
| Context Length | 40,960 tokens |
| Training Data | 50 seed conversations, synthetically expanded |
| Teacher Model | GPT-oss-120B |
| Task | Multi-turn tool calling (closed book) |
## Training
This model was trained using the Distil Labs platform:
- Seed Data: 50 hand-written multi-turn conversations covering 6 smart home functions with 2-5 user turns per conversation
- Synthetic Expansion: Expanded to thousands of examples using a 120B teacher model
- Fine-tuning: Multi-turn tool calling distillation on Qwen3-0.6B
## Supported Functions
The model handles 6 smart home operations:
| Function | Description |
|---|---|
| `toggle_lights` | Turn lights on or off in a room |
| `set_thermostat` | Set temperature and heating/cooling mode |
| `lock_door` | Lock or unlock a door |
| `get_device_status` | Query device state |
| `set_scene` | Activate a predefined scene |
| `intent_unclear` | Cannot determine intent |
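Because the taxonomy is bounded, a thin dispatcher downstream of the model can route each parsed tool call to a handler, with `intent_unclear` as the fallback for anything unrecognized. A minimal sketch, where the handler bodies and the parameter names for functions other than `toggle_lights` are placeholder assumptions (the card only documents the `toggle_lights` schema):

```python
import json

# Hypothetical handlers: real implementations would talk to the home hub.
def toggle_lights(room=None, state=None):
    return f"lights:{room}:{state}"

def set_thermostat(temperature=None, mode=None):
    return f"thermostat:{temperature}:{mode}"

def lock_door(door=None, action=None):
    return f"door:{door}:{action}"

def get_device_status(device=None):
    return f"status:{device}"

def set_scene(scene=None):
    return f"scene:{scene}"

def intent_unclear(**_):
    return "Sorry, I didn't understand that."

HANDLERS = {
    "toggle_lights": toggle_lights,
    "set_thermostat": set_thermostat,
    "lock_door": lock_door,
    "get_device_status": get_device_status,
    "set_scene": set_scene,
    "intent_unclear": intent_unclear,
}


def dispatch(name: str, arguments: str) -> str:
    """Route a model tool call (name + JSON argument string) to its handler.

    Unknown names fall back to intent_unclear, keeping behavior inside the
    bounded six-function taxonomy."""
    handler = HANDLERS.get(name, intent_unclear)
    return handler(**json.loads(arguments))


print(dispatch("toggle_lights", '{"room": "kitchen", "state": "on"}'))  # lights:kitchen:on
```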
## Use Cases
- On-device smart home controllers with privacy-first design
- Text-based smart home chatbots with structured intent routing
- Edge deployment for local smart home hubs
- Any multi-turn tool calling task with bounded intent taxonomy
## Limitations
- Trained on English smart home intents only
- Covers 6 specific smart home functions; not a general-purpose tool caller
- 96.7% accuracy means a small fraction of function calls may be incorrect
- Temperature range is fixed to 60-80°F
## License
This model is released under the Apache 2.0 license.