Distil-Home-Assistant-Qwen3 (GGUF)

This is a GGUF version of distil-labs/distil-home-assistant-qwen3 for deployment with llama.cpp, Ollama, and other GGUF-compatible runtimes.

A fine-tuned Qwen3-0.6B model for multi-turn intent classification and slot extraction in an on-device smart home controller. Trained using knowledge distillation from a 120B teacher model, this 0.6B model delivers 96.7% tool call accuracy (exceeding the teacher) for private, low-latency smart home control.

Results

| Model | Parameters | Tool Call Accuracy | ROUGE |
|---|---|---|---|
| GPT-oss-120B (teacher) | 120B | 94.1% | 98.2% |
| This model (tuned) | 0.6B | 96.7% | 99.2% |

The fine-tuned model exceeds the 120B teacher on tool call accuracy while being 200x smaller.

Quick Start

Using llama.cpp

```bash
huggingface-cli download distil-labs/distil-home-assistant-qwen3-gguf \
    --local-dir distil-model

llama-server \
    --model distil-model/distil-home-assistant-qwen3.gguf \
    --port 8000 \
    --jinja
```

Then query via the OpenAI-compatible API:

```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "model",
    "messages": [
      {"role": "system", "content": "You are a tool-calling model working on:\n<task_description>You are an on-device smart home controller. Given a natural language command from the user, call the appropriate smart home function. If the user does not specify a required value, omit that parameter from the function call. Maintain context across conversation turns to resolve pronouns and sequential commands.</task_description>\n\nRespond to the conversation history by generating an appropriate tool call that satisfies the user request. Generate only the tool call according to the provided tool schema, do not generate anything else. Always respond with a tool call.\n\n"},
      {"role": "user", "content": "Turn off the living room lights"}
    ],
    "tools": [
      {"type": "function", "function": {"name": "toggle_lights", "description": "Turn lights on or off in a specified room", "parameters": {"type": "object", "properties": {"room": {"type": "string", "enum": ["living_room", "bedroom", "kitchen", "bathroom", "office", "hallway"]}, "state": {"type": "string", "enum": ["on", "off"]}}, "required": []}}}
    ],
    "tool_choice": "required"
  }'
```
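On success, llama-server returns an OpenAI-style chat completion whose tool call sits under `choices[0].message.tool_calls`, with the arguments JSON-encoded as a string. A minimal Python sketch of extracting the call; the `response` dict below is illustrative of the response shape, not captured from a real run:

```python
import json

# Illustrative response in the OpenAI chat-completions format; an actual
# llama-server reply has the same shape plus IDs and token usage fields.
response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "tool_calls": [
                    {
                        "type": "function",
                        "function": {
                            "name": "toggle_lights",
                            # Arguments arrive as a JSON-encoded string.
                            "arguments": '{"room": "living_room", "state": "off"}',
                        },
                    }
                ],
            }
        }
    ]
}

call = response["choices"][0]["message"]["tool_calls"][0]["function"]
name = call["name"]
args = json.loads(call["arguments"])
print(name, args)  # toggle_lights {'room': 'living_room', 'state': 'off'}
```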

Using Ollama

```bash
huggingface-cli download distil-labs/distil-home-assistant-qwen3-gguf --local-dir distil-model
cd distil-model
ollama create distil-home-assistant -f Modelfile
ollama run distil-home-assistant
```

Using with the Demo App

This model powers the Smart Home Controller demo, a text-based orchestrator that pairs a small language model (SLM) with deterministic dialogue management for smart home control.

Model Details

| Property | Value |
|---|---|
| Base Model | Qwen/Qwen3-0.6B |
| Parameters | 0.6 billion |
| Format | GGUF (F16) |
| File Size | ~2.4 GB |
| Context Length | 40,960 tokens |
| Training Data | 50 seed conversations, synthetically expanded |
| Teacher Model | GPT-oss-120B |
| Task | Multi-turn tool calling (closed book) |

Training

This model was trained using the Distil Labs platform:

  1. Seed Data: 50 hand-written multi-turn conversations covering 6 smart home functions with 2-5 user turns per conversation
  2. Synthetic Expansion: Expanded to thousands of examples using a 120B teacher model
  3. Fine-tuning: Multi-turn tool calling distillation on Qwen3-0.6B

Supported Functions

The model handles 6 smart home operations:

| Function | Description |
|---|---|
| `toggle_lights` | Turn lights on or off in a room |
| `set_thermostat` | Set temperature and heating/cooling mode |
| `lock_door` | Lock or unlock a door |
| `get_device_status` | Query device state |
| `set_scene` | Activate a predefined scene |
| `intent_unclear` | Cannot determine intent |
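In a deterministic orchestrator like the demo app, each model tool call can be routed to one handler per function, with `intent_unclear` doubling as the fallback. A sketch under those assumptions; the handler bodies and names here are placeholders, not the demo's actual code:

```python
import json

def toggle_lights(room=None, state=None):
    # Placeholder: a real handler would call the smart home hub's device API.
    return f"lights {state or '?'} in {room or 'unspecified room'}"

def intent_unclear(**_):
    # Fallback for unparseable intents or unknown function names.
    return "Sorry, I didn't understand that."

# One handler per supported function (remaining four omitted for brevity).
HANDLERS = {
    "toggle_lights": toggle_lights,
    "intent_unclear": intent_unclear,
}

def dispatch(tool_call):
    """Route a tool call {'name': ..., 'arguments': <JSON string>} to a handler."""
    handler = HANDLERS.get(tool_call["name"], intent_unclear)
    args = json.loads(tool_call.get("arguments") or "{}")
    return handler(**args)

print(dispatch({"name": "toggle_lights",
                "arguments": '{"room": "living_room", "state": "off"}'}))
# lights off in living_room
```

Because the model may omit parameters the user did not specify, handlers default every argument rather than requiring it.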

Use Cases

  • On-device smart home controllers with privacy-first design
  • Text-based smart home chatbots with structured intent routing
  • Edge deployment for local smart home hubs
  • Any multi-turn tool calling task with bounded intent taxonomy

Limitations

  • Trained on English smart home intents only
  • Covers 6 specific smart home functions, not a general-purpose tool caller
  • 96.7% accuracy means a small fraction of function calls may be incorrect
  • Temperature range is fixed to 60-80°F

License

This model is released under the Apache 2.0 license.
