Distil-Home-Assistant-Qwen3 (GGUF)

This is a GGUF version of distil-labs/distil-home-assistant-qwen3 for deployment with llama.cpp, Ollama, and other GGUF-compatible runtimes.

A fine-tuned Qwen3-0.6B model for multi-turn intent classification and slot extraction in an on-device smart home controller. Trained using knowledge distillation from a 120B teacher model, this 0.6B model delivers 96.7% tool call accuracy (exceeding the teacher) for private, low-latency smart home control.

Results

| Model | Parameters | Tool Call Accuracy | ROUGE |
|---|---|---|---|
| GPT-oss-120B (teacher) | 120B | 94.1% | 98.2% |
| This model (tuned) | 0.6B | 96.7% | 99.2% |

The fine-tuned model exceeds the 120B teacher on tool call accuracy while being 200x smaller.

Quick Start

Using llama.cpp

```bash
huggingface-cli download distil-labs/distil-home-assistant-qwen3-gguf \
    --local-dir distil-model

llama-server \
    --model distil-model/distil-home-assistant-qwen3.gguf \
    --port 8000 \
    --jinja
```

Then query via the OpenAI-compatible API:

```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "model",
    "messages": [
      {"role": "system", "content": "You are a tool-calling model working on:\n<task_description>You are an on-device smart home controller. Given a natural language command from the user, call the appropriate smart home function. If the user does not specify a required value, omit that parameter from the function call. Maintain context across conversation turns to resolve pronouns and sequential commands.</task_description>\n\nRespond to the conversation history by generating an appropriate tool call that satisfies the user request. Generate only the tool call according to the provided tool schema, do not generate anything else. Always respond with a tool call.\n\n"},
      {"role": "user", "content": "Turn off the living room lights"}
    ],
    "tools": [
      {"type": "function", "function": {"name": "toggle_lights", "description": "Turn lights on or off in a specified room", "parameters": {"type": "object", "properties": {"room": {"type": "string", "enum": ["living_room", "bedroom", "kitchen", "bathroom", "office", "hallway"]}, "state": {"type": "string", "enum": ["on", "off"]}}, "required": []}}}
    ],
    "tool_choice": "required"
  }'
```
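On success, llama-server returns an OpenAI-style chat completion whose tool call sits under `choices[0].message.tool_calls`, with the arguments JSON-encoded as a string. A minimal Python sketch of extracting the call; the `response` dict below is illustrative of the response shape, not captured from a real run:

```python
import json

# Illustrative response in the OpenAI chat-completions format; an actual
# llama-server reply has the same shape plus IDs and token usage fields.
response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "tool_calls": [
                    {
                        "type": "function",
                        "function": {
                            "name": "toggle_lights",
                            # Arguments arrive as a JSON-encoded string.
                            "arguments": '{"room": "living_room", "state": "off"}',
                        },
                    }
                ],
            }
        }
    ]
}

call = response["choices"][0]["message"]["tool_calls"][0]["function"]
name = call["name"]
args = json.loads(call["arguments"])
print(name, args)  # toggle_lights {'room': 'living_room', 'state': 'off'}
```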

Using Ollama

```bash
huggingface-cli download distil-labs/distil-home-assistant-qwen3-gguf --local-dir distil-model
cd distil-model
ollama create distil-home-assistant -f Modelfile
ollama run distil-home-assistant
```

Using with the Demo App

This model powers the Smart Home Controller demo, a text-based orchestrator that pairs a small language model (SLM) with deterministic dialogue management for smart home control.

Model Details

| Property | Value |
|---|---|
| Base Model | Qwen/Qwen3-0.6B |
| Parameters | 0.6 billion |
| Format | GGUF (F16) |
| File Size | ~2.4 GB |
| Context Length | 40,960 tokens |
| Training Data | 50 seed conversations, synthetically expanded |
| Teacher Model | GPT-oss-120B |
| Task | Multi-turn tool calling (closed book) |

Training

This model was trained using the Distil Labs platform:

  1. Seed Data: 50 hand-written multi-turn conversations covering 6 smart home functions with 2-5 user turns per conversation
  2. Synthetic Expansion: Expanded to thousands of examples using a 120B teacher model
  3. Fine-tuning: Multi-turn tool calling distillation on Qwen3-0.6B

Supported Functions

The model handles 6 smart home operations:

| Function | Description |
|---|---|
| `toggle_lights` | Turn lights on or off in a room |
| `set_thermostat` | Set temperature and heating/cooling mode |
| `lock_door` | Lock or unlock a door |
| `get_device_status` | Query device state |
| `set_scene` | Activate a predefined scene |
| `intent_unclear` | Cannot determine intent |
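In a deterministic orchestrator like the demo app, each model tool call can be routed to one handler per function, with `intent_unclear` doubling as the fallback. A sketch under those assumptions; the handler bodies and names here are placeholders, not the demo's actual code:

```python
import json

def toggle_lights(room=None, state=None):
    # Placeholder: a real handler would call the smart home hub's device API.
    return f"lights {state or '?'} in {room or 'unspecified room'}"

def intent_unclear(**_):
    # Fallback for unparseable intents or unknown function names.
    return "Sorry, I didn't understand that."

# One handler per supported function (remaining four omitted for brevity).
HANDLERS = {
    "toggle_lights": toggle_lights,
    "intent_unclear": intent_unclear,
}

def dispatch(tool_call):
    """Route a tool call {'name': ..., 'arguments': <JSON string>} to a handler."""
    handler = HANDLERS.get(tool_call["name"], intent_unclear)
    args = json.loads(tool_call.get("arguments") or "{}")
    return handler(**args)

print(dispatch({"name": "toggle_lights",
                "arguments": '{"room": "living_room", "state": "off"}'}))
# lights off in living_room
```

Because the model may omit parameters the user did not specify, handlers default every argument rather than requiring it.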

Use Cases

  • On-device smart home controllers with privacy-first design
  • Text-based smart home chatbots with structured intent routing
  • Edge deployment for local smart home hubs
  • Any multi-turn tool calling task with bounded intent taxonomy

Limitations

  • Trained on English smart home intents only
  • Covers 6 specific smart home functions, not a general-purpose tool caller
  • 96.7% accuracy means a small fraction of function calls may be incorrect
  • Temperature range is fixed to 60-80°F

License

This model is released under the Apache 2.0 license.
