ecu-pilot-qwen3-8b-GGUF

A fine-tuned Qwen3-8B model for dbt project assistance via tool calling against dbt-index.

Overview

ecu-pilot is trained to select and call the right dbt-index tool, one of the 9 listed below, for a natural-language question about a dbt project, and then to summarize the tool's result concisely.

Tools

Tool	Description
`status`	Project overview — model/test/source counts
`schema`	Database schema exploration
`search`	Full-text search across models, columns, descriptions
`node`	Detailed info on a specific model/source/test
`lineage`	Upstream/downstream dependency graph
`query`	Run SQL against the dbt-index DuckDB
`report`	Coverage reports (tests, docs, etc.)
`impact`	Blast radius analysis for model changes
`diff`	Compare project state across branches
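
Clients that use OpenAI-style function calling need each tool declared as a JSON object. The following is a minimal Python sketch that builds such declarations from the table above. The tool names and descriptions come from the table; the empty `parameters` schema is a placeholder assumption, because the real argument schemas are defined by dbt-index and are not listed in this card.

```python
# Names and descriptions are copied from the tool table above.
TOOL_DESCRIPTIONS = {
    "status": "Project overview: model/test/source counts",
    "schema": "Database schema exploration",
    "search": "Full-text search across models, columns, descriptions",
    "node": "Detailed info on a specific model/source/test",
    "lineage": "Upstream/downstream dependency graph",
    "query": "Run SQL against the dbt-index DuckDB",
    "report": "Coverage reports (tests, docs, etc.)",
    "impact": "Blast radius analysis for model changes",
    "diff": "Compare project state across branches",
}


def build_tool_definitions(descriptions):
    """Return one OpenAI-style function declaration per tool."""
    return [
        {
            "type": "function",
            "function": {
                "name": name,
                "description": description,
                # Placeholder (assumption): replace with the argument
                # schema that dbt-index actually exposes for this tool.
                "parameters": {"type": "object", "properties": {}},
            },
        }
        for name, description in descriptions.items()
    ]


TOOLS = build_tool_definitions(TOOL_DESCRIPTIONS)
```

The resulting `TOOLS` list can be passed as the `tools` field of a chat request to a server that supports tool calling.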

Training

Base model: Qwen3-8B (via unsloth/Qwen3-8B-bnb-4bit)
Method: QLoRA (r=32, alpha=32) with Unsloth + TRL SFTTrainer
Data: 667 tool-calling conversation examples from 8 dbt projects (BIRD benchmark databases)
Epochs: 1
Hardware: AWS EC2 g6e (NVIDIA L40S 48GB)

Quantization

Format: GGUF Q4_K_M (4-bit quantized via llama.cpp)
Size: ~4.7 GB

Chat Template

This model uses a custom chat template for tool calling. The template is included as chat_template.jinja in this repo.

Key differences from stock Qwen3:

Tool definitions are injected into the system message within <tools> XML tags
Tool calls use <tool_call> / </tool_call> XML delimiters
Tool responses are wrapped in <tool_response> / </tool_response> and sent as user messages
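
To make the three conventions above concrete, here is a small Python sketch of the client side: it places tool definitions in the system message, parses `<tool_call>` blocks out of an assistant reply, and wraps a tool result as a user message. It is an approximation under stated assumptions, not a copy of chat_template.jinja: the exact system-message wording, the one-JSON-object-per-block layout, and the `node` argument of the `lineage` call are illustrative.

```python
import json
import re

# Matches the JSON payload between <tool_call> and </tool_call>.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def system_message_with_tools(system_prompt, tools):
    """Append tool definitions to the system prompt inside <tools> tags."""
    tool_lines = "\n".join(json.dumps(tool) for tool in tools)
    content = f"{system_prompt}\n\n# Tools\n<tools>\n{tool_lines}\n</tools>"
    return {"role": "system", "content": content}


def parse_tool_calls(assistant_text):
    """Return the decoded JSON object from every <tool_call> block."""
    return [json.loads(payload) for payload in TOOL_CALL_RE.findall(assistant_text)]


def tool_response_message(result):
    """Wrap a tool result in <tool_response> tags and send it as a user turn."""
    body = json.dumps(result)
    return {"role": "user", "content": f"<tool_response>\n{body}\n</tool_response>"}


# Hypothetical assistant reply; the argument name "node" is an assumption.
reply = '<tool_call>\n{"name": "lineage", "arguments": {"node": "orders"}}\n</tool_call>'
calls = parse_tool_calls(reply)
```

A client loop would run each parsed call against dbt-index, append `tool_response_message(result)` to the conversation, and ask the model for the final summary.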

When serving the model with an inference server that supports Jinja chat templates (vLLM, TGI, llama.cpp server), point the server at this template file.

Usage

LM Studio

Download the .gguf file
Place it in ~/.lmstudio/models/json-l/ecu-pilot-qwen3-8b-GGUF/
Load in LM Studio, set the chat template to the contents of chat_template.jinja, and start the local server
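
Once the local server is running, it can be queried over LM Studio's OpenAI-compatible HTTP API. The sketch below uses only the Python standard library and assumes the default address `http://localhost:1234`. The model identifier is a guess; use whatever name LM Studio shows for the loaded model.

```python
import json
import urllib.request

# LM Studio's default local server address; adjust if you changed the port.
LMSTUDIO_URL = "http://localhost:1234/v1/chat/completions"


def build_request(question, model="ecu-pilot-qwen3-8b-gguf", tools=None):
    """Build a chat-completions request. The default model id is an assumption."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "temperature": 0.7,
    }
    if tools:
        # OpenAI-style tool declarations, rendered by the chat template.
        payload["tools"] = tools
    return urllib.request.Request(
        LMSTUDIO_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question, tools=None):
    """Send the question and return the assistant message as a dict."""
    with urllib.request.urlopen(build_request(question, tools=tools), timeout=120) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]


# Example (requires the LM Studio server to be running):
#   message = ask("How many models does this project have?")
#   print(message.get("content"), message.get("tool_calls"))
```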

Ollama

Create a Modelfile:

FROM ./ecu-pilot-qwen3-8b-q4km.gguf

TEMPLATE """{{- if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{- if .Tools }}<|im_start|>system
# Tools
You are provided with function signatures within <tools></tools> XML tags:
<tools>
{{- range .Tools }}
{{ json . }}
{{- end }}
</tools>
<|im_end|>
{{ end }}{{- range .Messages }}<|im_start|>{{ .Role }}
{{- if .Content }}
{{ .Content }}
{{- end }}{{- if .ToolCalls }}
<tool_call>
{{- range .ToolCalls }}
{"name": "{{ .Function.Name }}", "arguments": {{ json .Function.Arguments }}}
{{- end }}
</tool_call>
{{- end }}<|im_end|>
{{ end }}<|im_start|>assistant
"""

SYSTEM """You are an expert dbt project assistant with access to a dbt-index metadata server."""

PARAMETER stop "<|im_end|>"
PARAMETER stop "</tool_call>"
PARAMETER temperature 0.7
PARAMETER num_ctx 4096

Then:

ollama create ecu-pilot -f Modelfile
ollama run ecu-pilot
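
For programmatic use, the model created above can also be called through Ollama's HTTP chat endpoint, which accepts a `tools` list and returns any tool calls in `message.tool_calls`. The following is a minimal standard-library sketch assuming the default address `http://localhost:11434`. The single `status` declaration and its empty parameter schema are illustrative placeholders, not the real dbt-index schema.

```python
import json
import urllib.request

# Ollama's default local address.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"

# Illustrative declaration; the empty parameter schema is an assumption.
STATUS_TOOL = {
    "type": "function",
    "function": {
        "name": "status",
        "description": "Project overview: model/test/source counts",
        "parameters": {"type": "object", "properties": {}},
    },
}


def build_chat_payload(question, tools, model="ecu-pilot"):
    """Build a non-streaming /api/chat request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "tools": tools,
        "stream": False,
    }


def extract_tool_calls(response_body):
    """Return (tool_name, arguments) pairs from an /api/chat response."""
    message = response_body.get("message", {})
    return [
        (call["function"]["name"], call["function"].get("arguments", {}))
        for call in message.get("tool_calls") or []
    ]


def chat(question, tools):
    """POST the question to Ollama and return the decoded JSON response."""
    data = json.dumps(build_chat_payload(question, tools)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_CHAT_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.load(response)


# Example (requires `ollama serve` and the ecu-pilot model):
#   print(extract_tool_calls(chat("Give me a project overview", [STATUS_TOOL])))
```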

Evaluation

On 8 test prompts, 2 runs each (16 runs total):

Metric	Score
Tool called	88%
Correct tool	75%
Answer generated	100%
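
The card does not spell out how these metrics are computed. One plausible reading, sketched below in Python, scores each run on whether any tool was called, whether the called tool matched the expected one, and whether a non-empty answer came back. The record fields and the four example runs are hypothetical and are not the actual evaluation logs. Under this reading, with 16 runs, 88% and 75% would be consistent with 14 and 12 runs respectively.

```python
def score_runs(runs):
    """Compute the three fractions over a list of run records."""
    total = len(runs)
    if total == 0:
        raise ValueError("no runs to score")
    called = sum(1 for run in runs if run["called_tool"] is not None)
    correct = sum(1 for run in runs if run["called_tool"] == run["expected_tool"])
    answered = sum(1 for run in runs if run["answer"].strip())
    return {
        "tool_called": called / total,
        "correct_tool": correct / total,
        "answer_generated": answered / total,
    }


# Hypothetical records for illustration only.
example_runs = [
    {"expected_tool": "lineage", "called_tool": "lineage", "answer": "orders feeds 3 models."},
    {"expected_tool": "status", "called_tool": "status", "answer": "The project has 42 models."},
    {"expected_tool": "impact", "called_tool": "lineage", "answer": "Two models are downstream."},
    {"expected_tool": "report", "called_tool": None, "answer": "Coverage looks partial."},
]

scores = score_runs(example_runs)
```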

License

Apache 2.0
