ecu-pilot-qwen3-8b-GGUF

A fine-tuned Qwen3-8B model for dbt project assistance via tool calling against dbt-index.

Overview

ecu-pilot is trained to choose the right dbt-index tool (one of nine) for a natural-language question about a dbt project, call it, and summarize the result concisely.

Tools

| Tool | Description |
|------|-------------|
| status | Project overview: model/test/source counts |
| schema | Database schema exploration |
| search | Full-text search across models, columns, descriptions |
| node | Detailed info on a specific model/source/test |
| lineage | Upstream/downstream dependency graph |
| query | Run SQL against the dbt-index DuckDB |
| report | Coverage reports (tests, docs, etc.) |
| impact | Blast radius analysis for model changes |
| diff | Compare project state across branches |
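
For clients that expect OpenAI-style function definitions, the tool list above can be sketched as follows. The names and descriptions come from the table; the empty `parameters` schema is a placeholder assumption, since the arguments each dbt-index tool accepts are not listed in this card.

```python
# Tool names and descriptions copied from the table in this card.
TOOLS = {
    "status": "Project overview: model/test/source counts",
    "schema": "Database schema exploration",
    "search": "Full-text search across models, columns, descriptions",
    "node": "Detailed info on a specific model/source/test",
    "lineage": "Upstream/downstream dependency graph",
    "query": "Run SQL against the dbt-index DuckDB",
    "report": "Coverage reports (tests, docs, etc.)",
    "impact": "Blast radius analysis for model changes",
    "diff": "Compare project state across branches",
}


def as_function_definitions(tools):
    """Build OpenAI-style function definitions with placeholder parameter schemas."""
    return [
        {
            "type": "function",
            "function": {
                "name": name,
                "description": description,
                # Placeholder: the real argument schema is not given in this card.
                "parameters": {"type": "object", "properties": {}},
            },
        }
        for name, description in tools.items()
    ]


definitions = as_function_definitions(TOOLS)
print(len(definitions))  # 9
```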

Training

  • Base model: Qwen3-8B (via unsloth/Qwen3-8B-bnb-4bit)
  • Method: QLoRA (r=32, alpha=32) with Unsloth + TRL SFTTrainer
  • Data: 667 tool-calling conversation examples from 8 dbt projects (BIRD benchmark databases)
  • Epochs: 1
  • Hardware: AWS EC2 g6e (NVIDIA L40S 48GB)
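
As a rough illustration of the setup listed above, the adapter configuration might look like the sketch below. It is not the author's training script: the `target_modules` list and `max_seq_length` value are assumptions, and only `r`, `lora_alpha`, and the base checkpoint name come from this card. The Unsloth import sits inside the function because it needs a GPU environment.

```python
# LoRA hyperparameters stated in the card.
LORA_R = 32
LORA_ALPHA = 32

# Standard LoRA scales its low-rank update by alpha / r, so these settings give 1.0.
LORA_SCALE = LORA_ALPHA / LORA_R

# Assumption: the projection layers commonly adapted in Unsloth examples.
TARGET_MODULES = [
    "q_proj", "k_proj", "v_proj", "o_proj",
    "gate_proj", "up_proj", "down_proj",
]


def load_model_for_qlora(max_seq_length=4096):
    """Load the 4-bit base model and attach LoRA adapters (needs a GPU and unsloth)."""
    from unsloth import FastLanguageModel  # imported lazily: heavy, GPU-only dependency

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/Qwen3-8B-bnb-4bit",
        max_seq_length=max_seq_length,  # assumption: not stated in the card
        load_in_4bit=True,
    )
    model = FastLanguageModel.get_peft_model(
        model,
        r=LORA_R,
        lora_alpha=LORA_ALPHA,
        target_modules=TARGET_MODULES,
    )
    return model, tokenizer
```

With `r` equal to `alpha`, the adapter update is applied at a scale of 1.0, with no extra amplification or damping.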

Quantization

  • Format: GGUF Q4_K_M (4-bit quantized via llama.cpp)
  • Size: ~4.7 GB

Chat Template

This model uses a custom chat template for tool calling. The template is included as chat_template.jinja in this repo.

Key differences from stock Qwen3:

  • Tool definitions are injected into the system message within <tools> XML tags
  • Tool calls use <tool_call> / </tool_call> XML delimiters
  • Tool responses are wrapped in <tool_response> / </tool_response> and sent as user messages
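
On the client side, these conventions can be handled roughly as in the sketch below. It assumes each `<tool_call>` block holds one or more JSON objects with `name` and `arguments` keys; the `model` argument in the usage example is hypothetical, and the helper names are illustrative rather than part of any library.

```python
import json
import re

# Matches the body of every <tool_call> ... </tool_call> block in the model output.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def parse_tool_calls(text):
    """Return every JSON object found inside <tool_call> blocks, in order."""
    decoder = json.JSONDecoder()
    calls = []
    for block in TOOL_CALL_RE.findall(text):
        pos = 0
        while True:
            # raw_decode does not skip leading whitespace, so do it by hand.
            while pos < len(block) and block[pos].isspace():
                pos += 1
            if pos >= len(block):
                break
            obj, pos = decoder.raw_decode(block, pos)
            calls.append(obj)
    return calls


def wrap_tool_response(result):
    """Wrap a tool result as a user message, as the template expects."""
    body = json.dumps(result)
    return {
        "role": "user",
        "content": "<tool_response>\n" + body + "\n</tool_response>",
    }
```

For example, with a hypothetical `lineage` call:

```python
output = 'Checking.\n<tool_call>\n{"name": "lineage", "arguments": {"model": "orders"}}\n</tool_call>'
print(parse_tool_calls(output)[0]["name"])  # lineage
```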

When serving the model with an inference server that supports Jinja templates (vLLM, TGI, llama.cpp server), point the server at this template file.

Usage

LM Studio

  1. Download the .gguf file
  2. Place in ~/.lmstudio/models/json-l/ecu-pilot-qwen3-8b-GGUF/
  3. Load in LM Studio, set the chat template to the contents of chat_template.jinja, and start the local server

Ollama

Create a Modelfile:

FROM ./ecu-pilot-qwen3-8b-q4km.gguf

TEMPLATE """{{- if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{- if .Tools }}<|im_start|>system
# Tools
You are provided with function signatures within <tools></tools> XML tags:
<tools>
{{- range .Tools }}
{{ json . }}
{{- end }}
</tools>
<|im_end|>
{{ end }}{{- range .Messages }}<|im_start|>{{ .Role }}
{{- if .Content }}
{{ .Content }}
{{- end }}{{- if .ToolCalls }}
<tool_call>
{{- range .ToolCalls }}
{"name": "{{ .Function.Name }}", "arguments": {{ json .Function.Arguments }}}
{{- end }}
</tool_call>
{{- end }}<|im_end|>
{{ end }}<|im_start|>assistant
"""

SYSTEM """You are an expert dbt project assistant with access to a dbt-index metadata server."""

PARAMETER stop "<|im_end|>"
PARAMETER stop "<tool_call>"
PARAMETER temperature 0.7
PARAMETER num_ctx 4096

Then:

ollama create ecu-pilot -f Modelfile
ollama run ecu-pilot
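
Once the model is created, it can also be queried over Ollama's local HTTP API. The sketch below assumes a default Ollama install listening on `localhost:11434` and sends a plain question without tool definitions; the helper names are illustrative, and how tool calls surface through this endpoint depends on the template and stop parameters above.

```python
import json
import urllib.request

# Ollama's default local chat endpoint (assumes a default install).
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"


def build_chat_request(question, model="ecu-pilot"):
    """Build a non-streaming chat request for the model created above."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "stream": False,  # ask for a single JSON response instead of a stream
    }
    return urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question):
    """Send the question to the local Ollama server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(question)) as response:
        body = json.load(response)
    return body["message"]["content"]
```

Calling `ask("How many models are in this project?")` requires the Ollama server to be running.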

Evaluation

On 8 test prompts (2 runs each):

| Metric | Score |
|--------|-------|
| Tool called | 88% |
| Correct tool | 75% |
| Answer generated | 100% |
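
The three metrics can be computed from per-run records along these lines. This is a sketch, not the author's evaluation script: the record fields are hypothetical, the sample runs are made up for illustration, and it assumes every percentage is taken over all runs. Under that assumption, 16 runs would put 88% at 14/16 (rounded) and 75% at 12/16.

```python
def score(runs):
    """Compute the three rates over all runs (assumed denominator: every run)."""
    total = len(runs)
    called = sum(1 for r in runs if r["called_tool"] is not None)
    correct = sum(1 for r in runs if r["called_tool"] == r["expected_tool"])
    answered = sum(1 for r in runs if r["answer"].strip())
    return {
        "tool_called": called / total,
        "correct_tool": correct / total,
        "answer_generated": answered / total,
    }


# Made-up records for illustration only; not the evaluation data behind the table.
sample_runs = [
    {"expected_tool": "lineage", "called_tool": "lineage", "answer": "Two upstream models."},
    {"expected_tool": "status", "called_tool": "report", "answer": "Coverage summary."},
    {"expected_tool": "search", "called_tool": None, "answer": "No tool was used."},
    {"expected_tool": "impact", "called_tool": "impact", "answer": "Three models affected."},
]

print(score(sample_runs))
```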

License

Apache 2.0
