Qwen3-4B-DBT-Instruct — GGUF

A fine-tuned version of unsloth/Qwen3-4B-Instruct-2507-unsloth-bnb-4bit (the pre-quantized 4-bit Unsloth variant of Qwen3-4B-Instruct) specialized in converting natural language business questions into complete, multi-file dbt DAGs.

Looking for the LoRA adapter? → tdelard/Qwen3-4B-DBT-Instruct-LoRA


What does this model do?

Given a business question and a SQL schema, the model generates a full dbt project structure:

  • Staging layer — stg_*.sql files that clean and rename raw source data
  • YAML sources — _sources.yml / _stg_*.yml schema files with column definitions
  • Intermediate models — int_*.sql files that join and enrich staging data
  • Marts layer — fct_*.sql or dim_*.sql final business models

Example prompt:

Business question: Show the total revenue per product category, filtered to orders placed in the last 12 months.
SQL context: CREATE TABLE orders (...); CREATE TABLE products (...);

Example output: A ready-to-use dbt DAG with staging, intermediate, and mart SQL + YAML files.


Available GGUF files

File                                Quantization  Size     Recommended for
qwen3-4b-instruct-2507.Q4_K_M.gguf  Q4_K_M        ~2.5 GB  Most users (best size/quality trade-off)
qwen3-4b-instruct-2507.Q5_K_M.gguf  Q5_K_M        ~2.9 GB  Higher quality, still fits in 8 GB RAM
qwen3-4b-instruct-2507.Q8_0.gguf    Q8_0          ~4.3 GB  Maximum quality, requires ~6 GB RAM

Usage

LM Studio / Jan

  1. Search for tdelard/Qwen3-4b-DBT-Instruct-GGUF in the model browser, or download the GGUF manually.
  2. Load the model and use the system prompt below.

Ollama

A Modelfile is included in this repository for easy import:

ollama create qwen3-dbt -f Modelfile
ollama run qwen3-dbt
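If you prefer to write your own Modelfile instead of using the bundled one, a minimal sketch might look like this (illustrative only; the bundled Modelfile is authoritative, and the filename assumes the Q4_K_M download):

```
FROM ./qwen3-4b-instruct-2507.Q4_K_M.gguf

# System prompt matching the one used in the Python example further down.
SYSTEM """You are a dbt expert. Given a business question and a SQL schema, generate a complete, production-ready dbt DAG including staging SQL files, YAML schema files, intermediate models, and mart models. Use proper dbt conventions: ref(), source(), naming prefixes (stg_, int_, fct_, dim_)."""

# A lower temperature tends to help with structured SQL/YAML output.
PARAMETER temperature 0.2
```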

llama.cpp

llama-cli -hf tdelard/Qwen3-4b-DBT-Instruct-GGUF --jinja \
  -m qwen3-4b-instruct-2507.Q4_K_M.gguf

Python (llama-cpp-python)

from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="tdelard/Qwen3-4b-DBT-Instruct-GGUF",
    filename="qwen3-4b-instruct-2507.Q4_K_M.gguf",
    n_ctx=2048,
)

response = llm.create_chat_completion(
    messages=[
        {
            "role": "system",
            "content": (
                "You are a dbt expert. Given a business question and a SQL schema, "
                "generate a complete, production-ready dbt DAG including staging SQL files, "
                "YAML schema files, intermediate models, and mart models. "
                "Use proper dbt conventions: ref(), source(), naming prefixes (stg_, int_, fct_, dim_)."
            ),
        },
        {
            "role": "user",
            "content": (
                "Business question: Show the total revenue per product category.\n"
                "SQL context: CREATE TABLE orders (order_id INT, product_id INT, amount DECIMAL); "
                "CREATE TABLE products (product_id INT, category VARCHAR, name VARCHAR);"
            ),
        },
    ]
)
print(response["choices"][0]["message"]["content"])
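The completion arrives as a single string; to turn it into files on disk you need a small post-processing step. A sketch, assuming (purely hypothetically) that each generated file is prefixed with a `-- file: <path>` comment line; the model's actual output convention may differ, so adjust the pattern accordingly:

```python
import re
from pathlib import Path

def split_dbt_output(text: str) -> dict[str, str]:
    """Split generated output into {relative_path: contents}.

    Hypothetical convention: each file starts with a line such as
    `-- file: models/staging/stg_orders.sql`.
    """
    files: dict[str, str] = {}
    current = None
    buf: list[str] = []
    for line in text.splitlines():
        m = re.match(r"^--\s*file:\s*(\S+)", line)
        if m:
            if current is not None:
                files[current] = "\n".join(buf).strip() + "\n"
            current, buf = m.group(1), []
        elif current is not None:
            buf.append(line)
    if current is not None:
        files[current] = "\n".join(buf).strip() + "\n"
    return files

def write_files(files: dict[str, str], root: str = "dbt_project") -> None:
    """Materialize the parsed files under a project root."""
    for rel, content in files.items():
        path = Path(root) / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content)
```

Running `dbt parse` on the written project is a cheap way to catch ref()/source() mistakes before executing anything against a warehouse.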

Training details

Parameter            Value
Base model           unsloth/Qwen3-4B-Instruct-2507-unsloth-bnb-4bit
Training framework   Unsloth + TRL SFTTrainer
Method               QLoRA (4-bit quantized base + LoRA adapters)
LoRA rank            32
LoRA alpha           32
Learning rate        2e-4
Max sequence length  2048 tokens
Hardware             Google Colab T4 GPU (15 GB VRAM)
Training dataset     tdelard/text_to_dbt
Train split          900 examples
Eval split           100 examples

Training dataset pipeline

The training data was built from scratch using a synthetic generation pipeline:

  1. Source: ~1 000 SQL queries sampled from b-mc2/sql-create-context, filtered and scored on 24 structural complexity features (table count, join depth, aggregation, subqueries…).
  2. Generation: Each SQL query was transformed into a multi-file dbt DAG by Claude Sonnet via structured prompting.
  3. Validation: Every generated DAG was validated with dbt parse (no database required), catching ref/source resolution errors and YAML issues. Only passing DAGs were kept.
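Step 1 above can be sketched as a simple heuristic scorer. The feature set and thresholds here are hypothetical stand-ins (the real pipeline scores 24 structural features):

```python
import re

def complexity_score(sql: str) -> int:
    """Count a few structural features of a SQL query (illustrative subset)."""
    s = sql.lower()
    features = {
        "tables": len(re.findall(r"\bfrom\b|\bjoin\b", s)),
        "joins": len(re.findall(r"\bjoin\b", s)),
        "aggregates": len(re.findall(r"\b(sum|count|avg|min|max)\s*\(", s)),
        "subqueries": s.count("(select"),
        "group_by": len(re.findall(r"\bgroup\s+by\b", s)),
    }
    return sum(features.values())

def keep(sql: str, lo: int = 2, hi: int = 15) -> bool:
    """Filter out trivial and pathologically complex queries (hypothetical bounds)."""
    return lo <= complexity_score(sql) <= hi
```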

Limitations

  • Context window is 2 048 tokens — very large schemas or highly complex queries may be truncated.
  • The model was trained on single-question → single-DAG examples; multi-model or incremental dbt patterns are not covered.
  • Output quality degrades on schemas with many tables (> 8–10); use the intermediate layer to break up the complexity.
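Given the 2048-token context, it can help to sanity-check prompt size before calling the model. A rough sketch using a chars-per-token heuristic (the real tokenizer will count differently; treat the numbers as approximate):

```python
def rough_token_count(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English/SQL text.
    return max(1, len(text) // 4)

def fits_context(system: str, user: str, n_ctx: int = 2048,
                 reserve_for_output: int = 1024) -> bool:
    """Check that the prompt leaves room for the generated DAG.

    If this returns False, trim the SQL schema to only the tables the
    business question actually needs.
    """
    used = rough_token_count(system) + rough_token_count(user)
    return used + reserve_for_output <= n_ctx
```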

License

Apache 2.0 — same as the base Qwen3-4B model.


Fine-tuned and converted to GGUF using Unsloth.
