# Overview

This model was fine-tuned from unsloth/Qwen3.5-4B on the
Shumatsurontek/neo-sql-reasoning-combined dataset using Supervised Fine-Tuning (SFT)
with LoRA adapters, merged into the base weights for easy deployment.

- **Developed by:** Shumatsurontek
- **Fine-tuning pipeline:** neo-deep-agent-lab
- **Engine:** Unsloth
- **Compute:** NVIDIA L40S on Modal serverless GPUs
- **Precision:** bf16
- **License:** Apache 2.0
# Training Details

## Configuration
| Parameter | Value |
|---|---|
| Base model | unsloth/Qwen3.5-4B |
| Method | SFT + LoRA (bf16, merged) |
| Learning rate | 2e-04 (cosine schedule, 5% warmup) |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 (α/r = 32/16) |
| Batch size | 4 per GPU |
| Max sequence length | 2048 |
| Epochs | 2 |
## Results
| Metric | Value |
|---|---|
| Final training loss | 0.5880 |
| Total optimization steps | 1,914 |
| Warmup | 5% (linear) |
| LR scheduler | cosine → 0 |
| Optimizer | AdamW 8-bit |
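The learning-rate schedule in the tables above (linear warmup over the first 5% of steps, then cosine decay to 0) can be sketched in a few lines. This is an illustrative reimplementation, not the trainer's actual code; the step count and peak LR are taken from the tables.

```python
import math

def lr_at_step(step: int, total_steps: int = 1914,
               peak_lr: float = 2e-4, warmup_frac: float = 0.05) -> float:
    """Linear warmup for the first 5% of steps, then cosine decay to 0."""
    warmup_steps = int(total_steps * warmup_frac)  # 95 of 1,914 steps
    if step < warmup_steps:
        return peak_lr * step / warmup_steps       # linear ramp-up
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))  # cosine decay

print(lr_at_step(0))     # 0.0 at the first step
print(lr_at_step(95))    # peak 2e-4 at the end of warmup
print(lr_at_step(1914))  # decays to ~0 at the final step
```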
# Dataset

**Shumatsurontek/neo-sql-reasoning-combined**
Each sample follows a 3-turn chat format:

```text
System:    You are a SQL expert. Given a database schema and a
           natural language question, generate the correct SQL query.
User:      Schema: CREATE TABLE orders (id INT, total DECIMAL);
           Question: What is the total revenue?
Assistant: SELECT SUM(total) FROM orders;
```
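A minimal sketch of how one row maps onto this 3-turn format. The helper name and the `schema`/`question`/`sql` field names are this card's illustration, not the dataset's documented column names:

```python
SYSTEM_PROMPT = (
    "You are a SQL expert. Given a database schema and a "
    "natural language question, generate the correct SQL query."
)

def to_chat_sample(schema: str, question: str, sql: str) -> list[dict]:
    """Build the 3-turn system/user/assistant message list for one row."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Schema: {schema}\nQuestion: {question}"},
        {"role": "assistant", "content": sql},
    ]

sample = to_chat_sample(
    "CREATE TABLE orders (id INT, total DECIMAL);",
    "What is the total revenue?",
    "SELECT SUM(total) FROM orders;",
)
```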
# Quickstart

## Transformers
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Shumatsurontek/Qwen3.5-4B-neo",
    torch_dtype=torch.bfloat16,  # the model was trained and merged in bf16
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Shumatsurontek/Qwen3.5-4B-neo")

messages = [
    {"role": "system", "content": "You are a SQL expert. Given a database schema and a natural language question, generate the correct SQL query."},
    {"role": "user", "content": "Schema: CREATE TABLE orders (id INT, user_id INT, total DECIMAL);\nQuestion: Total revenue per user?"},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the echoed prompt
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
## vLLM (OpenAI-compatible server)

```shell
vllm serve Shumatsurontek/Qwen3.5-4B-neo --trust-remote-code
```

```shell
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Shumatsurontek/Qwen3.5-4B-neo",
    "messages": [
      {"role": "system", "content": "You are a SQL expert."},
      {"role": "user", "content": "Schema: CREATE TABLE users (id INT, name TEXT);\nQuestion: List all users?"}
    ]
  }'
```
# Intended Use
This model is designed for text-to-SQL tasks: given a database schema and a natural-language question, it generates the corresponding SQL query. Best suited for analytical and read-only queries.
Out of scope: DDL/DML generation (CREATE, DROP, INSERT, UPDATE, DELETE), multi-database queries, or production use without human review of generated SQL.
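Given the human-review caveat above, a deliberately conservative guard that rejects anything other than plain SELECT statements before they reach a database can be sketched as follows. The function name and keyword list are this card's suggestion, not part of the model or any library, and the check is intentionally strict (e.g. a column literally named `update` would be rejected):

```python
import re

# DDL/DML keywords this model card lists as out of scope
FORBIDDEN = {"CREATE", "DROP", "INSERT", "UPDATE", "DELETE",
             "ALTER", "TRUNCATE", "GRANT"}

def is_read_only(sql: str) -> bool:
    """Accept only statements that start with SELECT and contain no DDL/DML keyword."""
    tokens = set(re.findall(r"[A-Za-z_]+", sql.upper()))
    return sql.strip().upper().startswith("SELECT") and not (tokens & FORBIDDEN)

print(is_read_only("SELECT SUM(total) FROM orders;"))  # True
print(is_read_only("DROP TABLE orders;"))              # False
```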
# Benchmark Results

Evaluated against the baseline unsloth/Qwen3.5-4B with lm-eval-harness on an NVIDIA L40S, using 50 samples per task.
| Benchmark | Baseline | Finetuned | Delta |
|---|---|---|---|
| MMLU | 77.1 | 74.7 | 🔴 -2.4 |
| MMLU: STEM | 73.5 | 71.8 | 🔴 -1.7 |
| MMLU: Humanities | 76.5 | 73.4 | 🔴 -3.1 |
| MMLU: Social Sciences | 83.0 | 79.7 | 🔴 -3.3 |
| MMLU: Other | 77.7 | 75.8 | 🔴 -1.9 |
| HellaSwag | 48.0 | 52.0 | 🟢 +4.0 |
| ARC-Challenge | 60.0 | 60.0 | ⚪ 0.0 |
# Citation

```bibtex
@misc{Qwen3_5_4B_neo,
  title  = {Qwen3.5-4B-neo},
  author = {Shumatsurontek},
  year   = {2026},
  url    = {https://huggingface.co/Shumatsurontek/Qwen3.5-4B-neo}
}
```
# License

Apache 2.0