Qwen3-4B Function Calling (Fine-Tuned) — Merged Model

This is the fully merged model (base weights and LoRA adapter combined into a single checkpoint), a fine-tuned version of Qwen/Qwen3-4B-Instruct-2507 for function calling and tool use.

Fine-tuned by Prabhu Nithin Gollapudi on the xLAM Function Calling 60K dataset using QLoRA on an NVIDIA RTX 3060 Laptop GPU.

If you prefer loading a LoRA adapter on top of the original base model, see the standalone adapter: prabhu-nithin/qwen3-4b-xlam-function-calling-60k-lora

What this model does

The model decides whether to call a tool or answer directly. When a tool call is needed, it emits a structured JSON payload wrapped in <tool_call> tags. Once the tool result is passed back inside <tool_response> tags, it produces a natural-language final answer.

User Query → Model → <tool_call>{"name": "fn", "arguments": {...}}</tool_call>
                         ↓
                   Execute Python function
                         ↓
                   <tool_response>{"result": ...}</tool_response>
                         ↓
                   Model → Final Answer
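
To act on the flow above, the caller has to pull the JSON payload out of the generated text. The following is a minimal sketch of that step, not the author's own parsing code: the helper name parse_tool_calls is hypothetical, and it assumes each payload is a JSON object with "name" and "arguments" keys, as in the diagram.

```python
import json
import re

# Matches one <tool_call>...</tool_call> block; DOTALL lets the JSON span several lines.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def parse_tool_calls(text):
    """Return every well-formed tool call found in the generated text.

    Hypothetical helper. Blocks that are not JSON objects with a "name" key
    are skipped, so an empty list means the model answered directly.
    """
    calls = []
    for raw in TOOL_CALL_RE.findall(text):
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed JSON: ignore this block
        if isinstance(payload, dict) and "name" in payload:
            calls.append(
                {"name": payload["name"], "arguments": payload.get("arguments", {})}
            )
    return calls


generated = '<tool_call>\n{"name": "get_weather", "arguments": {"city": "Paris"}}\n</tool_call>'
print(parse_tool_calls(generated))
```

An empty result is the signal to treat the generated text as the final answer rather than as a tool request.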

Quick start

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import json

model_id = "prabhu-nithin/qwen3-4b-xlam-function-calling-60k"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

tool_definitions = [
    {
        "name": "get_weather",
        "description": "Get current weather information for a given city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "The city name"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"], "default": "celsius"}
            },
            "required": ["city"]
        }
    }
]

system_prompt = f"""You are a helpful assistant.

# Tools

You may call one or more functions to assist with the user query.

You are provided with function signatures within <tools></tools> XML tags:
<tools>
{json.dumps(tool_definitions, indent=2)}
</tools>

For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{{"name": "<function-name>", "arguments": <args-json-object>}}
</tool_call>"""

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "What's the weather like in Paris?"}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# do_sample=True is set explicitly so the sampling parameters below take effect
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.8, top_k=20)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
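
The quick start stops at the first generation. The second half of the loop (run the tool, hand the result back) might look like the sketch below. Everything in it is illustrative: get_weather is a stub returning made-up data, TOOL_REGISTRY, run_tool and extend_with_tool_result are hypothetical names, and wrapping the result in <tool_response> tags inside a user turn is an assumption about how the chat template represents tool output.

```python
import json


def get_weather(city, unit="celsius"):
    # Hypothetical stub with placeholder data; a real tool would query a weather service.
    return {"city": city, "temperature": 18, "unit": unit}


# Hypothetical registry mapping tool names to Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool(call, registry=TOOL_REGISTRY):
    """Execute one parsed tool call and return a JSON-serializable dict."""
    fn = registry.get(call["name"])
    if fn is None:
        return {"error": f"unknown tool: {call['name']}"}
    try:
        return {"result": fn(**call["arguments"])}
    except TypeError as exc:  # wrong, missing, or non-dict arguments
        return {"error": str(exc)}


def extend_with_tool_result(messages, assistant_text, result):
    """Return a new message list with the model's tool call and the tool output appended."""
    tool_turn = "<tool_response>\n" + json.dumps(result) + "\n</tool_response>"
    return messages + [
        {"role": "assistant", "content": assistant_text},
        {"role": "user", "content": tool_turn},
    ]


# A tool call as the model would emit it, already parsed into a dict.
call = {"name": "get_weather", "arguments": {"city": "Paris"}}
result = run_tool(call)
follow_up = extend_with_tool_result(
    [{"role": "user", "content": "What's the weather like in Paris?"}],
    '<tool_call>\n' + json.dumps(call) + '\n</tool_call>',
    result,
)
print(follow_up[-1]["content"])
```

Passing the extended message list through tokenizer.apply_chat_template and model.generate again, exactly as in the quick start, should then yield the final answer. Returning errors as data rather than raising lets the model see a failed call and recover.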

Training procedure

This model was trained with SFT (Supervised Fine-Tuning) using QLoRA (4-bit quantization + LoRA adapters). Training took approximately 18 hours on an NVIDIA RTX 3060 Laptop GPU.

Dataset

Source: Salesforce/xlam-function-calling-60k
Split: 95% train / 5% eval
Format: Converted to Qwen3's chat template with <tool_call> / <tool_response> tokens
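
The conversion script itself is not part of this card. As a rough sketch of what such a conversion could look like, the function below turns one dataset record into chat messages. It assumes each record carries a query string plus tools and answers fields stored as JSON-encoded strings; xlam_to_messages is a hypothetical name, and the system prompt simply mirrors the one in the quick start.

```python
import json


def xlam_to_messages(record):
    """Convert one xLAM-style record into system/user/assistant chat messages.

    Assumes record["tools"] and record["answers"] are JSON-encoded strings.
    """
    tools = json.loads(record["tools"])
    answers = json.loads(record["answers"])

    system = (
        "You are a helpful assistant.\n\n# Tools\n\n"
        "You may call one or more functions to assist with the user query.\n\n"
        "You are provided with function signatures within <tools></tools> XML tags:\n"
        "<tools>\n" + json.dumps(tools, indent=2) + "\n</tools>"
    )
    # One <tool_call> block per expected call, in dataset order.
    assistant = "\n".join(
        "<tool_call>\n"
        + json.dumps({"name": a["name"], "arguments": a["arguments"]})
        + "\n</tool_call>"
        for a in answers
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": record["query"]},
        {"role": "assistant", "content": assistant},
    ]


# Made-up record in the assumed layout.
example = {
    "query": "What's the weather like in Paris?",
    "tools": json.dumps([{"name": "get_weather", "description": "Get weather", "parameters": {}}]),
    "answers": json.dumps([{"name": "get_weather", "arguments": {"city": "Paris"}}]),
}
for message in xlam_to_messages(example):
    print(message["role"], "->", message["content"][:60])
```

The resulting message lists can be rendered with the tokenizer's chat template before training; the 95% / 5% split would be applied to the converted examples.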

Hyperparameters

Parameter	Value
Base model	Qwen/Qwen3-4B-Instruct-2507
Quantization	4-bit (QLoRA)
LoRA rank (r)	16
LoRA alpha	32
LoRA dropout	0.05
LoRA target modules	q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Epochs	3
Learning rate	2e-4
LR scheduler	cosine
Warmup ratio	0.05
Batch size (per device)	1
Gradient accumulation steps	8 (effective batch size = 8)
Weight decay	0.01
Optimizer	paged_adamw_8bit
Max sequence length	2048
Precision	bf16
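
For a sense of scale, the table implies roughly the following training schedule. This is back-of-the-envelope arithmetic, not logged output: it assumes all 60,000 dataset examples were used, a single GPU, no sequence packing, and no examples dropped by the 2048-token limit.

```python
import math

dataset_size = 60_000   # assumption: the full 60K dataset, nothing filtered out
eval_size = dataset_size * 5 // 100        # 5% eval split
train_size = dataset_size - eval_size      # 95% train split

per_device_batch = 1
grad_accum_steps = 8
epochs = 3
warmup_ratio = 0.05
lora_r, lora_alpha = 16, 32

effective_batch = per_device_batch * grad_accum_steps        # 8
steps_per_epoch = math.ceil(train_size / effective_batch)    # 7125
total_steps = steps_per_epoch * epochs                       # 21375
warmup_steps = math.ceil(warmup_ratio * total_steps)         # 1069
lora_scale = lora_alpha / lora_r                             # 2.0, the LoRA update multiplier

print(train_size, eval_size)
print(effective_batch, steps_per_epoch, total_steps, warmup_steps)
print(lora_scale)
```

Under these assumptions, about 21,375 optimizer steps in roughly 18 hours works out to around 3 seconds per step, ignoring time spent on evaluation.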

Hardware

Setting	Value
GPU	NVIDIA RTX 3060 Laptop
VRAM	~8 GB used (QLoRA 4-bit)
Training time	~18 hours

Framework versions

TRL: 0.29.0
Transformers: 5.3.0
PyTorch: 2.5.1+cu121
Datasets: 4.6.1
Tokenizers: 0.22.2

License

Code: MIT License
Base Model: Apache 2.0
Dataset: CC-BY-4.0
