# Agent SFT Model (ALFWorld & DBBench)
This repository provides a LoRA adapter for Qwen/Qwen3-4B-Instruct-2507, fine-tuned with Unsloth.
## Training Objective
This adapter is trained to improve multi-turn agent task performance on:
- ALFWorld: Household interaction tasks.
- DBBench: Complex SQL database operations.
## Training Configuration
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Max sequence length: 2048
- Epochs: 2
- Learning rate: 2e-06
- LoRA: r=64, alpha=128
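For reference, the hyperparameters above correspond roughly to a peft `LoraConfig` like the one below. This is a sketch: the `target_modules`, dropout, and bias settings are assumptions based on common Qwen LoRA setups and are not documented in this card.

```python
from peft import LoraConfig

# Sketch of the adapter configuration described above.
# target_modules is an assumption (typical Qwen attention/MLP projections);
# the modules actually used in training are not stated in this card.
lora_config = LoraConfig(
    r=64,                  # LoRA rank, as listed above
    lora_alpha=128,        # alpha = 2 * r, as listed above
    lora_dropout=0.0,      # assumption
    bias="none",           # assumption
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```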
## Training Data
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen3-4B-Instruct-2507"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base, torch_dtype=torch.float16, device_map="auto"
)
# Apply the LoRA adapter from this repository
model = PeftModel.from_pretrained(model, "nakotsuko13/qwen3-4b-nako13-agentbench-lora")
```
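Since the adapter targets multi-turn agent tasks, prompts should be rendered through the base model's chat template. A minimal sketch of building such a conversation follows; the system prompt and observation/action strings are illustrative placeholders, not actual training data.

```python
# Sketch: build a multi-turn agent-style conversation for the chat template.
# All message contents here are illustrative placeholders.
messages = [
    {"role": "system", "content": "You are an agent solving a household task."},
    {"role": "user", "content": "Observation: You are in the kitchen."},
    {"role": "assistant", "content": "Action: go to fridge 1"},
    {"role": "user", "content": "Observation: The fridge 1 is closed."},
]
# With the tokenizer loaded as above, this would be rendered via:
# prompt = tokenizer.apply_chat_template(
#     messages, tokenize=False, add_generation_prompt=True
# )
```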
## Model tree
- Base model: Qwen/Qwen3-4B-Instruct-2507
- This adapter: nakotsuko13/qwen3-4b-nako13-agentbench-lora