# qwen3-4b-sft-lora-v5-2-20260208-1513
A LoRA adapter fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using QLoRA (4-bit quantization, trained with Unsloth). This repository contains the LoRA adapter weights only; the base model must be loaded separately.
## Training Objective
Structured-output accuracy across JSON, YAML, XML, TOML, and CSV. The training loss is applied only to the final assistant output; chain-of-thought (CoT) tokens are masked out.
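A minimal sketch of the loss-masking scheme described above (illustrative only, not the repository's training code): positions belonging to the prompt and the CoT get the label `-100`, which PyTorch's cross-entropy loss ignores, so only the final structured output contributes to the loss.

```python
# Label positions set to -100 are ignored by torch.nn.CrossEntropyLoss.
IGNORE_INDEX = -100

def mask_labels(input_ids, output_start):
    """Mask every label before `output_start` (prompt + CoT tokens)."""
    return [IGNORE_INDEX if i < output_start else tok
            for i, tok in enumerate(input_ids)]

# Toy example: tokens 0-4 are prompt + CoT, tokens 5-7 are the final answer.
labels = mask_labels([10, 11, 12, 13, 14, 20, 21, 22], output_start=5)
# → [-100, -100, -100, -100, -100, 20, 21, 22]
```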
## Training Configuration
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Method: QLoRA (4-bit)
- Max sequence length: 512
- Epochs: 1
- Learning rate: 5e-06
- LoRA: r=64, alpha=128
- Weight decay: 0.05
- Warmup ratio: 0.1
- Effective batch size: 16
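For reference, the LoRA hyperparameters above could be expressed as keyword arguments for `peft.LoraConfig`. This is a sketch under assumptions: the `target_modules` list is the usual set for Qwen-style models and is not stated in the card; the actual Unsloth setup may differ.

```python
# Hypothetical peft.LoraConfig kwargs reflecting the listed hyperparameters.
# target_modules is an assumption (typical attention + MLP projections).
lora_kwargs = dict(
    r=64,
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

# The LoRA update is scaled by alpha / r before being added to the base weights.
scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]  # → 2.0
```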
## Usage (adapter_merge mode)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "yusei926/qwen3-4b-sft-lora-v5-2-20260208-1513"

# Load the base model first, then attach the LoRA adapter on top of it
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter)
```
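A minimal generation sketch continuing from the loading code above. The prompt and generation settings are illustrative, not from the model card:

```python
# Build a chat prompt with the tokenizer's chat template and generate.
messages = [
    {"role": "user",
     "content": "Convert to JSON: name=Alice, age=30"}  # illustrative prompt
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

To produce a standalone checkpoint instead of loading base + adapter at runtime, the adapter can be folded into the base weights with PEFT's `model.merge_and_unload()` and the result saved with `save_pretrained`.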
## Sources & Terms
Training data: u-10bei/structured_data_with_cot_dataset_512_v5 (dataset license: MIT).