qwen3-4b-sft-lora-v7-20260208-1749

LoRA adapter fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using QLoRA (4-bit, Unsloth).

This repository contains LoRA adapter weights only; the base model must be loaded separately.

Training Objective

Trained for structured-output accuracy (JSON / YAML / XML / TOML / CSV). The loss is applied only to the final assistant output; chain-of-thought (CoT) tokens are masked out of the loss.
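The masking scheme can be sketched as below. This is an illustrative assumption, not the actual training code: in Hugging Face-style training, masked positions receive the label -100, which cross-entropy loss ignores. The helper name and the marker position are hypothetical.

```python
# Illustrative sketch of CoT loss masking (assumed, not the card's actual code).
# Positions labeled -100 are ignored by PyTorch / HF cross-entropy loss, so
# only the final assistant answer contributes to the gradient.

IGNORE_INDEX = -100

def mask_cot_labels(input_ids, final_answer_start):
    """Copy input_ids into labels, masking everything before
    final_answer_start (prompt + chain-of-thought tokens)."""
    labels = list(input_ids)
    for i in range(min(final_answer_start, len(labels))):
        labels[i] = IGNORE_INDEX
    return labels

# Example: an 8-token sequence whose final answer begins at position 5.
labels = mask_cot_labels([101, 7, 8, 9, 10, 42, 43, 44], final_answer_start=5)
print(labels)  # [-100, -100, -100, -100, -100, 42, 43, 44]
```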

Training Configuration

  • Base model: Qwen/Qwen3-4B-Instruct-2507
  • Method: QLoRA (4-bit)
  • Max sequence length: 512
  • Epochs: 1
  • Learning rate: 7e-06
  • LoRA: r=64, alpha=128
  • Weight decay: 0.05
  • Warmup ratio: 0.1
  • Effective batch size: 16
  • NEFTune noise alpha: 5.0
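A hedged sketch of how these hyperparameters could map onto a peft/transformers setup. Only the listed values come from this card; `target_modules`, `lora_dropout`, and the per-device batch / gradient-accumulation split are assumptions.

```python
# Assumed configuration sketch; not the actual training script.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.0,  # not stated on the card; assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # assumed
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="qwen3-4b-sft-lora-v7",
    num_train_epochs=1,
    learning_rate=7e-6,
    weight_decay=0.05,
    warmup_ratio=0.1,
    per_device_train_batch_size=4,   # 4 x 4 accumulation = effective batch 16 (split assumed)
    gradient_accumulation_steps=4,
    neftune_noise_alpha=5.0,
)
```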

Usage (adapter_merge mode)

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "yusei926/qwen3-4b-sft-lora-v7-20260208-1749"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter)
# Merge the adapter weights into the base model (adapter_merge mode)
model = model.merge_and_unload()

Sources & Terms

Training data: u-10bei/structured_data_with_cot_dataset_512_v5. Dataset license: MIT.
