qwen3-4b-sft-lora-v7-20260208-1749

LoRA adapter fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using QLoRA (4-bit, Unsloth).

This repository contains LoRA adapter weights only; the base model must be loaded separately.

Training Objective

Trained for structured-output accuracy (JSON / YAML / XML / TOML / CSV). The loss is applied only to the final assistant output; chain-of-thought (CoT) tokens are masked out of the loss.
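The masking scheme can be sketched as below. This is an illustrative assumption, not the actual training code: in Hugging Face-style training, masked positions receive the label -100, which cross-entropy loss ignores. The helper name and the marker position are hypothetical.

```python
# Illustrative sketch of CoT loss masking (assumed, not the card's actual code).
# Positions labeled -100 are ignored by PyTorch / HF cross-entropy loss, so
# only the final assistant answer contributes to the gradient.

IGNORE_INDEX = -100

def mask_cot_labels(input_ids, final_answer_start):
    """Copy input_ids into labels, masking everything before
    final_answer_start (prompt + chain-of-thought tokens)."""
    labels = list(input_ids)
    for i in range(min(final_answer_start, len(labels))):
        labels[i] = IGNORE_INDEX
    return labels

# Example: an 8-token sequence whose final answer begins at position 5.
labels = mask_cot_labels([101, 7, 8, 9, 10, 42, 43, 44], final_answer_start=5)
print(labels)  # [-100, -100, -100, -100, -100, 42, 43, 44]
```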

Training Configuration

  • Base model: Qwen/Qwen3-4B-Instruct-2507
  • Method: QLoRA (4-bit)
  • Max sequence length: 512
  • Epochs: 1
  • Learning rate: 7e-06
  • LoRA: r=64, alpha=128
  • Weight decay: 0.05
  • Warmup ratio: 0.1
  • Effective batch size: 16
  • NEFTune noise alpha: 5.0
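A hedged sketch of how these hyperparameters could map onto a peft/transformers setup. Only the listed values come from this card; `target_modules`, `lora_dropout`, and the per-device batch / gradient-accumulation split are assumptions.

```python
# Assumed configuration sketch; not the actual training script.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.0,  # not stated on the card; assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # assumed
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="qwen3-4b-sft-lora-v7",
    num_train_epochs=1,
    learning_rate=7e-6,
    weight_decay=0.05,
    warmup_ratio=0.1,
    per_device_train_batch_size=4,   # 4 x 4 accumulation = effective batch 16 (split assumed)
    gradient_accumulation_steps=4,
    neftune_noise_alpha=5.0,
)
```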

Usage (adapter_merge mode)

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "yusei926/qwen3-4b-sft-lora-v7-20260208-1749"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter)
# Merge the adapter weights into the base model (adapter_merge mode)
model = model.merge_and_unload()

Sources & Terms

Training data: u-10bei/structured_data_with_cot_dataset_512_v5. Dataset license: MIT.
