Qwen3-4B Structured Transformation LoRA by Hirojie5310 (v1)

This repository provides a LoRA adapter fine-tuned from
Qwen3-4B-Instruct-2507 using QLoRA (4-bit) with Unsloth.

This adapter is specialized for structured data transformation tasks
(JSON / YAML / XML / CSV), with strict output formatting.

This repository contains LoRA adapter weights only.
The base model must be loaded separately.

Training Objective

This LoRA adapter is trained to improve structured output accuracy
for deterministic format conversion tasks such as:

  • JSON ↔ YAML
  • JSON ↔ XML
  • CSV → XML / YAML
  • Schema-constrained structured transformations
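As an illustration of the kind of deterministic conversion the adapter targets, the following stdlib-only sketch converts a JSON document into XML (the mapping rules here, e.g. wrapping list items in `<item>` tags, are illustrative assumptions, not the adapter's exact conventions):

```python
import json
import xml.etree.ElementTree as ET

def json_to_xml(obj, tag="root"):
    """Recursively map a JSON-compatible value onto an XML element tree."""
    elem = ET.Element(tag)
    if isinstance(obj, dict):
        for key, value in obj.items():
            elem.append(json_to_xml(value, key))
    elif isinstance(obj, list):
        for item in obj:
            # Illustrative convention: list entries become <item> children.
            elem.append(json_to_xml(item, "item"))
    else:
        elem.text = str(obj)
    return elem

source = json.loads('{"user": {"name": "Ada", "id": 7}}')
xml_str = ET.tostring(json_to_xml(source), encoding="unicode")
print(xml_str)  # <root><user><name>Ada</name><id>7</id></user></root>
```

The fine-tuned model performs such conversions from a natural-language instruction rather than hand-written mapping code, but the input/output contract is the same: one structured document in, one strictly formatted document out.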

Training is performed with assistant-only loss:

  • Loss is applied only after explicit output markers (e.g. Final: or Output:).
  • Intermediate reasoning (Chain-of-Thought) is fully masked and not learned.

This design is intended to provide:

  • No leakage of reasoning text into final outputs
  • Clean, strictly formatted final outputs
  • Better generalization to unseen structured schemas
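The assistant-only masking described above can be sketched as follows: every token up to and including the output marker receives the label `-100`, which the cross-entropy loss ignores. The marker string and the whitespace "tokenizer" here are toy stand-ins for demonstration, not the training pipeline's actual tokenization:

```python
IGNORE_INDEX = -100  # label value ignored by cross-entropy loss in PyTorch
MARKER = "Final:"    # explicit output marker (the card also mentions "Output:")

def mask_labels(text, tokenize):
    """Return (input_ids, labels): everything up to and including the
    marker is masked out of the loss; only the final output is learned."""
    cut = text.index(MARKER) + len(MARKER)
    prefix_ids = tokenize(text[:cut])   # prompt + reasoning + marker
    output_ids = tokenize(text[cut:])   # final structured output
    input_ids = prefix_ids + output_ids
    labels = [IGNORE_INDEX] * len(prefix_ids) + output_ids
    return input_ids, labels

# Toy whitespace "tokenizer" (real training uses the model's tokenizer).
toy_tokenize = lambda s: s.split()
ids, labels = mask_labels('Think step by step... Final: {"a": 1}', toy_tokenize)
print(labels)
```

Only the tokens after `Final:` contribute to the gradient, so the chain-of-thought text is seen as context but never learned as a target.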

Training Configuration

  • Base model: Qwen3-4B-Instruct-2507
  • Fine-tuning method: QLoRA (4-bit) with Unsloth
  • Max sequence length: 512
  • Epochs: 1
  • Learning rate: 1e-6
  • Optimizer: AdamW
  • LoRA configuration:
    • Rank (r): 64
    • Alpha: 128
    • Dropout: 0
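The LoRA hyperparameters above correspond to a `peft` configuration roughly like the following sketch. The target modules are an assumption based on common Unsloth defaults for Qwen-family models; they are not stated in this card:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,               # LoRA rank
    lora_alpha=128,     # scaling factor (alpha / r = 2.0)
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
    # Assumed projection targets (not listed in this card):
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```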

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "Hirojie5310/your-lora-repo-name"

tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load the base model first; the LoRA weights are applied on top of it.
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)

model = PeftModel.from_pretrained(model, adapter)

# Example inference (prompt is illustrative):
messages = [{"role": "user", "content": 'Convert this JSON to YAML: {"a": 1}'}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))

Sources & Terms (IMPORTANT)

Training data is a mixed dataset constructed from the following sources:

  • u-10bei/structured_data_with_cot_dataset_512_v5 (MIT License)
  • daichira/structured-5k-mix-sft
  • daichira/structured-hard-sft-4k

These datasets are used in compliance with their respective licenses and terms of use.

This repository distributes LoRA adapter weights only. Users must comply with:

  • Each dataset's original license
  • The base model's original terms of use (Qwen/Qwen3-4B-Instruct-2507)