Qwen3-4B Structured Transformation LoRA by Hirojie5310 (v1)

This repository provides a LoRA adapter fine-tuned from
Qwen3-4B-Instruct-2507 using QLoRA (4-bit) with Unsloth.

This adapter is specialized for structured data transformation tasks
(JSON / YAML / XML / CSV), with strict output formatting.

This repository contains LoRA adapter weights only.
The base model must be loaded separately.

Training Objective

This LoRA adapter is trained to improve structured output accuracy
for deterministic format conversion tasks such as:

  • JSON ↔ YAML
  • JSON ↔ XML
  • CSV → XML / YAML
  • Schema-constrained structured transformations
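As an illustration of the kind of deterministic conversion the adapter targets, the following stdlib-only sketch converts a JSON document into XML (the mapping rules here, e.g. wrapping list items in `<item>` tags, are illustrative assumptions, not the adapter's exact conventions):

```python
import json
import xml.etree.ElementTree as ET

def json_to_xml(obj, tag="root"):
    """Recursively map a JSON-compatible value onto an XML element tree."""
    elem = ET.Element(tag)
    if isinstance(obj, dict):
        for key, value in obj.items():
            elem.append(json_to_xml(value, key))
    elif isinstance(obj, list):
        for item in obj:
            # Illustrative convention: list entries become <item> children.
            elem.append(json_to_xml(item, "item"))
    else:
        elem.text = str(obj)
    return elem

source = json.loads('{"user": {"name": "Ada", "id": 7}}')
xml_str = ET.tostring(json_to_xml(source), encoding="unicode")
print(xml_str)  # <root><user><name>Ada</name><id>7</id></user></root>
```

The fine-tuned model performs such conversions from a natural-language instruction rather than hand-written mapping code, but the input/output contract is the same: one structured document in, one strictly formatted document out.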

Training is performed with assistant-only loss:

  • Loss is applied only after explicit output markers (e.g. Final: or Output:).
  • Intermediate reasoning (Chain-of-Thought) is fully masked and not learned.

This design is intended to provide:

  • No leakage of reasoning text into final outputs
  • Clean, strictly formatted final outputs
  • Better generalization to unseen structured schemas
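The assistant-only masking described above can be sketched as follows: every token up to and including the output marker receives the label `-100`, which the cross-entropy loss ignores. The marker string and the whitespace "tokenizer" here are toy stand-ins for demonstration, not the training pipeline's actual tokenization:

```python
IGNORE_INDEX = -100  # label value ignored by cross-entropy loss in PyTorch
MARKER = "Final:"    # explicit output marker (the card also mentions "Output:")

def mask_labels(text, tokenize):
    """Return (input_ids, labels): everything up to and including the
    marker is masked out of the loss; only the final output is learned."""
    cut = text.index(MARKER) + len(MARKER)
    prefix_ids = tokenize(text[:cut])   # prompt + reasoning + marker
    output_ids = tokenize(text[cut:])   # final structured output
    input_ids = prefix_ids + output_ids
    labels = [IGNORE_INDEX] * len(prefix_ids) + output_ids
    return input_ids, labels

# Toy whitespace "tokenizer" (real training uses the model's tokenizer).
toy_tokenize = lambda s: s.split()
ids, labels = mask_labels('Think step by step... Final: {"a": 1}', toy_tokenize)
print(labels)
```

Only the tokens after `Final:` contribute to the gradient, so the chain-of-thought text is seen as context but never learned as a target.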

Training Configuration

  • Base model: Qwen3-4B-Instruct-2507
  • Fine-tuning method: QLoRA (4-bit) with Unsloth
  • Max sequence length: 512
  • Epochs: 1
  • Learning rate: 1e-6
  • Optimizer: AdamW
  • LoRA configuration:
    • Rank (r): 64
    • Alpha: 128
    • Dropout: 0
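The LoRA hyperparameters above correspond to a `peft` configuration roughly like the following sketch. The target modules are an assumption based on common Unsloth defaults for Qwen-family models; they are not stated in this card:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,               # LoRA rank
    lora_alpha=128,     # scaling factor (alpha / r = 2.0)
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
    # Assumed projection targets (not listed in this card):
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```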

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "Hirojie5310/your-lora-repo-name"

tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load the base model first; the LoRA weights are applied on top of it.
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)

model = PeftModel.from_pretrained(model, adapter)

# Example inference (prompt is illustrative):
messages = [{"role": "user", "content": 'Convert this JSON to YAML: {"a": 1}'}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))

Sources & Terms (IMPORTANT)

Training data is a mixed dataset constructed from the following sources:

  • u-10bei/structured_data_with_cot_dataset_512_v5 (MIT License)
  • daichira/structured-5k-mix-sft
  • daichira/structured-hard-sft-4k

These datasets are used in compliance with their respective licenses and terms of use.

This repository distributes LoRA adapter weights only. Users must comply with:

  • Each dataset's original license
  • The base model's original terms of use (Qwen/Qwen3-4B-Instruct-2507)