qwen3-4b-lora-y-v42

English

LoRA adapter fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using QLoRA (4-bit, Unsloth).

This repository contains adapter weights only. Please load the base model separately.

Training Objective

Improve structured-output accuracy for JSON / YAML / XML / TOML / CSV.

Loss is applied only to the final assistant response. Intermediate reasoning (Chain-of-Thought) is masked during training.
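The masking described above can be sketched as follows. This is a minimal illustration, not the training script used for this adapter: the helper name `mask_labels`, the `response_start` index, and the token ids are all hypothetical. It relies only on the convention that PyTorch's cross-entropy loss skips targets equal to -100 by default.

```python
IGNORE_INDEX = -100  # target value ignored by PyTorch's cross-entropy loss by default


def mask_labels(input_ids, response_start):
    """Copy input_ids into labels, masking every token before response_start.

    response_start is assumed to be the index of the first token of the
    final assistant response (how it is located is not shown here).
    """
    if not 0 <= response_start <= len(input_ids):
        raise ValueError("response_start out of range")
    return [IGNORE_INDEX] * response_start + list(input_ids[response_start:])


# Hypothetical token ids: 4 prompt/reasoning tokens, then 3 final-answer tokens.
input_ids = [11, 12, 13, 14, 21, 22, 23]
labels = mask_labels(input_ids, response_start=4)
print(labels)  # [-100, -100, -100, -100, 21, 22, 23]
```

Only the last three positions contribute to the loss in this example; the prompt and any intermediate reasoning tokens are excluded.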

Training Configuration

Base model: Qwen/Qwen3-4B-Instruct-2507
Method: QLoRA (4-bit, Unsloth)
Dataset: merged_dataset_final_clean_v41.jsonl
Max sequence length: 512
Epochs: 2
Learning rate: 4.15e-06
Warmup ratio: 0.061847816577080154
Weight decay: 0.00023098994964083745
LoRA: r=128, alpha=128, dropout=0.0
Gradient accumulation: 4
Target modules: q_proj,k_proj,v_proj,o_proj,gate_proj,up_proj,down_proj
Seed: 3248
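
For readers reproducing the setup, the LoRA values above could be collected as shown below. This is a sketch under assumptions: the key names mirror the parameters of `peft.LoraConfig`, which may not match the actual training script, and the scaling formula assumes standard (not rank-stabilized) LoRA, where the low-rank update is multiplied by alpha / r.

```python
# Values copied from the configuration list above. The key names follow
# peft.LoraConfig's parameter names, which is an assumption about the script.
lora_kwargs = {
    "r": 128,
    "lora_alpha": 128,
    "lora_dropout": 0.0,
    "target_modules": "q_proj,k_proj,v_proj,o_proj,gate_proj,up_proj,down_proj".split(","),
}


def lora_scaling(r, lora_alpha):
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return lora_alpha / r


scaling = lora_scaling(lora_kwargs["r"], lora_kwargs["lora_alpha"])
print(scaling)  # 1.0
```

With r and alpha both set to 128, the adapter update is applied at a scale of 1.0 under this assumption.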

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen3-4B-Instruct-2507"    # base model, loaded separately
adapter = "yamaTK/qwen3-4b-lora-y-v42"  # this repository (adapter weights only)

# Load the tokenizer and the base model first.
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.float16,
    device_map="auto",
)
# Attach the LoRA adapter on top of the base model.
model = PeftModel.from_pretrained(model, adapter)
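
Since the adapter targets structured output, it can be useful to check that a generated string actually parses. The helper below is a minimal sketch for the JSON case only, using the standard library; the function name `parse_json_output` and the sample strings are hypothetical and are not outputs of this model.

```python
import json


def parse_json_output(text):
    """Return the parsed object if text is valid JSON, otherwise None."""
    try:
        return json.loads(text.strip())
    except json.JSONDecodeError:
        return None


# Hypothetical strings standing in for model responses.
good = parse_json_output('{"name": "Alice", "age": 30}')
bad = parse_json_output('{"name": "Alice", "age": }')
print(good)  # {'name': 'Alice', 'age': 30}
print(bad)   # None
```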

Japanese

A LoRA adapter trained with QLoRA (4-bit, Unsloth) on top of Qwen/Qwen3-4B-Instruct-2507.

This repository contains only the adapter weights. Load the base model separately.

Training Objective

Trained to improve structured-output accuracy for JSON / YAML / XML / TOML / CSV.

The training loss is applied only to the final assistant output; intermediate reasoning (Chain-of-Thought) is masked.

Training Configuration

Base model: Qwen/Qwen3-4B-Instruct-2507
Method: QLoRA (4-bit, Unsloth)
Dataset: merged_dataset_final_clean_v41.jsonl
Max sequence length: 512
Epochs: 2
Learning rate: 4.15e-06
Warmup ratio: 0.061847816577080154
Weight decay: 0.00023098994964083745
LoRA: r=128, alpha=128, dropout=0.0
Gradient accumulation: 4
Target modules: q_proj,k_proj,v_proj,o_proj,gate_proj,up_proj,down_proj
Seed: 3248

Usage

Usage is the same as in the Usage section above.
