qwen3-4b-structured-sft-lora-v07-merged

Fully merged model (base + LoRA) fine-tuned from Qwen/Qwen3-4B-Instruct-2507.

v07 changes

  • SFT LR: 2e-6 → 2e-5 (10x the v03 value; matches the value reported by the guide's author)
  • Everything else is identical to v03 (dataset, MASK_COT=1, LoRA r=64, epochs=2)
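The v07 hyperparameters above could be expressed as follows. This is a minimal sketch assuming a TRL + PEFT training stack; the class names and fields are assumptions about the tooling, and only the values (LR 2e-5, 2 epochs, r=64, alpha=128) come from this card.

```python
# Hypothetical configuration fragment; not the actual training script.
from peft import LoraConfig
from trl import SFTConfig

peft_config = LoraConfig(
    r=64,                 # LoRA rank, unchanged from v03
    lora_alpha=128,
    task_type="CAUSAL_LM",
)
training_args = SFTConfig(
    learning_rate=2e-5,   # v07: raised 10x from v03's 2e-6
    num_train_epochs=2,
)
```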

Training Configuration

  • Base model: Qwen/Qwen3-4B-Instruct-2507
  • Method: QLoRA (4-bit) → merged
  • LR: 2e-05 / Epochs: 2 / LoRA: r=64, alpha=128
  • Dataset: u-10bei/structured_data_with_cot_dataset_512_v2 (3,933 examples)
  • MASK_COT: 1 (CoT kept in the input; excluded from the loss via label masking)
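The MASK_COT=1 setting keeps chain-of-thought tokens in the model input but excludes them from the training loss. A common way to do this is to copy the input ids into the labels and set the CoT span to -100, the ignore index used by PyTorch cross-entropy. The sketch below illustrates the idea with a toy helper (`mask_cot_labels` and the token ids are illustrative, not the training repo's code):

```python
# Ignore index recognized by torch.nn.CrossEntropyLoss: positions with
# this label contribute nothing to the loss.
IGNORE_INDEX = -100

def mask_cot_labels(input_ids, cot_start, cot_end):
    """Copy input_ids into labels, masking the [cot_start, cot_end) span."""
    labels = list(input_ids)
    for i in range(cot_start, min(cot_end, len(labels))):
        labels[i] = IGNORE_INDEX
    return labels

# Toy example: positions 2..5 hold the CoT tokens.
ids = [101, 7, 8, 9, 10, 11, 42, 102]
print(mask_cot_labels(ids, 2, 6))
# → [101, 7, -100, -100, -100, -100, 42, 102]
```

The model still attends to the CoT tokens when predicting later tokens, but gradients only flow from the unmasked (non-CoT) positions.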