UniFlow Qwen2.5-3B v3 (4-bit NF4 QLoRA)

A LoRA adapter fine-tuned for conversation summarization in retrieval-augmented (RAG) meeting-assistant systems.

Model Details

  • Base Model: Qwen/Qwen2.5-3B-Instruct
  • Quantization: 4-bit NF4 (QLoRA)
  • LoRA Rank: 16
  • LoRA Alpha: 32
  • Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Training Dataset: SAMSum (16,369) + DialogSum (13,460) = 28,982 examples
  • Training Epochs: 3
  • Learning Rate: 8e-5
  • Batch Size: 8 (effective 16 with gradient accumulation)
  • GPU Memory: 4.5 GB
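
The adapter hyperparameters above can be expressed as a `peft` `LoraConfig` plus a bitsandbytes quantization config. This is a plausible sketch of the training-time setup, not the authors' exact script; the compute dtype is an assumption not stated in the card.

```python
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit NF4 base-model quantization, as listed above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
)

# LoRA adapter settings from the table above (rank 16, alpha 32,
# adapters on all attention and MLP projections)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```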

Paper

Resource-Efficient Multi-Source RAG: Hexagonal Architecture and the 4-bit Quantization Paradox

  • Authors: Tunahan Buyukgebiz, Toprak Necat Gok, Suleyman Muhammed Arikan, Ibrahim Alper Dogru
  • Institution: Gazi University, Department of Computer Engineering
  • Target: Automated Software Engineering (ASE) Journal

Usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Load the base model in 4-bit NF4 to match the QLoRA training setup
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4")
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-3B-Instruct", quantization_config=bnb_config
)

# Attach the fine-tuned adapter and its tokenizer
model = PeftModel.from_pretrained(base_model, "tnecat/uniflow-qwen25-3b-v3")
tokenizer = AutoTokenizer.from_pretrained("tnecat/uniflow-qwen25-3b-v3")
```
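
For summarization requests, the model expects Qwen's chat format. The helper below sketches one way to build the message list; the system-prompt wording is illustrative, not taken from the paper.

```python
def build_summary_messages(dialogue: str) -> list[dict]:
    """Build a chat-format request asking the model to summarize a dialogue.

    The system/user wording here is an assumption; adapt it to your pipeline.
    """
    return [
        {"role": "system", "content": "You are a meeting assistant. "
                                      "Summarize the conversation concisely."},
        {"role": "user", "content": dialogue},
    ]

messages = build_summary_messages(
    "Alice: The deploy is done.\nBob: Great, closing the ticket."
)
# Pass through the tokenizer's chat template before generation:
# inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True,
#                                        return_tensors="pt")
# summary_ids = model.generate(inputs, max_new_tokens=128)
```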

Replication

Full source code and evaluation scripts: uniflow-system/ase-replication
