
vidavox/Qwen2.5-SKK-7B-SFT-LoRA

LoRA adapter for a Qwen2.5-7B Instruct base model, fine-tuned on SKK’s KSMI document data for domain-specific question answering.

This repository contains only the LoRA adapter weights. To use the model, you must load a compatible Qwen2.5-7B Instruct base model and then attach this adapter with PeftModel.from_pretrained.


Model Details

  • Base model (expected): Qwen/Qwen2.5-7B-Instruct (or another compatible Qwen2.5-7B Instruct variant)
  • Adapter type: LoRA via PEFT
  • Task: Supervised fine-tuning for assistant-style answers grounded in SKK’s KSMI document data.
  • Languages: Primarily Bahasa Indonesia and English in a technical / regulatory context.

This adapter is intended to be used in SKK’s internal systems for answering questions based on KSMI and related upstream oil & gas regulatory documents.


Usage (PEFT / LoRA)

⚠️ This repo does not include the base model. You must:

  1. Load the Qwen2.5-7B Instruct base model.
  2. Wrap it with PeftModel.from_pretrained using this adapter.

1. Install dependencies

pip install "transformers>=4.43.0" peft accelerate bitsandbytes

2. Load the base Qwen2.5-7B Instruct model

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_MODEL_ID = "Qwen/Qwen2.5-7B-Instruct"  # adjust if you used a different Qwen2.5-7B Instruct ID

tokenizer = AutoTokenizer.from_pretrained(
    BASE_MODEL_ID,
    trust_remote_code=True,
)

base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL_ID,
    device_map="auto",
    torch_dtype=torch.bfloat16,   # or "auto"
    trust_remote_code=True,
)

3. Attach the LoRA adapter with PeftModel

from peft import PeftModel

ADAPTER_ID = "vidavox/Qwen2.5-SKK-7B-SFT-LoRA"

model = PeftModel.from_pretrained(
    base_model,
    ADAPTER_ID,
    torch_dtype=torch.bfloat16,   # should match the base model dtype
)
model.eval()

PeftModel.from_pretrained loads the adapter configuration and weights and applies them to the provided base model.

4. Chat-style inference (Qwen2.5 chat template)

Qwen models use a chat template accessed via tokenizer.apply_chat_template.

messages = [
    {
        "role": "system",
        "content": "You are an assistant specialized in SKK KSMI documents.",
    },
    {
        "role": "user",
        # English: "Briefly explain the stages of the POD approval process under KSMI."
        "content": "Jelaskan secara ringkas tahapan proses persetujuan POD berdasarkan KSMI.",
    },
]

input_text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

model_inputs = tokenizer([input_text], return_tensors="pt").to(model.device)
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512,  # adjust to your latency / answer-length needs
    do_sample=False,     # greedy decoding; temperature/top_p only apply when do_sample=True
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
response = tokenizer.decode(output_ids, skip_special_tokens=True)
print(response)
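For reference, Qwen2.5's chat template renders messages in the ChatML format. The sketch below shows roughly what apply_chat_template produces as a string; it assumes the standard Qwen2.5 template, and in real code you should always use tokenizer.apply_chat_template rather than building prompts by hand:

```python
# Minimal sketch of the ChatML-style prompt that Qwen2.5's chat template
# renders (assumes the standard template; prefer apply_chat_template in practice).
def render_chatml(messages, add_generation_prompt=True):
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Cue the model to start its reply as the assistant role.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are an assistant specialized in SKK KSMI documents."},
    {"role": "user", "content": "Explain the POD approval process."},
]
print(render_chatml(messages))
```

Seeing the rendered string makes it clear why add_generation_prompt=True matters: without the trailing assistant header, the model may continue the user turn instead of answering.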

5. Optional: 4-bit loading for constrained VRAM

To run base + adapter on a single 24 GB GPU (e.g. RTX 3090), you can use 4-bit quantization with bitsandbytes:

from transformers import BitsAndBytesConfig, AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

BASE_MODEL_ID = "Qwen/Qwen2.5-7B-Instruct"
ADAPTER_ID = "vidavox/Qwen2.5-SKK-7B-SFT-LoRA"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)

tokenizer = AutoTokenizer.from_pretrained(
    BASE_MODEL_ID,
    trust_remote_code=True,
)

base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL_ID,
    device_map="auto",
    quantization_config=bnb_config,
    trust_remote_code=True,
)

model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
model.eval()

The generation code is then the same as in step 4.
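As a rough sanity check on why 4-bit loading fits in 24 GB: weight memory is approximately parameter count times bits per parameter. A back-of-the-envelope estimate, assuming roughly 7.6B parameters for Qwen2.5-7B and ignoring KV cache, activations, and quantization overhead (scales, zero-points):

```python
# Back-of-the-envelope weight-memory estimate. Real usage is higher:
# this ignores the KV cache, activations, and quantization metadata.
def weight_gib(n_params: float, bits_per_param: float) -> float:
    return n_params * bits_per_param / 8 / 1024**3

N = 7.6e9  # approximate parameter count of Qwen2.5-7B
print(f"bf16: {weight_gib(N, 16):.1f} GiB")  # ~14.2 GiB
print(f"nf4:  {weight_gib(N, 4):.1f} GiB")   # ~3.5 GiB
```

So bf16 weights alone consume most of a 24 GB card once the KV cache is added, while nf4 leaves ample headroom for long-context generation.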


Training Data

The adapter was trained on SKK’s internal KSMI document data, formatted as instruction-style examples.

  • Train split: 2223 examples
  • Validation split: 247 examples
  • Test split: 50 examples

The data consists of questions and instructions grounded in KSMI and related SKK upstream oil & gas regulations, with answers written to be faithful to the underlying documents. The dataset is private and not released with this model.
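The exact schema of the private dataset is not published. Purely as an illustration, instruction-style SFT data is commonly stored as JSONL chat records like the sketch below; the field names and contents here are hypothetical and may differ from the actual SKK KSMI data:

```python
import json

# Hypothetical JSONL record illustrating instruction-style SFT data.
# The real SKK KSMI schema is private and may differ.
record = {
    "messages": [
        {"role": "system", "content": "You are an assistant specialized in SKK KSMI documents."},
        {"role": "user", "content": "What approvals are required before POD submission?"},
        {"role": "assistant", "content": "According to KSMI, ..."},
    ]
}

# One JSON object per line; ensure_ascii=False preserves Indonesian text as-is.
line = json.dumps(record, ensure_ascii=False)
parsed = json.loads(line)
print(len(parsed["messages"]))  # 3
```

Records in this shape can be fed directly to tokenizer.apply_chat_template during SFT preprocessing.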


Evaluation (SDA on test data)

Evaluation was performed on the 50-sample test set using an SDA-style pipeline that combines automatic metrics (text/semantic similarity) and human-oriented quality scores.

Metric summary

Metric                    Mean (test)   Scale / note
BERTScore F1              0.840         0–1, higher = better semantic similarity
Correctness               4.88          1–10, higher = more logically correct answers
Completeness              4.14          1–10, higher = more required information covered
Factuality                6.28          1–10, higher = fewer factual errors
Structure                 7.00          1–10, higher = better organization / formatting
Hallucination resistance  6.34          1–10, higher = less hallucination

These metrics are computed on a small, domain-specific test set and should be interpreted as indicative of quality on KSMI-style questions, not as general-purpose benchmarks.
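The means in the table are simple averages over the 50 per-example scores. A minimal sketch of the aggregation, using made-up per-example values rather than the real evaluation data:

```python
from statistics import mean

# Toy per-example scores (illustrative only; not the real test-set data).
scores = {
    "bertscore_f1": [0.81, 0.86, 0.85],
    "factuality":   [6, 7, 6],
}

# Average each metric over the examples, as done for the table above.
summary = {metric: round(mean(vals), 3) for metric, vals in scores.items()}
print(summary)  # {'bertscore_f1': 0.84, 'factuality': 6.333}
```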


Intended Use (High-Level)

  • Primary use: Internal SKK assistant systems answering questions grounded in KSMI and related upstream O&G regulations.
  • Not intended for: General-purpose open-domain chat, safety-critical decision making, or use outside the domain without additional evaluation and alignment.