qlora_mistral_y02_V2 — Y02 Green Patent Classifier

This is a QLoRA fine-tuned adapter for mistralai/Mistral-7B-Instruct-v0.2, trained to classify patent claims as GREEN (Y02) or NOT GREEN.

It was developed as the Judge agent's brain in a three-agent multi-agent system (MAS) pipeline for Y02 green patent classification (M4 Final Assignment — AAU).


Model Details

| Property | Value |
|---|---|
| Base model | mistralai/Mistral-7B-Instruct-v0.2 |
| Adapter type | QLoRA (4-bit NF4) |
| LoRA rank | r=16, alpha=32 |
| Target modules | q, k, v, o, gate, up, down projections |
| Max sequence length | 512 tokens |
| Training epochs | 1 (full) |
| Effective batch size | 32 (per-device batch=8, grad_accum=4) |
| Learning rate | 2e-4 with cosine scheduler |
| Warmup ratio | 0.05 |
| Precision | bfloat16 |
| Hardware | NVIDIA L4 GPU |
| Trainer | SFTTrainer (trl 0.29.0) |
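The hyperparameters above map onto a peft/trl configuration roughly as follows. This is a sketch reconstructed from the table, not the actual training script; argument names follow recent peft/trl releases and the module names assume Mistral's standard projection layers.

```python
# Sketch of the training configuration implied by the table above.
# Assumes peft and trl are installed; not the original training script.
from peft import LoraConfig
from trl import SFTConfig

lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)

sft_cfg = SFTConfig(
    max_length=512,                 # max sequence length
    num_train_epochs=1,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,  # effective batch = 8 * 4 = 32
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    bf16=True,
)
```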

Training Data

- Source: patents_50k_green.parquet — 50,000 patent claims with Y02 silver labels derived from CPC codes
- Train split: 28,500 rows (train_silver, 95%)
- Eval split: 1,500 rows (5% held-out, stratified)
- Label balance: 50% GREEN / 50% NOT GREEN
- Prompt format: Mistral [INST]...[/INST] chat template
- Target output: strict JSON — {"is_green": 0 or 1, "rationale": "one sentence"}
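A training row rendered into the Mistral [INST] format with a JSON completion target might look like the sketch below. The instruction wording and claim are illustrative; the exact training prompt is not reproduced in this card.

```python
import json

def format_example(claim: str, is_green: int, rationale: str) -> str:
    """Render one training row in the Mistral [INST]...[/INST] format.

    The instruction text here is illustrative, not the exact prompt
    used during training.
    """
    instruction = (
        "Classify the following patent claim as GREEN (1) or NOT GREEN (0). "
        'Return STRICT JSON only: {"is_green": 0 or 1, "rationale": "one sentence"}\n\n'
        f"Patent claim:\n{claim}"
    )
    # The completion target is the strict-JSON verdict the model must emit.
    target = json.dumps({"is_green": is_green, "rationale": rationale})
    return f"<s>[INST] {instruction} [/INST] {target}</s>"

example = format_example(
    "A wind turbine blade with variable pitch control.",
    1,
    "Wind energy generation is a Y02E technology.",
)
```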

Training History

| Version | Epochs | Trainer | Result |
|---|---|---|---|
| V2 (this model) | 1.0 (full) | SFTTrainer | Stable — loss 0.87 → 0.83 |

V2 used the Mistral [INST] template and trained to completion, resuming from checkpoint-800 after an SSH disconnection at step 725/891.


Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

BASE_MODEL  = "mistralai/Mistral-7B-Instruct-v0.2"
ADAPTER_DIR = "qlora_mistral_y02_V2"

# 4-bit NF4 quantization, matching the training configuration
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, quantization_config=bnb, device_map="auto"
)
model = PeftModel.from_pretrained(base, ADAPTER_DIR)
model.eval()

claim = "A photovoltaic solar panel system for residential energy generation."

prompt = (
    "You are an expert patent examiner for Y02 green technology. "
    "Classify the following patent claim as GREEN (1) or NOT GREEN (0).\n\n"
    'Return STRICT JSON only: {"is_green": 0 or 1, "rationale": "one sentence"}\n\n'
    f"Patent claim:\n{claim}"
)

# Apply the Mistral [INST]...[/INST] chat template used during training
messages  = [{"role": "user", "content": prompt}]
formatted = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(formatted, return_tensors="pt").to(model.device)

with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=150,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens
response = tokenizer.decode(
    out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(response)
# {"is_green": 1, "rationale": "Photovoltaic system directly generates
#  renewable electricity, qualifying under Y02E 10/50."}
```
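Because the model is trained to emit strict JSON, downstream code can parse its response defensively. A minimal stdlib-only sketch (the function name and validation rules are illustrative, not part of the released pipeline): extracting the first `{...}` span guards against stray text around the JSON object.

```python
import json

def parse_verdict(response: str) -> dict:
    """Parse the model's strict-JSON verdict, tolerating stray text
    before or after the JSON object."""
    start = response.find("{")
    end = response.rfind("}")
    if start == -1 or end < start:
        raise ValueError(f"No JSON object found in: {response!r}")
    verdict = json.loads(response[start : end + 1])
    if verdict.get("is_green") not in (0, 1):
        raise ValueError(f"Invalid is_green value: {verdict.get('is_green')!r}")
    return verdict

v = parse_verdict(
    '{"is_green": 1, "rationale": "Photovoltaic generation is renewable."}'
)
```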

Role in MAS Pipeline

This adapter powers all three agents in the pipeline, with the Judge as its primary role:

Advocate (this model) → argues FOR green classification
Skeptic  (this model) → argues AGAINST green classification
Judge    (this model) → weighs both sides → final JSON verdict

All three agents share this same model and are differentiated only by role-specific prompts. Human-in-the-loop (HITL) review was triggered when confidence < 0.65 or deadlock=True.
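The HITL escalation rule above can be sketched as a small gate. The function name and signature are illustrative, not taken from the pipeline's actual code.

```python
# Sketch of the HITL escalation rule: escalate to a human reviewer when
# the Judge's confidence is below 0.65 or the debate deadlocked.
def needs_human_review(confidence: float, deadlock: bool,
                       threshold: float = 0.65) -> bool:
    return bool(deadlock or confidence < threshold)

low_conf = needs_human_review(0.60, False)   # below threshold
stuck    = needs_human_review(0.90, True)    # deadlock overrides confidence
clear    = needs_human_review(0.90, False)   # confident, no deadlock
```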


Framework Versions

| Library | Version |
|---|---|
| PEFT | 0.18.1 |
| TRL | 0.29.0 |
| Transformers | 4.57.6 |
| PyTorch | 2.9.1 |
| Datasets | 4.6.1 |
| Tokenizers | 0.22.2 |
