Model Card: Mistral-7B QLoRA Fine-Tuned on Green Patent Claims (Final Assignment)

Model Summary

This is a QLoRA fine-tuned adapter for mistralai/Mistral-7B-v0.1, adapted for domain-specific classification of patent claims as green technology (Y02) or not. It was developed as part of the Final Assignment in the Applied Deep Learning and AI course at Aalborg University.

The model serves two purposes in the final pipeline:

  1. Domain adaptation — learning the dense linguistic style of patent claims and the logic of Y02 classifications
  2. Judge agent — acting as the reasoning core of the Multi-Agent System (MAS) that labels 100 high-risk patent claims

Model Details

  • Developed by: Anders Sønderbý (as58zr@student.aau.dk)
  • Model type: Causal LLM with QLoRA adapter (PEFT)
  • Base model: mistralai/Mistral-7B-v0.1
  • Language: English
  • License: MIT
  • Task: Instruction-tuned binary classification — Green Technology (Y02) vs. Not Green

What This Model Does

Given the text of a patent claim formatted as an instruction prompt, the model completes the classification:

### Task: Classify the following patent claim as green technology (Y02) or not.

### Claim:
[patent claim text]

### Answer: YES / NO

In the MAS pipeline, the model is prompted as a Judge to produce structured JSON output weighing arguments from an Advocate and a Skeptic agent:

{"label": 0 or 1, "confidence": 0.0-1.0, "rationale": "2-3 sentence explanation"}
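Because the adapter was trained on the YES/NO completion format rather than on JSON (see Engineering Notes below), the MAS code needs a tolerant parser for the Judge's reply. A minimal sketch of such a fallback parser; `parse_judge_output` is an illustrative name, not the assignment's actual function:

```python
import json
import re

def parse_judge_output(raw: str) -> dict:
    """Parse the Judge's reply into {label, confidence, rationale}.

    Tries strict JSON first, then the first {...} span in the text,
    then per-field regexes as a last resort."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match:
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            pass
    # Last resort: pull out individual fields with regexes
    label = re.search(r'"label"\s*:\s*([01])', raw)
    conf = re.search(r'"confidence"\s*:\s*([0-9.]+)', raw)
    return {
        "label": int(label.group(1)) if label else None,
        "confidence": float(conf.group(1)) if conf else 0.0,
        "rationale": "",
    }
```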

How to Load This Model

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

BASE_MODEL = "mistralai/Mistral-7B-v0.1"
ADAPTER = "Anders-sonderby/mistral-7b-patent-qlora"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token  # Mistral defines no pad token

base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, ADAPTER)
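With the adapter attached, classification is a short greedy completion of the instruction prompt. A usage sketch continuing from the snippet above (the claim text and generation settings are illustrative, not the assignment's exact values):

```python
prompt = (
    "### Task: Classify the following patent claim as green technology (Y02) or not.\n\n"
    "### Claim:\nA mounting bracket for photovoltaic solar panels.\n\n"
    "### Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=3, do_sample=False)
# Decode only the newly generated tokens after the prompt
answer = tokenizer.decode(
    output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(answer.strip())  # expected to be "YES" or "NO"
```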

Training Pipeline

Fine-Tuning Approach: QLoRA

Rather than updating all 7 billion parameters, QLoRA freezes the base model weights in 4-bit precision and injects small trainable LoRA adapter matrices into the attention layers. Only ~0.5% of parameters are trained, drastically reducing memory requirements while preserving model quality.

Training Data

  • Source: train_silver.csv — 40,000 patent claims with silver labels derived from CPC Y02* codes
  • Label balance: ~50/50 (20,010 green, 19,990 not green)
  • Prompt format: Instruction-tuning format (### Task / ### Claim / ### Answer: YES/NO)
  • Split: 38,000 train / 2,000 eval (5% held out)

QLoRA Configuration

| Parameter | Value |
|---|---|
| Base model | mistralai/Mistral-7B-v0.1 |
| Quantization | 4-bit NF4 with double quantization |
| Compute dtype | bfloat16 |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| Target modules | q_proj, v_proj |
| LoRA dropout | 0.05 |
| Trainable parameters | ~0.5% of total |
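This configuration corresponds roughly to the following peft setup, applied on top of the 4-bit base model from the loading snippet above (a sketch of the setup, not the verbatim training script):

```python
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# LoRA configuration matching the table above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

# Freeze the quantized weights and inject the trainable adapters
model = prepare_model_for_kbit_training(base_model)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # ~0.5% of total
```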

Training Hyperparameters

| Parameter | Value |
|---|---|
| Epochs | 1 |
| Per-device batch size | 4 |
| Gradient accumulation steps | 4 (effective batch size: 16) |
| Learning rate | 2e-4 |
| LR scheduler | Cosine |
| Warmup steps | 50 |
| Precision | bf16 |
| Hardware | NVIDIA L4 (24 GB VRAM) |
| Cluster | AAU AI-Lab (SLURM batch job) |
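These hyperparameters map onto the standard transformers TrainingArguments roughly as follows (`output_dir` and `logging_steps` are illustrative values, not taken from the assignment):

```python
from transformers import TrainingArguments

# Hyperparameters from the table above
training_args = TrainingArguments(
    output_dir="mistral-7b-patent-qlora",
    num_train_epochs=1,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,   # effective batch size 16
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=50,
    bf16=True,
    logging_steps=50,
)
```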

Evaluation Results (QLoRA Model)

| Metric | Value |
|---|---|
| Eval loss | 1.1804 |
| Eval samples/second | 10.8 |
| Eval runtime | 185 s |

Note: Eval loss reflects the causal language modelling objective on the held-out 2,000 samples. The model is evaluated on its ability to correctly complete the text following the ### Answer: marker.


Role in the Multi-Agent System (MAS)

This model acts as the Judge in a three-agent debate pipeline for labeling 100 high-risk patent claims:

| Agent | Model | Role |
|---|---|---|
| Advocate | microsoft/Phi-3-mini-4k-instruct (4-bit) | Argues for the Y02 green classification |
| Skeptic | microsoft/Phi-3-mini-4k-instruct (4-bit) | Challenges the classification, identifies greenwashing |
| Judge | Anders-sonderby/mistral-7b-patent-qlora | Weighs both arguments, produces final label + confidence + rationale |

Claims where the Judge's confidence fell below 0.70 were flagged for targeted human review (Exception-Based HITL), reducing manual effort compared to reviewing all 100 claims.
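The exception-based routing itself reduces to a single threshold check. A minimal sketch, assuming `verdicts` holds the parsed Judge outputs:

```python
# The 0.70 cutoff is the confidence threshold stated above
CONFIDENCE_THRESHOLD = 0.70

def split_for_review(verdicts):
    """Split Judge verdicts into auto-accepted and human-review queues."""
    auto, review = [], []
    for v in verdicts:
        if v["confidence"] < CONFIDENCE_THRESHOLD:
            review.append(v)   # flagged for targeted human review (HITL)
        else:
            auto.append(v)     # accepted without manual inspection
    return auto, review
```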


Downstream Impact on PatentSBERTa

The gold labels produced by this MAS pipeline were used to fine-tune PatentSBERTa for the final model version:

| Model Version | Training Data Source | F1 Score |
|---|---|---|
| 1. Baseline | Frozen Embeddings (No Fine-tuning) | 0.780 |
| 2. Assignment 2 | Fine-tuned on Silver + Gold (Simple LLM) | 0.818 |
| 3. Assignment 3 | Fine-tuned on Silver + Gold (MAS - CrewAI) | 0.824 |
| 4. Final Model | Fine-tuned on Silver + Gold (QLoRA MAS + Targeted HITL) | [Your F1 here] |

Engineering Notes

  • The instruction format (### Answer: YES/NO) creates a tension when prompting the Judge to output structured JSON — the model was not trained to produce confidence scores, requiring careful prompt engineering and a JSON fallback parser
  • Two models (Phi-3-mini + Mistral QLoRA) were loaded simultaneously on a single L4 GPU using 4-bit quantization and sequential CUDA cache clearing to stay within 24GB VRAM
  • The QLoRA adapter must be loaded via PeftModel.from_pretrained() on top of the frozen base model — it cannot be loaded as a standalone model

Intended Use

  • Primary use: Academic research and coursework in patent classification
  • Intended users: Course instructors and students at Aalborg University
  • Out-of-scope: Production patent classification, legal patent assessment, or any commercial use

Limitations

  • Trained for 1 epoch only — additional epochs would likely improve classification accuracy
  • The YES/NO instruction format does not produce confidence scores natively, making structured JSON output fragile at inference time
  • The 512-character truncation may lose relevant technical context in longer patent claims
  • No chain-of-thought reasoning was included in training, limiting the depth of the model's rationale generation

Repository

The full code, notebooks, and data files for this assignment are available in the course GitHub repository.
