# Model Card: Mistral-7B QLoRA Fine-Tuned on Green Patent Claims (Final Assignment)

## Model Summary
This is a QLoRA fine-tuned adapter for mistralai/Mistral-7B-v0.1, adapted for domain-specific classification of patent claims as green technology (Y02) or not. It was developed as part of the Final Assignment in the Applied Deep Learning and AI course at Aalborg University.
The model serves two purposes in the final pipeline:
- Domain adaptation — learning the dense linguistic style of patent claims and the logic of Y02 classifications
- Judge agent — acting as the reasoning core of the Multi-Agent System (MAS) that labels 100 high-risk patent claims
## Model Details
- Developed by: Anders Sønderbý (as58zr@student.aau.dk)
- Model type: Causal LLM with QLoRA adapter (PEFT)
- Base model: mistralai/Mistral-7B-v0.1
- Language: English
- License: MIT
- Task: Instruction-tuned binary classification — Green Technology (Y02) vs. Not Green
## What This Model Does
Given the text of a patent claim formatted as an instruction prompt, the model completes the classification:
```
### Task: Classify the following patent claim as green technology (Y02) or not.

### Claim:
[patent claim text]

### Answer: YES / NO
```
In the MAS pipeline, the model is prompted as a Judge to produce structured JSON output weighing arguments from an Advocate and a Skeptic agent:
```
{"label": 0 or 1, "confidence": 0.0-1.0, "rationale": "2-3 sentence explanation"}
```
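Because the base model was instruction-tuned on a YES/NO format rather than JSON, its Judge output can be wrapped in extra text or slightly malformed. A minimal fallback parser along the lines the Engineering Notes describe might look like this (an illustrative sketch; the pipeline's actual parser may differ):

```python
import json
import re

def parse_judge_output(raw: str) -> dict:
    """Extract the Judge's JSON verdict from raw model output.

    Falls back to per-field regex extraction when the model wraps the
    JSON in extra text or emits malformed output.
    """
    # First attempt: find the outermost {...} span and parse it directly
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match:
        try:
            verdict = json.loads(match.group(0))
            if {"label", "confidence", "rationale"} <= verdict.keys():
                return verdict
        except json.JSONDecodeError:
            pass
    # Fallback: pull out individual fields with regexes
    label = re.search(r'"label"\s*:\s*(\d)', raw)
    conf = re.search(r'"confidence"\s*:\s*([01](?:\.\d+)?)', raw)
    rationale = re.search(r'"rationale"\s*:\s*"([^"]*)"', raw)
    return {
        "label": int(label.group(1)) if label else None,
        "confidence": float(conf.group(1)) if conf else None,
        "rationale": rationale.group(1) if rationale else "",
    }
```

Returning `None` for unparseable fields lets downstream code treat a failed parse as a low-confidence verdict rather than crashing.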
## How to Load This Model
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

BASE_MODEL = "mistralai/Mistral-7B-v0.1"
ADAPTER = "Anders-sonderby/mistral-7b-patent-qlora"

# 4-bit NF4 quantization, matching the training-time QLoRA setup
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token  # Mistral defines no pad token

base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach the LoRA adapter on top of the frozen 4-bit base model
model = PeftModel.from_pretrained(base_model, ADAPTER)
```
## Training Pipeline

### Fine-Tuning Approach: QLoRA
Rather than updating all 7 billion parameters, QLoRA freezes the base model weights in 4-bit precision and injects small trainable LoRA adapter matrices into the attention layers. Only ~0.5% of parameters are trained, drastically reducing memory requirements while preserving model quality.
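As a sketch, the adapter injection described above maps onto a `peft` configuration like the following (values taken from the QLoRA Configuration table below; the application steps are shown commented, since they require the loaded base model):

```python
from peft import LoraConfig

# LoRA settings from the QLoRA Configuration table below
lora_config = LoraConfig(
    r=16,                                  # LoRA rank
    lora_alpha=32,                         # scaling factor
    target_modules=["q_proj", "v_proj"],   # attention projections only
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

# Applied to the 4-bit base model loaded as shown earlier:
# from peft import get_peft_model, prepare_model_for_kbit_training
# model = prepare_model_for_kbit_training(base_model)
# model = get_peft_model(model, lora_config)
# model.print_trainable_parameters()  # reports the ~0.5% trainable fraction
```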
### Training Data
- Source: `train_silver.csv`, 40,000 patent claims with silver labels derived from CPC Y02* codes
- Label balance: ~50/50 (20,010 green, 19,990 not green)
- Prompt format: instruction-tuning format (`### Task / ### Claim / ### Answer: YES/NO`)
- Split: 38,000 train / 2,000 eval (5% held out)
### QLoRA Configuration
| Parameter | Value |
|---|---|
| Base model | mistralai/Mistral-7B-v0.1 |
| Quantization | 4-bit NF4 with double quantization |
| Compute dtype | bfloat16 |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| Target modules | q_proj, v_proj |
| LoRA dropout | 0.05 |
| Trainable parameters | ~0.5% of total |
### Training Hyperparameters
| Parameter | Value |
|---|---|
| Epochs | 1 |
| Per device batch size | 4 |
| Gradient accumulation steps | 4 (effective batch size: 16) |
| Learning rate | 2e-4 |
| LR scheduler | Cosine |
| Warmup steps | 50 |
| Precision | bf16 |
| Hardware | NVIDIA L4 (24GB VRAM) |
| Cluster | AAU AI-Lab (SLURM batch job) |
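The hyperparameters in the table above map onto `transformers.TrainingArguments` roughly as follows (a sketch; `output_dir` and the logging settings are illustrative, not taken from the training code):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-7b-patent-qlora",  # illustrative path
    num_train_epochs=1,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,         # effective batch size: 4 * 4 = 16
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=50,
    bf16=True,
    logging_steps=50,                      # illustrative
)
```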
## Evaluation Results (QLoRA Model)
| Metric | Value |
|---|---|
| Eval loss | 1.1804 |
| Eval samples/second | 10.8 |
| Eval runtime | 185s |
Note: Eval loss reflects the causal language modelling objective on the 2,000 held-out samples. The model is evaluated on its ability to produce the correct YES/NO completion after the `### Answer:` marker.
## Role in the Multi-Agent System (MAS)
This model acts as the Judge in a three-agent debate pipeline for labeling 100 high-risk patent claims:
| Agent | Model | Role |
|---|---|---|
| Advocate | microsoft/Phi-3-mini-4k-instruct (4-bit) | Argues for Y02 green classification |
| Skeptic | microsoft/Phi-3-mini-4k-instruct (4-bit) | Challenges the classification, identifies greenwashing |
| Judge | Anders-sonderby/mistral-7b-patent-qlora | Weighs both arguments, produces final label + confidence + rationale |
Claims where the Judge's confidence fell below 0.70 were flagged for targeted human review (Exception-Based HITL), reducing manual effort compared to reviewing all 100 claims.
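The exception-based routing can be expressed as a simple filter (a sketch; the field names follow the Judge's JSON schema shown earlier):

```python
CONFIDENCE_THRESHOLD = 0.70

def route_verdicts(verdicts: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split Judge verdicts into auto-accepted and human-review queues.

    Verdicts below the confidence threshold, or with missing confidence
    (e.g. from a failed JSON parse), are flagged for targeted review.
    """
    accepted, flagged = [], []
    for v in verdicts:
        conf = v.get("confidence")
        if conf is None or conf < CONFIDENCE_THRESHOLD:
            flagged.append(v)
        else:
            accepted.append(v)
    return accepted, flagged
```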
## Downstream Impact on PatentSBERTa
The gold labels produced by this MAS pipeline were used to fine-tune PatentSBERTa for the final model version:
| Model Version | Training Data Source | F1 Score |
|---|---|---|
| 1. Baseline | Frozen Embeddings (No Fine-tuning) | 0.780 |
| 2. Assignment 2 | Fine-tuned on Silver + Gold (Simple LLM) | 0.818 |
| 3. Assignment 3 | Fine-tuned on Silver + Gold (MAS - CrewAI) | 0.824 |
| 4. Final Model | Fine-tuned on Silver + Gold (QLoRA MAS + Targeted HITL) | [Your F1 here] |
## Engineering Notes
- The instruction format (`### Answer: YES/NO`) creates tension when prompting the Judge to output structured JSON: the model was not trained to produce confidence scores, which required careful prompt engineering and a JSON fallback parser
- Two models (Phi-3-mini and the Mistral QLoRA Judge) were loaded simultaneously on a single L4 GPU, using 4-bit quantization and sequential CUDA cache clearing to stay within 24 GB of VRAM
- The QLoRA adapter must be loaded via `PeftModel.from_pretrained()` on top of the frozen base model; it cannot be loaded as a standalone model
## Intended Use
- Primary use: Academic research and coursework in patent classification
- Intended users: Course instructors and students at Aalborg University
- Out-of-scope: Production patent classification, legal patent assessment, or any commercial use
## Limitations
- Trained for 1 epoch only — additional epochs would likely improve classification accuracy
- The YES/NO instruction format does not produce confidence scores natively, making structured JSON output fragile at inference time
- The 512-character truncation may lose relevant technical context in longer patent claims
- No chain-of-thought reasoning was included in training, limiting the depth of the model's rationale generation
## Repository
The full code, notebooks, and data files for this assignment are available in the course GitHub repository.