Patent Green Technology Classifier (PatentSBERTa Fine-tuned)
A binary text classifier for detecting green/sustainable technology patent claims, built on top of AI-Growth-Lab/PatentSBERTa.
Model Description
This model was fine-tuned on a balanced dataset of 35,100 patent claims with gold-enhanced labels derived from a Human-in-the-Loop (HITL) workflow. It classifies patent claims as either green technology (1) or not green technology (0).
Training Data
- Base dataset: AI-Growth-Lab/patents_claims_1.5m_traim_test
- Silver labels: Derived from CPC Y02* codes (25,000 green + 25,000 not green)
- Gold labels: 100 examples labeled through an LLM-suggested, human-reviewed (HITL) workflow
- Dataset: alexchrander/patents-green-gold-dataset
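The silver-labeling rule above can be sketched as a simple prefix check on a claim's CPC codes. This is only an illustration: the function name `silver_label`, the input format (a list of CPC code strings per claim), and the sample codes are assumptions, not the author's actual preprocessing code.

```python
def silver_label(cpc_codes):
    """Return 1 (green) if any CPC code falls under Y02, else 0 (not green).

    Assumes each claim comes with a list of CPC code strings; the real
    dataset schema may differ.
    """
    return int(any(code.strip().upper().startswith("Y02") for code in cpc_codes))


# Illustrative claims with made-up identifiers and example CPC codes.
examples = {
    "claim_a": ["Y02E 10/50", "H01L 31/042"],
    "claim_b": ["G06F 17/30"],
}

labels = {claim_id: silver_label(codes) for claim_id, codes in examples.items()}
print(labels)
```

A rule like this is cheap to apply at scale, which is why the resulting labels are treated as silver rather than gold: the classification codes are a proxy for the claim text, not a per-claim judgment.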
Training procedure
Active Learning + HITL workflow:
- Trained a Logistic Regression baseline on frozen PatentSBERTa embeddings
- Applied uncertainty sampling to identify the 100 most uncertain examples
- Used Mistral-7B-Instruct-v0.2 to suggest labels with rationale
- Human reviewer assigned final gold labels, overriding the LLM in 6/100 cases
- Fine-tuned PatentSBERTa on the gold-enhanced dataset
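The uncertainty sampling step can be sketched as follows. The card does not say which uncertainty criterion was used, so this assumes the common choice for a binary classifier: pick the examples whose predicted probability of the green class is closest to 0.5. The function name `most_uncertain` and the sample probabilities are hypothetical.

```python
def most_uncertain(probs, k):
    """Return the indices of the k probabilities closest to 0.5.

    probs: predicted probability of the positive (green) class per example.
    Assumes distance from 0.5 as the uncertainty measure.
    """
    ranked = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return ranked[:k]


# Hypothetical baseline outputs for five claims.
probs = [0.97, 0.52, 0.10, 0.45, 0.71]
selected = most_uncertain(probs, k=2)
print(selected)  # [1, 3]
```

In the workflow described above, `k` would be 100 and the selected examples would be passed to the LLM and then to the human reviewer.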
Hyperparameters:
- max_seq_length: 256
- epochs: 1
- learning_rate: 2e-5
- batch_size: 16
Results
| Model | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|
| Baseline (frozen) | 0.77 | 0.77 | 0.77 | 0.77 |
| Fine-tuned (this model) | 0.81 | 0.81 | 0.81 | 0.81 |
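For reference, the four metrics in the table can be computed from binary predictions as in the sketch below. The card does not state whether the reported scores are for the green class or averaged over both classes, so this computes positive-class metrics as one possible reading. The function name `binary_metrics` and the sample labels are hypothetical and are not the evaluation data.

```python
def binary_metrics(y_true, y_pred):
    """Compute precision, recall, F1 (positive class) and accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Hypothetical balanced sample: 4 green, 4 not green, one error in each class.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this toy sample all four values come out equal, which is what happens on balanced data when false positives and false negatives occur in equal numbers.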
HITL Override Examples
The human reviewer overrode the LLM suggestion in 6 out of 100 cases, all from not green → green:
- A phosphate detection method in soil and groundwater — labeled green as it relates to monitoring agricultural and water contamination
- A substrate coating method reducing film thickness — labeled green as material reduction can be considered a sustainable practice
- A surfactant removal method using electrochemical oxidation — labeled green as surfactant removal is associated with clean chemistry
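The override logic can be sketched as a small merge step in which the human label is always final and disagreements with the LLM are counted. The record layout and the name `resolve_gold` are assumptions for illustration. The first two rows mirror examples listed above; the third is a hypothetical case where the reviewer agreed with the LLM.

```python
def resolve_gold(llm_label, human_label):
    """Return (gold_label, overridden); the human decision is final."""
    return human_label, human_label != llm_label


# (description, LLM suggestion, human label); 1 = green, 0 = not green.
reviews = [
    ("phosphate detection in soil and groundwater", 0, 1),
    ("substrate coating with reduced film thickness", 0, 1),
    ("hypothetical claim where the reviewer agreed", 1, 1),
]

gold = []
overrides = 0
for description, llm_label, human_label in reviews:
    label, overridden = resolve_gold(llm_label, human_label)
    gold.append(label)
    overrides += int(overridden)

print(gold, overrides)  # [1, 1, 1] 2
```

Applied to the full review set, the same count would give the 6 overrides out of 100 reported above.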
Video
https://panopto.aau.dk/Panopto/Pages/Viewer.aspx?id=a519a0b3-17e2-44a5-b956-b3f90160c1c5
Model tree for alexchrander/patent-sberta-green-finetuned
Base model
AI-Growth-Lab/PatentSBERTa