Model Card: PatentSBERTa Fine-Tuned on Green Patent Claims (Assignment 3)
Model Summary
This model is a fine-tuned version of AI-Growth-Lab/PatentSBERTa for binary classification of patent claims as green technology (Y02) or not. It was developed as part of Assignment 3 in the Applied Deep Learning and AI course at Aalborg University. Compared to Assignment 2, this model uses a more advanced Multi-Agent System (MAS) to generate higher-quality gold labels for the 100 high-risk claims before fine-tuning PatentSBERTa.
Model Details
- Developed by: Anders Sønderbý (as58zr@student.aau.dk)
- Model type: Sentence Transformer with classification head (binary)
- Base model: AI-Growth-Lab/PatentSBERTa
- Language: English
- License: MIT
- Task: Binary text classification — Green Technology (Y02) vs. Not Green
What This Model Does
Given the text of a patent claim, the model predicts whether the claim relates to green technology as defined by the CPC Y02 classification system. The output is a binary label:
- `1` — Green technology (Y02)
- `0` — Not green technology
Key Difference from Assignment 2
In Assignment 2, a single generic LLM was used to suggest labels for the 100 high-risk claims before human review. In Assignment 3, a Multi-Agent System (MAS) using CrewAI was used instead, where three specialised agents debated each claim before producing a final label. The hypothesis is that adversarial debate between agents produces higher-quality gold labels, which in turn produces a better fine-tuned PatentSBERTa model.
Training Pipeline Overview
Stage 1 & 2 — Setup (Same as Assignment 2)
The same patents_50k_green.parquet balanced 50k dataset was used. Uncertainty scores were recomputed from the Assignment 2 baseline model and the same top 100 high-risk claims (hitl_green_100.csv) were selected for labeling.
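The uncertainty-based selection of the 100 high-risk claims can be sketched as follows. This is a minimal illustration using binary entropy as the uncertainty score; the actual scoring in the assignment code may differ (e.g. margin-based sampling), and the function names here are illustrative:

```python
import math

def entropy_uncertainty(p_green: float) -> float:
    """Binary entropy of the baseline model's predicted green-class probability.
    Maximal (1.0) at p = 0.5, i.e. where the model is least certain."""
    if p_green in (0.0, 1.0):
        return 0.0
    return -(p_green * math.log2(p_green) + (1 - p_green) * math.log2(1 - p_green))

def select_high_risk(claim_ids, probs, k=100):
    """Return the k claim ids the baseline model is most uncertain about."""
    scored = sorted(zip(claim_ids, probs),
                    key=lambda cp: entropy_uncertainty(cp[1]),
                    reverse=True)
    return [cid for cid, _ in scored[:k]]

# Toy example: claims with probabilities near 0.5 are selected first.
print(select_high_risk(["c1", "c2", "c3", "c4"], [0.98, 0.51, 0.49, 0.10], k=2))
```

Applied to the full pool, this yields the top-100 high-uncertainty claims written to hitl_green_100.csv.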
Stage 3 — Multi-Agent Labeling (CrewAI)
Three agents debated each of the 100 high-risk patent claims:
| Agent | Role | Objective |
|---|---|---|
| Advocate | Green Patent Expert | Argue why the claim qualifies as Y02 green technology |
| Skeptic | Greenwashing Analyst | Challenge the Y02 classification and identify greenwashing |
| Judge | Senior Patent Examiner | Weigh both arguments and produce a final JSON label + rationale |
For each claim, the Judge produces: {"label": 0 or 1, "rationale": "2-3 sentence explanation"}
The LLM used for all three agents was groq/meta-llama/llama-4-scout-17b-16e-instruct via the Groq API.
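Before a Judge verdict is used as a label, its JSON output has to be validated. A minimal parsing sketch (the `label`/`rationale` fields follow the format above; the helper itself is illustrative, not the assignment's actual code):

```python
import json

def parse_judge_output(raw: str) -> dict:
    """Parse and validate the Judge agent's JSON verdict.
    Raises ValueError on malformed output so the claim can be re-queued."""
    verdict = json.loads(raw)
    if verdict.get("label") not in (0, 1):
        raise ValueError(f"label must be 0 or 1, got {verdict.get('label')!r}")
    if not isinstance(verdict.get("rationale"), str) or not verdict["rationale"].strip():
        raise ValueError("rationale must be a non-empty string")
    return verdict

raw = '{"label": 1, "rationale": "The claim covers photovoltaic conversion, a core Y02E technology."}'
print(parse_judge_output(raw)["label"])
```

Validation like this matters because LLM output occasionally drifts from the requested schema, and a silent failure would corrupt the gold label set.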
Stage 4 — Human Review (HITL)
A human reviewer assessed all 100 claims using the agent arguments and Judge rationale as context. The final gold label (is_green_gold) reflects the human decision, with the AI rationale available as supporting context.
Stage 5 — Fine-Tuning PatentSBERTa
PatentSBERTa was fine-tuned for binary classification using the combined train_silver + gold_100 dataset, where gold labels override silver labels for the 100 HITL-reviewed claims.
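The "gold overrides silver" merge can be sketched as a simple mapping update (illustrative field names; the assignment code operates on the parquet/CSV files directly):

```python
def build_training_labels(silver: dict, gold: dict) -> dict:
    """Combine silver (CPC-derived) and gold (HITL) labels per claim id.
    Gold labels win wherever a claim was human-reviewed."""
    labels = dict(silver)   # start from all silver labels
    labels.update(gold)     # overwrite the 100 HITL-reviewed claims
    return labels

silver = {"claim_a": 1, "claim_b": 0, "claim_c": 1}
gold = {"claim_c": 0}       # human reviewer flipped this one
print(build_training_labels(silver, gold))
```

Only the 100 reviewed claims change; the remaining ~40,000 training examples keep their CPC-derived silver labels.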
Training Data
- Dataset: Derived from AI-Growth-Lab/patents_claims_1.5m_traim_test
- Working file: patents_50k_green.parquet — a balanced 50k sample (25,000 green, 25,000 not green)
- Silver label source: CPC Y02* classification codes (is_green_silver)
- Gold labels: 100 human-reviewed claims labeled via MAS debate (is_green_gold)
Dataset Splits
| Split | Size | Description |
|---|---|---|
| train_silver | ~40,000 | Silver-labeled training set (CPC-derived) |
| eval_silver | ~5,000 | Silver-labeled evaluation set |
| pool_unlabeled | ~5,000 | Unlabeled pool used for uncertainty sampling |
| gold_100 | 100 | Human-reviewed high-uncertainty claims (MAS-assisted) |
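The ~80/10/10 split above can be reproduced with a single seeded shuffle, roughly like this (a sketch; the actual split logic and seed live in the assignment code):

```python
import random

def split_dataset(claim_ids, seed=42, train_frac=0.8, eval_frac=0.1):
    """Shuffle once with a fixed seed, then slice into train/eval/pool."""
    ids = list(claim_ids)
    random.Random(seed).shuffle(ids)
    n = len(ids)
    n_train = int(n * train_frac)
    n_eval = int(n * eval_frac)
    return (ids[:n_train],                   # train_silver
            ids[n_train:n_train + n_eval],   # eval_silver
            ids[n_train + n_eval:])          # pool_unlabeled

train, eval_set, pool = split_dataset(range(50_000))
print(len(train), len(eval_set), len(pool))
```

The fixed seed keeps the splits identical between Assignment 2 and Assignment 3, which is what makes the F1 comparison below meaningful.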
Training Hyperparameters
| Parameter | Value |
|---|---|
| Base model | AI-Growth-Lab/PatentSBERTa |
| Max sequence length | 256 |
| Epochs | 1 |
| Learning rate | 2e-5 |
| Training set size | ~40,100 (train_silver + gold_100) |
Evaluation Results
| Evaluation Set | F1 Score | Notes |
|---|---|---|
| eval_silver (5,000) | 0.824 | Primary evaluation metric |
| gold_100 (100) | 0.667 | Human-reviewed high-uncertainty claims |
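The F1 scores above are the standard harmonic mean of precision and recall for the positive (green) class. A self-contained sketch of the metric:

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
print(round(f1_score(y_true, y_pred), 3))  # 0.667
```

In practice the assignment uses a library implementation (e.g. scikit-learn's `f1_score`), but the computation is the same.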
Comparative Analysis
| Model Version | Training Data Source | F1 Score |
|---|---|---|
| 1. Baseline | Frozen Embeddings (No Fine-tuning) | 0.780 |
| 2. Assignment 2 Model | Fine-tuned on Silver + Gold (Simple LLM) | 0.818 |
| 3. Assignment 3 Model | Fine-tuned on Silver + Gold (MAS - CrewAI) | 0.824 |
The MAS approach produced a modest improvement in F1 score (+0.006) over the simple LLM approach from Assignment 2. While the improvement is small, the adversarial debate structure between Advocate and Skeptic agents likely produced more nuanced and reliable gold labels for the high-risk claims, particularly for borderline cases where a single LLM might have been overconfident. The added engineering complexity of the MAS is partially justified by the quality improvement, though the marginal gain suggests that the bottleneck may lie in the size of the gold label set (100 claims) rather than label quality alone.
HITL Agreement Reporting
Human-AI agreement was tracked for both Assignment 2 and Assignment 3:
| Assignment | Labeling Method | Human-AI Agreement |
|---|---|---|
| Assignment 2 | Simple generic LLM | 97% |
| Assignment 3 | Multi-Agent System (MAS) | 94% |
Intended Use
- Primary use: Academic research and coursework in patent classification
- Intended users: Course instructors and students at Aalborg University
- Out-of-scope: Production patent classification systems, legal patent assessment, or any commercial use
Limitations
- Trained on a balanced 50k sample — performance may differ on the full unbalanced patent corpus
- Silver labels are derived from CPC codes, which may contain noise
- Gold labels are based on 100 claims only — a larger gold set would likely improve downstream performance more significantly
- The MAS agents occasionally showed bias, with the Advocate tending to over-generalise green characteristics and the Judge sometimes deferring too strongly to one agent's argument
Repository
The full code, notebooks, and data files for this assignment are available in the course GitHub repository.
Model tree: Anders-sonderby/patentsbert-finetune_1, fine-tuned from the base model AI-Growth-Lab/PatentSBERTa.