# Patent Green Technology Classifier (PatentSBERTa Fine-tuned + MAS)
A binary text classifier for detecting green/sustainable technology patent claims, built on top of AI-Growth-Lab/PatentSBERTa. This is the Assignment 3 model, extending Assignment 2 by replacing the simple LLM labeling step with a three-agent debate system (MAS).
## Model Description
This model was fine-tuned on a balanced dataset of 35,100 patent claims with gold-enhanced labels derived from a Multi-Agent System (MAS) debate pipeline followed by a Human-in-the-Loop (HITL) review. It classifies patent claims as either green technology (1) or not green technology (0).
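A minimal usage sketch, assuming the checkpoint loads as a standard Hugging Face sequence-classification model (the 0/1 label mapping and the 256-token limit are from this card; the `classify_claim` helper name is ours, not from the repo):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Label mapping from this model card: 1 = green technology, 0 = not green.
LABELS = {0: "not green technology", 1: "green technology"}

def classify_claim(claim, model, tokenizer, max_length=256):
    """Return (label_id, label_name) for a single patent claim."""
    inputs = tokenizer(claim, truncation=True, max_length=max_length,
                       return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    label_id = int(logits.argmax(dim=-1))
    return label_id, LABELS[label_id]
```

Load the model and tokenizer with `AutoModelForSequenceClassification.from_pretrained("alexchrander/patent-sberta-green-finetuned-mas")` and the matching `AutoTokenizer`, then pass them to `classify_claim`.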
## Training Data
- Base dataset: AI-Growth-Lab/patents_claims_1.5m_traim_test
- Silver labels: Derived from CPC Y02* codes (25,000 green + 25,000 not green)
- Gold labels: 100 examples labeled via MAS debate → Human HITL workflow
- Dataset: alexchrander/patents-green-mas-dataset
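The CPC-based silver-labeling rule above can be sketched in a few lines; the function name and input format are illustrative, not taken from the repo:

```python
# Silver-labeling rule from this card: a claim counts as green (1) if any of
# its CPC codes falls under the Y02 class, otherwise not green (0).
def silver_label(cpc_codes):
    """Return 1 if any CPC code starts with 'Y02', else 0."""
    return int(any(code.startswith("Y02") for code in cpc_codes))

print(silver_label(["Y02E 10/50", "H01L 31/04"]))  # -> 1
print(silver_label(["G06F 17/30"]))                # -> 0
```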
## Training Procedure
Active Learning + MAS + HITL workflow:
- Reused frozen PatentSBERTa baseline and uncertainty scores from Assignment 2
- Selected the same 100 most uncertain examples via uncertainty sampling
- Used a three-agent debate system to suggest labels:
  - Advocate (Mistral-7B-Instruct-v0.2) — argues FOR green classification
  - Skeptic (Qwen2.5-7B-Instruct) — argues AGAINST green classification
  - Judge (Meta-Llama-3-8B-Instruct) — weighs both arguments and produces the final label
- Human reviewer assigned final gold labels based on the full debate
- Fine-tuned PatentSBERTa on the gold-enhanced dataset
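The sampling and debate steps above can be sketched as follows. `chat` stands in for any LLM call (model name and prompt in, reply out); the function names, prompts, and verdict parsing are illustrative assumptions, not the actual pipeline code.

```python
def select_most_uncertain(probs, k=100):
    """Rank claims by |P(green) - 0.5| and return indices of the k most uncertain."""
    ranked = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return ranked[:k]

def debate_label(claim, chat):
    """Three-agent debate: advocate and skeptic argue, the judge decides (1 = green)."""
    advocate = chat("Mistral-7B-Instruct-v0.2",
                    f"Argue that this patent claim describes green technology:\n{claim}")
    skeptic = chat("Qwen2.5-7B-Instruct",
                   f"Argue that this patent claim does NOT describe green technology:\n{claim}")
    verdict = chat("Meta-Llama-3-8B-Instruct",
                   "Given the arguments below, answer GREEN or NOT GREEN.\n"
                   f"FOR: {advocate}\nAGAINST: {skeptic}")
    # Naive verdict parsing for illustration only.
    return 0 if "NOT" in verdict.upper() else 1

probs = [0.98, 0.51, 0.03, 0.47, 0.90]
print(select_most_uncertain(probs, k=2))  # -> [1, 3]
```

In the real workflow the judge's label was then reviewed by a human before being accepted as gold.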
Hyperparameters:
- max_seq_length: 256
- epochs: 1
- learning_rate: 2e-5
- batch_size: 16
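The same hyperparameters as a plain config dict, for easy reuse in a training script (a sketch; the actual training code is not part of this card):

```python
# Fine-tuning hyperparameters from this card; max_seq_length is applied at
# tokenization time, the rest configure the training loop.
HPARAMS = {
    "max_seq_length": 256,
    "epochs": 1,
    "learning_rate": 2e-5,
    "batch_size": 16,
}
```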
## Results

### Comparison across all model versions
| Model Version | Training Data Source | F1 | Accuracy |
|---|---|---|---|
| Baseline (frozen) | Frozen Embeddings (No Fine-tuning) | 0.77 | 0.77 |
| Assignment 2 Model | Fine-tuned on Silver + Gold (Simple LLM) | 0.81 | 0.81 |
| Assignment 3 Model (this model) | Fine-tuned on Silver + Gold (MAS) | 0.81 | 0.81 |
### MAS vs Simple LLM label quality
| Labeling Approach | Not Green | Green | Low Confidence |
|---|---|---|---|
| Assignment 2 (Mistral) | 95 | 5 | 72% |
| Assignment 3 (MAS) | 51 | 47 | 4% |
The MAS produced far more balanced labels (51/47 vs. 95/5) with far fewer low-confidence cases (4% vs. 72%) than the single-LLM approach, although both fine-tuned models reached the same downstream F1 of 0.81.
## Video
https://panopto.aau.dk/Panopto/Pages/Viewer.aspx?id=5283748b-c71c-473c-89ec-b3f9016361f4