Patent Green Technology Classifier (PatentSBERTa Fine-tuned + MAS)

A binary text classifier for detecting green/sustainable-technology patent claims, built on top of AI-Growth-Lab/PatentSBERTa. This is the Assignment 3 model: it extends Assignment 2 by replacing the single-LLM labeling step with a three-agent debate system (Multi-Agent System, MAS).

Model Description

This model was fine-tuned on a balanced dataset of 35,100 patent claims with gold-enhanced labels derived from a Multi-Agent System (MAS) debate pipeline followed by a Human-in-the-Loop (HITL) review. It classifies patent claims as either green technology (1) or not green technology (0).
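For quick experimentation, the checkpoint can be queried like any sequence-classification model. This is a minimal sketch, assuming the published weights carry a `transformers`-compatible classification head; the `classify_claims` helper and the example claim are illustrative, not part of the released code.

```python
# Minimal inference sketch. Assumes the fine-tuned checkpoint exposes a
# standard sequence-classification head; classify_claims is an illustrative
# helper, not part of the released code.
LABELS = {0: "not green technology", 1: "green technology"}

def classify_claims(texts, model_name="alexchrander/patent-sberta-green-finetuned-mas"):
    """Return a predicted label for each patent-claim string."""
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    enc = tokenizer(list(texts), truncation=True, max_length=256,
                    padding=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits
    return [LABELS[i] for i in logits.argmax(dim=-1).tolist()]

# Example (downloads the model on first call):
# classify_claims(["A photovoltaic module comprising a perovskite absorber layer."])
```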

Training Data

The training set is the balanced corpus of 35,100 patent claims described above: silver-labeled claims carried over from Assignment 2, plus the 100 most uncertain examples relabeled with gold labels through the MAS debate and HITL review.

Training Procedure

Active Learning + MAS + HITL workflow:

  1. Reused the frozen PatentSBERTa baseline and its uncertainty scores from Assignment 2
  2. Selected the same 100 most uncertain examples via uncertainty sampling
  3. Ran a three-agent debate to suggest labels:
    • Advocate (Mistral-7B-Instruct-v0.2) — argues FOR a green classification
    • Skeptic (Qwen2.5-7B-Instruct) — argues AGAINST a green classification
    • Judge (Meta-Llama-3-8B-Instruct) — weighs both arguments and issues the final label
  4. A human reviewer assigned final gold labels based on the full debate
  5. Fine-tuned PatentSBERTa on the resulting gold-enhanced dataset
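The selection and debate steps above can be sketched as follows. The three agents are stubbed with toy functions so the example runs without any LLM backend; all function names, the entropy-based uncertainty score, and the sample claims are illustrative assumptions, not taken from the actual pipeline.

```python
import math

def entropy(p):
    """Binary prediction entropy: high when the baseline is uncertain."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def debate_label(claim, advocate, skeptic, judge):
    """One advocate/skeptic/judge round; the judge returns the final verdict."""
    pro = advocate(claim)           # argues FOR a green classification
    con = skeptic(claim)            # argues AGAINST a green classification
    return judge(claim, pro, con)   # weighs both, outputs label + confidence

# Toy stand-ins for the three LLM agents (illustrative only):
advocate = lambda c: f"FOR: '{c}' plausibly reduces emissions."
skeptic  = lambda c: f"AGAINST: '{c}' states no environmental benefit."
judge    = lambda c, pro, con: {"label": int("solar" in c.lower()),
                                "confidence": "high"}

# Step 2: pick the most uncertain claims from baseline probabilities.
claims = {"A solar-powered desalination unit.": 0.55,
          "A gear assembly for a wristwatch.": 0.98}
uncertain = sorted(claims, key=lambda c: entropy(claims[c]), reverse=True)[:1]

# Steps 3-4: debate each selected claim; a human then confirms the verdicts.
gold = {c: debate_label(c, advocate, skeptic, judge) for c in uncertain}
```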

Hyperparameters:

  • max_seq_length: 256
  • epochs: 1
  • learning_rate: 2e-5
  • batch_size: 16
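A fine-tuning sketch under the hyperparameters above, assuming a standard Hugging Face `Trainer` setup. Dataset loading is elided, and `finetune` is an illustrative name rather than the project's actual training script.

```python
# Hyperparameters from the list above; finetune() is an illustrative sketch.
HPARAMS = {"max_seq_length": 256, "epochs": 1,
           "learning_rate": 2e-5, "batch_size": 16}

def finetune(train_dataset, base_model="AI-Growth-Lab/PatentSBERTa"):
    """Fine-tune the base encoder for binary green/not-green classification.

    `train_dataset` is assumed to be a tokenized dataset of
    {input_ids, attention_mask, labels} rows, tokenized with
    max_length=HPARAMS["max_seq_length"].
    """
    from transformers import (AutoModelForSequenceClassification,
                              Trainer, TrainingArguments)

    model = AutoModelForSequenceClassification.from_pretrained(
        base_model, num_labels=2)
    args = TrainingArguments(
        output_dir="patent-sberta-green-mas",
        num_train_epochs=HPARAMS["epochs"],
        learning_rate=HPARAMS["learning_rate"],
        per_device_train_batch_size=HPARAMS["batch_size"],
    )
    Trainer(model=model, args=args, train_dataset=train_dataset).train()
    return model
```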

Results

Comparison across all model versions

| Model Version | Training Data Source | F1 | Accuracy |
|---|---|---|---|
| Baseline (frozen) | Frozen embeddings (no fine-tuning) | 0.77 | 0.77 |
| Assignment 2 Model | Fine-tuned on Silver + Gold (simple LLM) | 0.81 | 0.81 |
| Assignment 3 Model (this model) | Fine-tuned on Silver + Gold (MAS) | 0.81 | 0.81 |
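The F1 and accuracy columns are the standard binary metrics. For reference, a dependency-free sketch of how they are computed (the arrays below are toy data, not the actual held-out test set):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions matching the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_f1(y_true, y_pred, positive=1):
    """F1 for the positive (green) class: harmonic mean of precision/recall."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)

# Toy illustration only:
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 1]
```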

MAS vs Simple LLM label quality

| Labeling approach | Not Green labels | Green labels | Low-confidence share |
|---|---|---|---|
| Assignment 2 (Mistral) | 95 | 5 | 72% |
| Assignment 3 (MAS) | 51 | 47 | 4% |

The MAS produced markedly more balanced and more confident labels than the simple-LLM approach, although both fine-tuned models reached the same downstream F1 score of 0.81.

Video

https://panopto.aau.dk/Panopto/Pages/Viewer.aspx?id=5283748b-c71c-473c-89ec-b3f9016361f4

Model size: 0.1B params (F32, Safetensors)