Italian NER for Browser-Only PII Anonymization (BERT uncased, Quantized ONNX)

A lightweight Italian Named Entity Recognition model optimized for browser-only inference, based on:

osiria/bert-italian-uncased-ner
License: Apache-2.0
Original authors: Osiria

This repository provides a quantized ONNX version (~105 MB) suitable for running entirely in the browser via Transformers.js.


What this model is for

This model is designed as a privacy-friendly pre-filter layer to detect and help anonymize:

  • Person names (PER)
  • Organizations (ORG)
  • Locations (LOC)
  • Miscellaneous named entities (MISC)

Typical use case:

Run NER locally in the user's browser before sending text to an LLM, masking personal identifiers first.

All inference can run client-side.


Private Evaluation Protocol (Different Label Space)

The model predicts standard NER labels (PER/LOC/ORG/MISC/O), while the evaluation dataset is PII-oriented and uses a different schema.

To evaluate fairly, we use a private protocol (ner_compatible) that compares only compatible labels:

  • Gold labels are mapped to PER, LOC, or O.
  • Non-comparable PII classes are excluded from this score.
  • Predictions are projected to the token level and evaluated with both span-level BIO seqeval metrics and token-level type metrics.
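The mapping step above can be sketched as a small lookup. The gold label names used here (PERSON, ADDRESS, EMAIL) are illustrative assumptions, since the private dataset's actual PII schema is not listed in this card:

```javascript
// Illustrative sketch of the ner_compatible label mapping.
// The gold PII label names below are assumptions, not the real schema.
const GOLD_TO_NER = {
  PERSON: "PER",  // person names are comparable to PER
  ADDRESS: "LOC", // street/city mentions are comparable to LOC
  O: "O",         // non-entity tokens
  // e.g. EMAIL or PHONE have no NER counterpart and are excluded
};

// Returns the mapped label, or null if the class is not comparable
// and must be excluded from the score.
function mapGoldLabel(label) {
  return Object.prototype.hasOwnProperty.call(GOLD_TO_NER, label)
    ? GOLD_TO_NER[label]
    : null;
}
```

Tokens whose gold label maps to null are simply dropped from the comparison rather than counted as errors.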

Reproducible script with English comments:

  • evaluation/private_eval_procedure.py

Example:

python3 evaluation/private_eval_procedure.py \
  --validation-script ./validate.py \
  --mapping-mode ner_compatible \
  --max-examples 10197 \
  --debug 10 \
  --json-out ./evaluation/private_eval_latest.json

Why this version

Compared to the original PyTorch model:

  • Converted to ONNX
  • Dynamically quantized
  • Packaged for Transformers.js compatibility
  • Optimized for ONNX Runtime Web
  • Suitable for fully local browser execution

This repository is quantized-only:

  • includes onnx/model_quantized.onnx
  • does not include non-quantized ONNX weights

Use in the Browser (Transformers.js)

import { pipeline, env } from "@huggingface/transformers";

// Allow downloading the model files from the Hugging Face Hub.
env.allowRemoteModels = true;
// Enable WASM SIMD and multi-threading for faster CPU inference.
env.backends.onnx.wasm.simd = true;
env.backends.onnx.wasm.numThreads = 2;

const ner = await pipeline(
  "token-classification",
  "laibniz/italian-ner-pii-browser-uncased",
  {
    // Transformers.js v3 selects quantized weights via `dtype`;
    // "q8" loads onnx/model_quantized.onnx.
    dtype: "q8",
    aggregation_strategy: "simple"
  }
);

const text = "Il paziente mario rossi vive a Milano.";
const entities = await ner(text);
console.log(entities);

All inference runs locally in the browser.
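As a sketch of the pre-filter step described earlier, the detected entities can be used to mask spans before the text leaves the browser. The entity objects below (with `entity_group` and character `start`/`end` offsets) are a hypothetical example of the pipeline's output shape, which can vary by Transformers.js version:

```javascript
// Replace each detected entity span with a typed placeholder, e.g. [PER].
// Splicing in reverse start order keeps earlier offsets valid.
function maskEntities(text, entities) {
  const sorted = [...entities].sort((a, b) => b.start - a.start);
  let masked = text;
  for (const e of sorted) {
    const tag = e.entity_group ?? e.entity; // field name varies by version
    masked = masked.slice(0, e.start) + `[${tag}]` + masked.slice(e.end);
  }
  return masked;
}

// Hypothetical entities for the example sentence above:
const masked = maskEntities("Il paziente mario rossi vive a Milano.", [
  { entity_group: "PER", start: 12, end: 23 },
  { entity_group: "LOC", start: 31, end: 37 },
]);
// masked === "Il paziente [PER] vive a [LOC]."
```

The masked string is what you would forward to the LLM; the original text and the entity offsets never leave the client.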


Attribution

This work builds upon:

osiria/bert-italian-uncased-ner
Apache-2.0 License
https://huggingface.co/osiria/bert-italian-uncased-ner

All credit for model training and dataset preparation belongs to the original authors.

This repository provides ONNX export and quantized packaging for browser use.
