IrishCore-GlobalPointer-ContextPII-135M-v1-rc22

IrishCore-GlobalPointer-ContextPII-135M-v1-rc22 is the current expanded-label raw-only PII masking release for Irish public-sector, HSE, and citizen-support flows.

It keeps the DistilBERT-size GlobalPointer span extractor family and the exact weights of temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc21. What changes is the bundled decoder, which is stronger at contextual Irish/English date-of-birth disambiguation, and the packaged evaluation set, which now includes the new unseen v16 holdout. The raw-only deployment shape is preserved.

Context labels served by this line:

  • STREET_ADDRESS
  • CITY
  • COUNTY
  • DATE_OF_BIRTH
  • AGE

Core labels retained:

  • PPSN
  • POSTCODE
  • PHONE_NUMBER
  • EMAIL
  • PASSPORT_NUMBER
  • ACCOUNT_NUMBER
  • BANK_ROUTING_NUMBER
  • SWIFT_BIC
  • CREDIT_DEBIT_CARD
  • FIRST_NAME
  • LAST_NAME

Positioning

rc22 is a decoder/runtime hardening release over rc21.

  • weights unchanged
  • ONNX graph unchanged
  • no external scanner or validator added
  • deployment path still single-pass span extraction plus deterministic [PII:LABEL] replacement
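The deterministic replacement step can be illustrated with a minimal sketch. It assumes the span extractor returns character offsets as `(start, end_exclusive, label)` tuples; that tuple shape and the `mask_spans` helper are hypothetical and are not taken from the bundled `inference_mask_onnx.py`.

```python
from typing import List, Tuple

# Hypothetical span shape: (start, end_exclusive, label) in character offsets.
Span = Tuple[int, int, str]


def mask_spans(text: str, spans: List[Span]) -> str:
    """Replace each character span with a [PII:LABEL] placeholder."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            # Assumption: a span overlapping an already masked one is skipped.
            continue
        pieces.append(text[cursor:start])
        pieces.append(f"[PII:{label}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


text = "My PPSN is 1234567T and my Eircode is D08 XY12."
ppsn_start = text.index("1234567T")
postcode_start = text.index("D08 XY12")
spans = [
    (postcode_start, postcode_start + len("D08 XY12"), "POSTCODE"),
    (ppsn_start, ppsn_start + len("1234567T"), "PPSN"),
]
print(mask_spans(text, spans))
```

Because the output depends only on the text and the predicted spans, the same input always produces the same masked string.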

What changed in rc22:

  • in Irish-language appointment-vs-DOB sentences, the appointment date is no longer kept as DATE_OF_BIRTH when the true DOB cue appears later in the same sentence
  • the packaged q8 release now ships a fresh unseen v16 exact holdout covering month-name DOBs, public/private contact mixes, code-switched addresses, and county-office negatives
  • the serving architecture is still raw-only single-pass GlobalPointer span extraction plus deterministic [PII:LABEL] replacement
  • a model-level continuation on patch v4 was trained locally but not promoted because it did not beat the current decoder-only serving path on the gated suites
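For readers unfamiliar with GlobalPointer-style span extraction, the following is a rough sketch of how a per-label span score matrix can be decoded in a single pass. The zero threshold, the greedy overlap rule, and the `decode_span_matrix` name are assumptions made for illustration; the decoder bundled with this release is not described here and may differ.

```python
from typing import List, Sequence, Tuple


def decode_span_matrix(
    scores: Sequence[Sequence[Sequence[float]]],
    labels: Sequence[str],
    threshold: float = 0.0,
) -> List[Tuple[int, int, str]]:
    """Decode scores[label][start][end] into non-overlapping token spans.

    Token indices are inclusive. Only start <= end cells are considered.
    """
    candidates = []
    for label_idx, label in enumerate(labels):
        matrix = scores[label_idx]
        for start in range(len(matrix)):
            for end in range(start, len(matrix)):
                score = matrix[start][end]
                if score > threshold:
                    candidates.append((score, start, end, label))

    # Assumed tie-breaking: keep the highest-scoring span and drop overlaps.
    candidates.sort(key=lambda c: -c[0])
    kept: List[Tuple[int, int, str]] = []
    for _, start, end, label in candidates:
        if all(end < k_start or start > k_end for k_start, k_end, _ in kept):
            kept.append((start, end, label))
    return sorted(kept)


# Toy input: 4 tokens, 2 labels, every cell below threshold except three.
labels = ["DATE_OF_BIRTH", "AGE"]
scores = [[[-1.0] * 4 for _ in range(4)] for _ in labels]
scores[0][1][2] = 3.0  # DATE_OF_BIRTH over tokens 1..2
scores[1][2][2] = 1.0  # AGE over token 2, overlaps the stronger span
scores[1][3][3] = 2.0  # AGE over token 3
print(decode_span_matrix(scores, labels))
```

In this toy case the weaker AGE candidate on token 2 is dropped because it overlaps the higher-scoring DATE_OF_BIRTH span.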

If you only need the narrower Irish-core structured label set and want maximum CPU throughput, temsa/IrishCore-GlobalPointer-135M-v1-rc4 remains the faster option.

Benchmarks

ONNX q8

| Suite | F1 | Examples/s |
| --- | --- | --- |
| Irish core | 1.0000 | 106.7838 |
| Irish extended | 1.0000 | 90.9442 |
| Gov contact policy v1 | 1.0000 | 69.1441 |
| Gov chatbot red-team v2 | 0.9861 | 106.2109 |
| Gov chatbot gap holdout v2 | 1.0000 | 52.6961 |
| Context red-team v11 exact | 1.0000 | 191.7066 |
| Context holdout v12 exact | 1.0000 | 123.0474 |
| Context holdout v13 exact | 1.0000 | 53.8701 |
| Context holdout v14 exact | 1.0000 | 112.3812 |
| Context holdout v15 exact | 1.0000 | 46.4171 |
| Context holdout v16 exact | 1.0000 | 66.8241 |
| Multilingual PPSN overall | 0.9333 | 86.0181 |
| Multilingual PPSN label-only | 1.0000 | n/a |
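As a rough illustration of what an examples-per-second figure measures, the sketch below times a masking callable over a list of texts. The `examples_per_second` helper, the warm-up count, and the stand-in callable are all hypothetical; the harness that produced the numbers in this card is not shown here.

```python
import time
from typing import Callable, Sequence


def examples_per_second(
    run: Callable[[str], object],
    texts: Sequence[str],
    warmup: int = 1,
) -> float:
    """Time `run` over `texts` and return throughput in examples per second."""
    # Warm-up calls are excluded from timing (assumed practice, not the card's).
    for text in texts[:warmup]:
        run(text)
    start = time.perf_counter()
    for text in texts:
        run(text)
    elapsed = time.perf_counter() - start
    return len(texts) / elapsed if elapsed > 0 else float("inf")


# Stand-in workload; a real run would call the ONNX masking function instead.
rate = examples_per_second(lambda t: t.upper(), ["sample text"] * 100)
print(f"{rate:.1f} examples/s")
```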

Comparison

| Model | Irish core q8 F1 | Gov chatbot gap holdout v2 q8 F1 | Context holdout v16 exact q8 F1 | Q8 core examples/s |
| --- | --- | --- | --- | --- |
| ContextPII rc22 q8 | 1.0000 | 1.0000 | 1.0000 | 106.7838 |
| ContextPII rc21 q8 | 1.0000 | 1.0000 | 1.0000 | 150.7792 |
| ContextPII rc20 q8 | 1.0000 | 1.0000 | 1.0000 | 181.8821 |
| ContextPII rc19 q8 | 1.0000 | 1.0000 | n/a | 73.0554 |
| ContextPII rc18 q8 | 1.0000 | 1.0000 | n/a | 111.3135 |
| GlobalPointer core rc4 q8 | 1.0000 | 1.0000 | n/a | 221.5743 |

Evaluation Notes

Additional q8 release checks shipped in this repo:

  • eval/q8_globalpointer_context_redteam_v11_exact.json: legacy exact-boundary contextual regression suite, now 1.0000
  • eval/q8_globalpointer_context_holdout_v12_exact.json: fresh unseen holdout for prefixed Irish/English address recovery, now 1.0000
  • eval/q8_globalpointer_context_holdout_v13_exact.json: fresh unseen holdout for house-name and city/county field recovery, now 1.0000
  • eval/q8_globalpointer_context_holdout_v14_exact.json: fresh unseen holdout for Gaelic surname-particle names, apostrophe surnames after self-cues, and mixed structured/contextual packets, now 1.0000
  • eval/q8_globalpointer_context_holdout_v15_exact.json: fresh unseen holdout for multiline form fields, line-1 prefix addresses, date-before-cue DOBs, appointment-vs-DOB disambiguation, and code-switched address turns, now 1.0000
  • eval/q8_globalpointer_context_holdout_v16_exact.json: fresh unseen holdout for month-name DOBs, public/private contact mixes, code-switched addresses, and county-office negatives, now 1.0000
  • eval/q8_irish_gov_chatbot_gap_holdout_v2.json: legacy chatbot gap suite, now 1.0000

Known tradeoff:

  • eval/q8_irish_gov_chatbot_redteam_v2.json remains 0.9861 because this broader contextual release masks public-office street/city details in assistant text, while that older suite only labels postcode and phone there.
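The effect of this tradeoff on the score can be seen with an exact-match span F1 sketch. The `span_f1` helper and the example spans are illustrative assumptions, not the suite's actual scorer or data: a predicted span that the gold annotations do not contain counts as a false positive, which lowers precision even when every gold span is found.

```python
from typing import Iterable, Tuple

Span = Tuple[int, int, str]  # hypothetical (start, end, label) shape


def span_f1(gold: Iterable[Span], pred: Iterable[Span]) -> float:
    """Exact-match F1: a prediction counts only if offsets and label match."""
    gold_set, pred_set = set(gold), set(pred)
    true_pos = len(gold_set & pred_set)
    precision = true_pos / len(pred_set) if pred_set else 0.0
    recall = true_pos / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative only: gold labels postcode and phone; the model also masks
# a street address that the older suite leaves unlabelled.
gold = [(0, 8, "POSTCODE"), (10, 20, "PHONE_NUMBER")]
pred = gold + [(30, 45, "STREET_ADDRESS")]
print(round(span_f1(gold, pred), 4))
```

Here recall is 1.0 and precision is 2/3, so F1 falls to 0.8 even though no gold span was missed.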

Usage

python3 inference_mask_onnx.py --model temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc22 --text "My name is Maeve O'Sullivan, my PPSN is 1234567T, and my address is Árasán 4, Cedar Court, mBaile Átha Cliath, D08 XY12."

Portfolio Comparison

Updated: 2026-03-16.

Use this section for the fastest public comparison across the temsa PII masking portfolio.

  • The first core table only includes public checkpoints that ship both comparable q8 accuracy and q8 CPU throughput.
  • The first PPSN table only includes public artifacts that ship comparable PPSN accuracy and CPU throughput.
  • Missing cells in the archive tables mean the older release did not ship that metric in its public bundle.
  • DiffMask rows use the reconciled clean_single_pass harness that matches the deployed runtime.
  • GlobalPointer rows use the public raw-only span-matrix release bundle and its packaged q8 ONNX artifact.
  • The same content is shipped as PORTFOLIO_COMPARISON.md inside each public model repo.

Irish Core PII: Comparable Public Checkpoints

| Repo | Stack | Full Core F1 | Q8 Core F1 | Q8 Multilingual PPSN F1 | Q8 Core ex/s |
| --- | --- | --- | --- | --- | --- |
| temsa/IrishCore-GlobalPointer-ContextPII-4L-122M-v1-rc6 | 4-layer GlobalPointer distilled fast student | 1.0000 | 1.0000 | 0.9333 | 282.9 |
| temsa/IrishCore-GlobalPointer-ContextPII-4L-122M-v1-rc5 | 4-layer GlobalPointer distilled fast student | 1.0000 | 1.0000 | 0.9333 | 282.9 |
| temsa/IrishCore-GlobalPointer-ContextPII-4L-122M-v1-rc3 | 4-layer GlobalPointer distilled fast student | 1.0000 | 1.0000 | 0.9333 | 317.9 |
| temsa/IrishCore-GlobalPointer-ContextPII-4L-122M-v1-rc2 | 4-layer GlobalPointer distilled fast student | 1.0000 | 1.0000 | 0.9333 | 292.5 |
| temsa/IrishCore-GlobalPointer-ContextPII-4L-122M-v1-rc1 | 4-layer GlobalPointer distilled fast student | 1.0000 | 1.0000 | 0.9333 | 337.3 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc29 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 232.7 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc28 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 232.7 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc25 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 212.1 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc24 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 278.9 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc23 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 237.6 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc22 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 106.8 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc21 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 150.8 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc20 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 181.9 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc19 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 73.1 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc18 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 126.2 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc17 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 125.5 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc16 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 125.5 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc15 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 125.5 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc14 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 119.2 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc13 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 126.1 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc12 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 73.6 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc11 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 94.1 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc10 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 125.8 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc9 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 119.8 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc8 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 128.9 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc7 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 89.0 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc6 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 89.0 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc5 | GlobalPointer raw-only + context labels | 1.0000 | 1.0000 | 0.9333 | 84.5 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc4 | GlobalPointer raw-only + context labels | 0.9935 | 0.9935 | 0.9333 | 61.5 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc3 | GlobalPointer raw-only + context labels | 0.9935 | 0.9935 | 0.9333 | 61.5 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc2 | GlobalPointer raw-only + context labels | 0.9935 | 0.9935 | 0.9222 | 61.5 |
| temsa/IrishCore-GlobalPointer-ContextPII-135M-v1-rc1 | GlobalPointer raw-only + context labels | 0.9935 | 0.9935 | 0.9222 | 61.5 |
| temsa/IrishCore-GlobalPointer-135M-v1-rc4 | GlobalPointer raw-only span-matrix | 1.0000 | 1.0000 | 0.9333 | 221.6 |
| temsa/IrishCore-GlobalPointer-135M-v1-rc3 | GlobalPointer raw-only span-matrix | 1.0000 | 1.0000 | 0.9213 | 204.9 |
| temsa/IrishCore-GlobalPointer-135M-v1-rc2 | GlobalPointer raw-only span-matrix | 0.9934 | 0.9934 | 0.9326 | 231.2 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc8 | Raw-only token-span | 0.9737 | 0.9737 | 0.9176 | 46.1 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc7 | Hybrid classifier + generated scanner spec | 1.0000 | 0.9934 | 1.0000 | 30.0 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc6 | Hybrid classifier + repair decoders | 1.0000 | 0.9934 | 1.0000 | 29.5 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc5 | Hybrid classifier + repair decoders | 0.9737 | 0.9669 | 0.9333 | 34.4 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc4 | Hybrid classifier + repair decoders | 0.9870 | 0.9740 | 0.9600 | 114.2 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc3 | Hybrid classifier + repair decoders | 0.9806 | 0.9677 | 0.9333 | 44.9 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc2 | Hybrid classifier + repair decoders | 0.9554 | 0.9615 | 0.7887 | 119.1 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v1 | Hybrid classifier baseline | 0.9530 | 0.9333 | 0.9882 | 103.3 |
| temsa/IrishCore-DiffMask-135M-v1-rc6 | DiffMask token-span, scanner-free | 0.9801 | 0.9733 | 0.9274 | 130.3 |
| temsa/IrishCore-DiffMask-135M-v1-rc5 | DiffMask token-span, scanner-free | 0.9733 | 0.9733 | 0.9379 | 249.2 |
| temsa/IrishCore-DiffMask-135M-v1-rc4 | DiffMask token-span, scanner-free | 0.9733 | 0.9733 | 0.9371 | 29.5 |
| temsa/IrishCore-DiffMask-135M-v1-rc3 | DiffMask token-span, scanner-free | 0.9664 | 0.9664 | 0.9591 | 30.0 |
| temsa/IrishCore-DiffMask-135M-v1-rc2 | DiffMask token-span, scanner-free | 0.9664 | 0.9664 | 0.9212 | 247.1 |
| temsa/IrishCore-DiffMask-135M-v1-rc1 | DiffMask token-span, scanner-free | 0.9801 | 0.9934 | 0.9412 | 251.2 |

Irish Core PII: Other Public Checkpoints

| Repo | Stack | Full Core F1 | Q8 Core F1 | Q8 Multilingual PPSN F1 | Notes |
| --- | --- | --- | --- | --- | --- |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc1 | Hybrid classifier prototype | 0.9487 | n/a | n/a | Predates the public q8 artifact. |

Finance-boundary q8 F1 is 1.0000 for OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc6, OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc7, OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc8, and all public IrishCore-DiffMask releases from rc1 to rc6. OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc5 ships 0.8750 on that public q8 suite.

PPSN-Only: Comparable Public Artifacts

| Repo | Artifact | Irish Large F1 | Multilingual PPSN F1 | User Raw F1 | QA v8 F1 | CPU ex/s |
| --- | --- | --- | --- | --- | --- | --- |
| temsa/OpenMed-mLiteClinical-IrishPPSN-135M-v1 | fp32 canonical checkpoint | 0.8979 | 0.9704 | 0.8000 | 0.7385 | 57.4 |
| temsa/OpenMed-mLiteClinical-IrishPPSN-135M-v1-fp16 | fp16 CPU/GPU artifact | n/a | 0.9704 | 0.8000 | 0.7385 | 45.8 |
| temsa/OpenMed-mLiteClinical-IrishPPSN-135M-v1-q8 | dynamic int8 CPU artifact | n/a | 0.9040 | n/a | n/a | 132.1 |

PPSN-Only: Historical Public Checkpoints

| Repo | Main Published Metrics | Notes |
| --- | --- | --- |
| temsa/OpenMed-PPSN-mLiteClinical-v1 | same as canonical fp32 repo: multilingual 0.9704, user raw 0.8000 | Legacy alias; prefer temsa/OpenMed-mLiteClinical-IrishPPSN-135M-v1. |
| temsa/OpenMed-PPSN-v6-raw-rc2 | irish_reg_v5 0.8750; user_raw 0.8000; qa_v8 0.7385 | Raw PPSN-only research checkpoint; no packaged multilingual CPU benchmark row. |
| temsa/OpenMed-PPSN-v5_1 | irish_large_v2 raw 0.9285; qa_v6 hybrid strict 1.0000 | Hybrid PPSN-only checkpoint; predates the canonical multilingual suite packaging. |
| temsa/OpenMed-PPSN-v5 | irish_reg_v5 raw 0.8235; irish_reg_v5 hybrid strict 1.0000 | Hybrid PPSN-only checkpoint; predates the canonical multilingual suite packaging. |
| temsa/OpenMed-PPSN-v4 | synthetic non-PPSN drift check only | Predates the current PPSN eval suite; no packaged apples-to-apples multilingual CPU row. |

If you need the strongest current raw-only Irish core model, start with IrishCore-GlobalPointer-135M-v1-rc4. If you need the fastest CPU-first raw-only line, compare it against IrishCore-DiffMask-135M-v1-rc6. If you need a PPSN-only artifact, compare the canonical fp32, fp16, and q8 variants of OpenMed-mLiteClinical-IrishPPSN-135M-v1 directly in the table above.
