# De-identification Benchmark Results

**Model:** Minibase-DeId-Small
**Dataset:** Personal_De-identifier_Benchmark_SFT.jsonl
**Sample Size:** 100
**Date:** 2025-09-25T12:48:06.242738

## Overall Performance

| Metric | Score | Description |
|--------|-------|-------------|
| PII Detection Rate | 1.000 | How reliably the model emits placeholders when personal identifiers are present in the input |
| Completeness Score | 0.650 | Fraction of texts fully de-identified (0.650 = 65%) |
| Semantic Preservation | 0.811 | How well the meaning of the original text is preserved |
| Average Latency | 477.0 ms | Mean response time |
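
The report does not say how a text is judged "fully de-identified". As a rough sketch under one assumed reading, a text counts as complete only when none of its known PII strings survive in the model output. The function names and the idea of a per-example list of PII values are hypothetical and are not taken from the benchmark code.

```python
from typing import Iterable, List, Tuple


def is_fully_deidentified(output: str, pii_values: Iterable[str]) -> bool:
    """Assumed rule: no known PII string may survive in the output (case-insensitive)."""
    lowered = output.lower()
    return not any(value.lower() in lowered for value in pii_values if value)


def completeness_score(examples: List[Tuple[str, List[str]]]) -> float:
    """Fraction of (output, pii_values) pairs that are fully de-identified."""
    if not examples:
        return 0.0
    complete = sum(is_fully_deidentified(out, pii) for out, pii in examples)
    return complete / len(examples)


examples = [
    # "Doe" survives in the output, so this text is not complete.
    ("Name: [FIRSTNAME_1] Doe.", ["John", "Doe"]),
    # Every known PII string has been replaced.
    ("Name: [NAME_1].", ["John", "Doe"]),
]
print(completeness_score(examples))  # 0.5
```

Under this reading, a model can reach a perfect detection rate while still scoring lower on completeness, because a single surviving identifier makes the whole text incomplete.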

## Key Improvements

- **PII Detection**: Now measures whether the model generates any placeholder when PII is present in the input
- **Unified Evaluation**: All examples are evaluated together, with no domain separation
- **Lenient Scoring**: Focuses on detection capability rather than exact placeholder matching
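
The lenient detection rule listed above can be sketched as follows. This is an illustration under stated assumptions, not the benchmark's actual scorer: the placeholder pattern (`[LABEL_N]`, inferred from the example outputs in this report), the function names, and the use of the expected output to decide whether an input contains PII are all assumptions.

```python
import re
from typing import Iterable, Optional, Tuple

# Assumed placeholder shape, e.g. [NAME_1] or [EMPLOYEE_ID_1].
PLACEHOLDER = re.compile(r"\[[A-Z][A-Z_]*_\d+\]")


def lenient_detection(expected: str, predicted: str) -> Optional[float]:
    """Score one example: 1.0 if the prediction contains any placeholder, else 0.0.

    Returns None when the expected text has no placeholder, i.e. the input
    is assumed to contain no PII and the example is not scored.
    """
    if not PLACEHOLDER.search(expected):
        return None
    return 1.0 if PLACEHOLDER.search(predicted) else 0.0


def detection_rate(pairs: Iterable[Tuple[str, str]]) -> float:
    """Mean lenient score over all examples that contain PII."""
    scores = [
        score
        for score in (lenient_detection(exp, pred) for exp, pred in pairs)
        if score is not None
    ]
    return sum(scores) / len(scores) if scores else 0.0


pairs = [
    # Labels differ from the reference, but a placeholder is present.
    ("Name: [NAME_1].", "Name: [FIRSTNAME_1] Doe."),
    # No placeholder at all, so detection fails.
    ("Name: [NAME_1].", "Name: John Doe."),
]
print(detection_rate(pairs))  # 0.5
```

Exact placeholder matching would reject the first pair because `[FIRSTNAME_1]` is not `[NAME_1]`. The lenient rule accepts it, which is why it measures detection capability and not label agreement.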

## Example Results

### Example 1

**Input:** Patient Sarah Johnson, DOB 05/12/1980, visited Dr. Lee at St. Jude Hospital on 2023-10-26. Her conta...
**Expected:** Patient [NAME_1], DOB [DOB_1], visited [NAME_2] at [HOSPITAL_1] on [DATE_1]. Her contact is [PHONE_1...
**Predicted:** Patient [FIRSTNAME_1] [MIDDLENAME_1], DOB [DOB_1], visited Dr. [LASTNAME_1] at [CITY_1] Hospital on ...
**PII Detection:** 1.000

### Example 2

**Input:** Deponent Mr. Robert Davis, CEO of GlobalCorp Inc., stated under oath on December 1, 2022, that his a...
**Expected:** Deponent [NAME_1], CEO of [ORGANIZATION_1], stated under oath on [DATE_1], that his attorney, [NAME_...
**Predicted:** Deponent [PREFIX_1] [FIRSTNAME_1] [LASTNAME_1], CEO of [COMPANYNAME_1], stated under oath on [DATE_1...
**PII Detection:** 1.000

### Example 3

**Input:** Employee ID: EMP-001-XYZ. Name: John Doe. Salary: $85,000. Email: john.doe@example.com. Marital Stat...
**Expected:** Employee ID: [EMPLOYEE_ID_1]. Name: [NAME_1]. Salary: [SALARY_1]. Email: [EMAIL_1]. Marital Status: ...
**Predicted:** Employee ID: EMP-[CREDITCARDCVV_1]. Name: [FIRSTNAME_1] Doe. Salary: [CURRENCYSYMBOL_1][AMOUNT_1]. E...
**PII Detection:** 1.000