---
language: he
tags:
- token-classification
- hebrew
- recipe
- ner
license: mit
---
# Hebrew Recipe Modification NER

DictaBERT-large fine-tuned to extract recipe modifications from Hebrew YouTube comments. Trained with class weighting (P1) on silver labels produced by a 3-pass LLM teacher pipeline (v2).
## Labels

- `B/I-SUBSTITUTION` – ingredient substitution
- `B/I-ADDITION` – ingredient addition
- `B/I-QUANTITY` – quantity change
- `B/I-TECHNIQUE` – technique change
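The four entity types above imply a standard BIO tag set of nine labels (including `O`). A minimal sketch of that label inventory; the authoritative `id2label` mapping ships in the model's `config.json`, so treat this as illustrative only:

```python
# BIO label set implied by the four entity types listed above.
# Illustrative only: the real id2label mapping lives in the model's config.json.
ENTITY_TYPES = ["SUBSTITUTION", "ADDITION", "QUANTITY", "TECHNIQUE"]
LABELS = ["O"] + [f"{prefix}-{etype}" for etype in ENTITY_TYPES for prefix in ("B", "I")]
print(LABELS)  # 9 labels: "O" plus B-/I- for each entity type
```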
## Usage

```python
from transformers import pipeline

pipe = pipeline(
    "token-classification",
    model="DanielDDDS/hebrew-recipe-modification-ner",
    aggregation_strategy="simple",
)
pipe("אפשר להחליף חמאה בשמן קוקוס")  # "Can you substitute coconut oil for butter?"
```
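With `aggregation_strategy="simple"`, the pipeline returns one dict per detected span (`entity_group`, `score`, `word`, `start`, `end`). A common follow-up step is dropping low-confidence spans; the sketch below shows this on a hypothetical output (the spans and scores are invented for illustration, not real model predictions):

```python
# Filter aggregated pipeline output by confidence.
# The `sample` list mimics the pipeline's output format; values are hypothetical.
def filter_spans(preds, min_score=0.5):
    """Keep only spans whose aggregated score clears the threshold."""
    return [p for p in preds if p["score"] >= min_score]

sample = [
    {"entity_group": "SUBSTITUTION", "score": 0.91, "word": "להחליף חמאה בשמן קוקוס", "start": 5, "end": 27},
    {"entity_group": "QUANTITY", "score": 0.32, "word": "קצת", "start": 30, "end": 33},
]
kept = filter_spans(sample)
print(kept)  # only the high-confidence SUBSTITUTION span survives
```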
## Performance (corrected gold test set, n=496, 38 spans)

- Exact Entity F1: 25.5%
- Relaxed Entity F1: 62.6%
- Model: DictaBERT-large + linear head, class weights (P1)
- Beats the LLM teacher on relaxed F1 (teacher: 48.4%)
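The gap between exact and relaxed F1 comes down to the matching criterion. A minimal sketch, assuming "relaxed" means any character overlap with a same-type gold span counts as a match (the card does not state the exact criterion, so this is an assumption):

```python
# Exact vs. relaxed span matching, sketched on (start, end, type) tuples.
# Assumption: "relaxed" = same entity type + any character overlap.
def overlaps(a, b):
    return a[0] < b[1] and b[0] < a[1]

def span_f1(pred, gold, relaxed=False):
    """Micro F1 over spans; exact requires identical boundaries."""
    match = lambda p, g: p[2] == g[2] and (overlaps(p, g) if relaxed else p[:2] == g[:2])
    tp_p = sum(any(match(p, g) for g in gold) for p in pred)
    tp_g = sum(any(match(p, g) for p in pred) for g in gold)
    prec = tp_p / len(pred) if pred else 0.0
    rec = tp_g / len(gold) if gold else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

gold = [(5, 27, "SUBSTITUTION")]
pred = [(5, 20, "SUBSTITUTION")]  # right type, truncated boundary
print(span_f1(pred, gold), span_f1(pred, gold, relaxed=True))  # 0.0 1.0
```

Partial-boundary predictions like this are why the relaxed score (62.6%) sits far above the exact score (25.5%).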