---
language: he
tags:
- token-classification
- hebrew
- recipe
- ner
license: mit
---
# Hebrew Recipe Modification NER

DictaBERT-large fine-tuned to extract recipe modifications from Hebrew YouTube comments. Trained with class weighting (P1) on silver labels produced by a 3-pass LLM teacher pipeline (v2).
## Labels

- `B/I-SUBSTITUTION` – ingredient substitution
- `B/I-ADDITION` – ingredient addition
- `B/I-QUANTITY` – quantity change
- `B/I-TECHNIQUE` – technique change
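The four entity types above imply a standard BIO tag set of nine labels (including `O`). A minimal sketch of that label inventory; the authoritative `id2label` mapping ships in the model's `config.json`, so treat this as illustrative only:

```python
# BIO label set implied by the four entity types listed above.
# Illustrative only: the real id2label mapping lives in the model's config.json.
ENTITY_TYPES = ["SUBSTITUTION", "ADDITION", "QUANTITY", "TECHNIQUE"]
LABELS = ["O"] + [f"{prefix}-{etype}" for etype in ENTITY_TYPES for prefix in ("B", "I")]
print(LABELS)  # 9 labels: "O" plus B-/I- for each entity type
```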
## Usage

```python
from transformers import pipeline

pipe = pipeline(
    "token-classification",
    model="DanielDDDS/hebrew-recipe-modification-ner",
    aggregation_strategy="simple",
)
pipe("אפשר להחליף חמאה בשמן קוקוס")  # "Can you substitute coconut oil for butter?"
```
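With `aggregation_strategy="simple"`, the pipeline returns one dict per detected span (`entity_group`, `score`, `word`, `start`, `end`). A common follow-up step is dropping low-confidence spans; the sketch below shows this on a hypothetical output (the spans and scores are invented for illustration, not real model predictions):

```python
# Filter aggregated pipeline output by confidence.
# The `sample` list mimics the pipeline's output format; values are hypothetical.
def filter_spans(preds, min_score=0.5):
    """Keep only spans whose aggregated score clears the threshold."""
    return [p for p in preds if p["score"] >= min_score]

sample = [
    {"entity_group": "SUBSTITUTION", "score": 0.91, "word": "להחליף חמאה בשמן קוקוס", "start": 5, "end": 27},
    {"entity_group": "QUANTITY", "score": 0.32, "word": "קצת", "start": 30, "end": 33},
]
kept = filter_spans(sample)
print(kept)  # only the high-confidence SUBSTITUTION span survives
```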
## Performance (corrected gold test set, n=496, 38 spans)

- Exact Entity F1: 25.5%
- Relaxed Entity F1: 62.6%
- Model: DictaBERT-large + linear head, class weights (P1)
- Beats the LLM teacher on relaxed F1 (teacher: 48.4%)
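The gap between exact and relaxed F1 comes down to the matching criterion. A minimal sketch, assuming "relaxed" means any character overlap with a same-type gold span counts as a match (the card does not state the exact criterion, so this is an assumption):

```python
# Exact vs. relaxed span matching, sketched on (start, end, type) tuples.
# Assumption: "relaxed" = same entity type + any character overlap.
def overlaps(a, b):
    return a[0] < b[1] and b[0] < a[1]

def span_f1(pred, gold, relaxed=False):
    """Micro F1 over spans; exact requires identical boundaries."""
    match = lambda p, g: p[2] == g[2] and (overlaps(p, g) if relaxed else p[:2] == g[:2])
    tp_p = sum(any(match(p, g) for g in gold) for p in pred)
    tp_g = sum(any(match(p, g) for p in pred) for g in gold)
    prec = tp_p / len(pred) if pred else 0.0
    rec = tp_g / len(gold) if gold else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

gold = [(5, 27, "SUBSTITUTION")]
pred = [(5, 20, "SUBSTITUTION")]  # right type, truncated boundary
print(span_f1(pred, gold), span_f1(pred, gold, relaxed=True))  # 0.0 1.0
```

Partial-boundary predictions like this are why the relaxed score (62.6%) sits far above the exact score (25.5%).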