A Named Entity Based Approach to Model Recipes
Paper: arXiv 2004.12184
This model is a fine-tuned version of xlm-roberta-base on the recipe ingredient NER dataset from the paper A Named Entity Based Approach to Model Recipes (using both the gk and ar datasets).
After the final training epoch it reaches a validation loss of 0.1169 and an F1 of 0.9672 on the evaluation set.
On the test set it obtains an F1 of 0.9615, slightly above the CRF used in the paper.
The model predicts a tag for each token in an ingredient string.
| Tag | Significance | Example |
|---|---|---|
| NAME | Name of Ingredient | salt, pepper |
| STATE | Processing State of Ingredient. | ground, thawed |
| UNIT | Measuring unit(s). | gram, cup |
| QUANTITY | Quantity associated with the unit(s). | 1, 1 1/2, 2-4 |
| SIZE | Portion sizes mentioned. | small, large |
| TEMP | Temperature applied prior to cooking. | hot, frozen |
| DF (DRY/FRESH) | Dry or fresh; assumed fresh unless stated otherwise. | dry, fresh |
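
To illustrate how the per-token tags in the table can be turned into a structured ingredient record, here is a minimal sketch. The helper `group_by_tag` is hypothetical, the `O` tag for untagged tokens is an assumption, and the tokens and tags are hand-written rather than real model output.

```python
from collections import defaultdict

# Tag names taken from the table above. Any other tag (for example an
# "O" tag for tokens outside every entity, which is an assumption) is skipped.
TAGS = {"NAME", "STATE", "UNIT", "QUANTITY", "SIZE", "TEMP", "DF"}


def group_by_tag(tokens, tags):
    """Collect tokens under their predicted tag, preserving token order."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    grouped = defaultdict(list)
    for token, tag in zip(tokens, tags):
        if tag in TAGS:
            grouped[tag].append(token)
    return {tag: " ".join(words) for tag, words in grouped.items()}


# Hand-written example, not real model output.
tokens = ["2", "cups", "frozen", "chopped", "spinach"]
tags = ["QUANTITY", "UNIT", "TEMP", "STATE", "NAME"]
parsed = group_by_tag(tokens, tags)
print(parsed)
```

Joining tokens with a space is a simplification; multi-word names such as "olive oil" come out as a single string under `NAME`.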
Both the ar (AllRecipes.com) and gk (FOOD.com) datasets were obtained from the TSV files in the authors' repository.
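
As a rough sketch of reading such a file, the snippet below assumes a CoNLL-style layout: one `token<TAB>tag` pair per line, with a blank line between ingredient phrases. That layout is an assumption and should be checked against the actual TSVs; `read_tagged_tsv` is a hypothetical helper, and the sample data is made up.

```python
import io


def read_tagged_tsv(handle):
    """Parse token<TAB>tag lines into (tokens, tags) pairs, one per phrase.

    Assumes blank lines separate ingredient phrases (an assumption about
    the file layout, not something confirmed by the dataset).
    """
    sentences = []
    tokens, tags = [], []
    for raw in handle:
        line = raw.rstrip("\r\n")
        if not line.strip():
            # A blank line closes the current phrase, if there is one.
            if tokens:
                sentences.append((tokens, tags))
                tokens, tags = [], []
            continue
        token, tag = line.split("\t")
        tokens.append(token)
        tags.append(tag)
    if tokens:
        sentences.append((tokens, tags))
    return sentences


# Made-up sample in the assumed layout.
sample = "1\tQUANTITY\ncup\tUNIT\nsugar\tNAME\n\nfresh\tDF\nbasil\tNAME\n"
sentences = read_tagged_tsv(io.StringIO(sample))
print(sentences)
```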
It follows the overall procedure from Chapter 4 of Natural Language Processing with Transformers by Tunstall, von Werra and Wolf.
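
One step in that kind of token-classification workflow is aligning word-level labels with subword tokens: the first subword of each word keeps the word's label, and special tokens and continuation subwords get an ignore index. The sketch below shows the idea in plain Python. It is not the training notebook's code; `align_labels` is a hypothetical helper, and the label ids are made up.

```python
# Label value that PyTorch's cross-entropy loss ignores by default.
IGNORE_INDEX = -100


def align_labels(word_ids, word_labels):
    """Map word-level label ids onto subword tokens.

    word_ids has one entry per subword token: the index of the word it
    came from, or None for special tokens (the shape returned by a fast
    tokenizer's BatchEncoding.word_ids()).
    """
    aligned = []
    previous = None
    for word_id in word_ids:
        if word_id is None or word_id == previous:
            # Special token or continuation subword: excluded from the loss.
            aligned.append(IGNORE_INDEX)
        else:
            # First subword of a word: carries the word's label.
            aligned.append(word_labels[word_id])
        previous = word_id
    return aligned


# Three words, the second split into two subwords, wrapped in special tokens.
word_ids = [None, 0, 1, 1, 2, None]
word_labels = [3, 2, 0]  # made-up label ids
aligned = align_labels(word_ids, word_labels)
print(aligned)  # [-100, 3, 2, -100, 0, -100]
```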
See the training notebook for details.
The following hyperparameters were used during training:
| Training Loss | Epoch | Step | Validation Loss | F1 |
|---|---|---|---|---|
| 0.2529 | 1.0 | 331 | 0.1303 | 0.9592 |
| 0.1164 | 2.0 | 662 | 0.1224 | 0.9640 |
| 0.0904 | 3.0 | 993 | 0.1156 | 0.9671 |
| 0.0585 | 4.0 | 1324 | 0.1169 | 0.9672 |