metaphor-cat-mdeberta-weights

This model is a fine-tuned version of microsoft/mdeberta-v3-base on the Catalan metaphor detection dataset metaphor-catalan.

It achieves the following results on the evaluation set:

  • Loss: 0.5918
  • Precision: 0.5870
  • Recall: 0.6429
  • F1: 0.6136
  • Accuracy: 0.9593
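
As a quick consistency check, the reported F1 is the harmonic mean of the reported precision and recall. A minimal sketch:

```python
# Verify that the reported F1 is the harmonic mean of precision and recall:
# F1 = 2 * P * R / (P + R)
precision = 0.5870
recall = 0.6429
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # → 0.6137 (the reported 0.6136 reflects rounding of the underlying counts)
```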

Model description

This model is a multilingual DeBERTa V3 (mDeBERTa) transformer fine-tuned for token-level metaphor detection in Catalan.

The model identifies tokens that are part of metaphorical expressions using a BIO tagging scheme:

  • O – non-metaphorical token
  • B-METAPHOR – beginning of a metaphorical expression
  • I-METAPHOR – continuation of a metaphorical expression

The model was fine-tuned using class-weighted loss to address class imbalance in the dataset, where metaphor tokens are significantly less frequent than literal tokens.

This model is suitable for research and experimentation in figurative language detection, computational linguistics, and Catalan NLP tasks.
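
A minimal usage sketch with the transformers token-classification pipeline; the repo id is the one shown on this page, and the example sentence and pipeline settings are illustrative, not taken from the training script:

```python
# Hypothetical inference sketch; downloading the checkpoint requires network access.
from transformers import pipeline

nlp = pipeline(
    "token-classification",
    model="mariadelcarmenramirez/metaphor-cat-mdeberta-weights",
    aggregation_strategy="simple",  # merge B-/I- word pieces into spans
)

# "El temps és or." ("Time is gold.") — a conventional Catalan metaphor.
for span in nlp("El temps és or."):
    print(span["entity_group"], span["word"], round(span["score"], 3))
```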

Intended uses & limitations

Intended uses:

  • Detecting metaphorical expressions in Catalan text.
  • Supporting linguistic research on figurative language.
  • Assisting annotation pipelines for metaphor datasets.
  • Integrating into Catalan NLP pipelines that require semantic analysis.

Limitations:

  • The dataset is relatively small and domain-limited.
  • The model may not generalize well to unseen domains such as social media or informal text.
  • Predictions are performed at the token level, so additional processing may be required to reconstruct full metaphor spans.
  • Although class weighting helps address imbalance, metaphor detection remains a challenging task and some expressions may still be missed.
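
Reconstructing full metaphor spans from token-level BIO predictions can be sketched as follows (the tag names match the label set above; the grouping logic is a common convention, not the author's code):

```python
# Minimal sketch: merge token-level BIO tags back into metaphor spans.
def bio_to_spans(tokens, tags):
    """Group B-METAPHOR/I-METAPHOR runs into whitespace-joined spans."""
    spans, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B-METAPHOR":
            if current:                      # close a previous span
                spans.append(" ".join(current))
            current = [token]
        elif tag == "I-METAPHOR" and current:
            current.append(token)            # continue the open span
        else:                                # "O" or a stray I- tag
            if current:
                spans.append(" ".join(current))
                current = []
    if current:
        spans.append(" ".join(current))
    return spans

tokens = ["El", "temps", "és", "or", "."]
tags = ["O", "O", "O", "B-METAPHOR", "O"]
print(bio_to_spans(tokens, tags))  # → ['or']
```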

Training and evaluation data

Training dataset: metaphor-catalan

The dataset contains Catalan sentences annotated for metaphorical language using token-level labels.

Example annotation structure:

  • tokens: tokenized sentence
  • tags: BIO labels indicating metaphor spans

Label set used during training:

  • O
  • B-METAPHOR
  • I-METAPHOR

To mitigate strong label imbalance, class weights were applied during training.
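
The card does not state the exact weighting formula; inverse-frequency weighting is one common choice, sketched here on a toy label distribution:

```python
# Illustrative sketch of inverse-frequency class weights (toy counts, not
# the real dataset statistics): weight_c = total / (num_classes * count_c),
# so rare metaphor classes receive larger weights than the dominant "O" class.
from collections import Counter

tags = ["O"] * 95 + ["B-METAPHOR"] * 3 + ["I-METAPHOR"] * 2
labels = ["O", "B-METAPHOR", "I-METAPHOR"]
counts = Counter(tags)
total = sum(counts.values())
weights = [total / (len(labels) * counts[label]) for label in labels]
print([round(w, 2) for w in weights])  # → [0.35, 11.11, 16.67]
```

In PyTorch, such weights would typically be passed to the loss as `torch.nn.CrossEntropyLoss(weight=...)`.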

Training procedure

Hyperparameters

  • Learning rate: 2e-05
  • Train batch size: 8
  • Evaluation batch size: 8
  • Optimizer: AdamW (fused, betas=(0.9,0.999), epsilon=1e-08)
  • LR scheduler: linear
  • Warmup steps: 50
  • Epochs: 6
  • Mixed precision: Native AMP
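
The hyperparameters above can be expressed as a Hugging Face TrainingArguments configuration; this is a sketch using standard transformers field names, not the author's actual training script (the output directory name is assumed):

```python
# Config sketch mapping the reported hyperparameters onto TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="metaphor-cat-mdeberta-weights",  # assumed name
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=6,
    warmup_steps=50,
    lr_scheduler_type="linear",
    optim="adamw_torch_fused",  # fused AdamW, betas and epsilon at defaults
    fp16=True,                  # native AMP mixed precision
)
```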

Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
| No log        | 1.0   | 66   | 0.5138          | 0.1255    | 0.7143 | 0.2135 | 0.7356   |
| 0.7433        | 2.0   | 132  | 0.3932          | 0.2897    | 0.7381 | 0.4161 | 0.8959   |
| 0.7433        | 3.0   | 198  | 0.4049          | 0.3974    | 0.7381 | 0.5167 | 0.9306   |
| 0.2768        | 4.0   | 264  | 0.5224          | 0.5833    | 0.6667 | 0.6222 | 0.9593   |
| 0.1325        | 5.0   | 330  | 0.4865          | 0.5439    | 0.7381 | 0.6263 | 0.9557   |
| 0.1325        | 6.0   | 396  | 0.5918          | 0.5870    | 0.6429 | 0.6136 | 0.9593   |

Framework versions

  • Transformers: 4.57.3
  • PyTorch: 2.9.0+cu126
  • Datasets: 4.0.0
  • Tokenizers: 0.22.1
Model size: 0.3B parameters (F32, Safetensors)