Improve model card: add paper, code links and citation
#2
by nielsr HF Staff - opened
README.md
CHANGED
|
@@ -1,20 +1,22 @@
|
|
| 1 |
---
|
| 2 |
language:
|
| 3 |
- am
|
|
|
|
| 4 |
license: mit
|
| 5 |
tags:
|
| 6 |
- sentence-transformers
|
| 7 |
- cross-encoder
|
| 8 |
- generated_from_trainer
|
| 9 |
- dataset_size:491752
|
| 10 |
- loss:BinaryCrossEntropyLoss
|
| 11 |
-
base_model: rasyosef/roberta-medium-amharic
|
| 12 |
-
pipeline_tag: text-ranking
|
| 13 |
-
library_name: sentence-transformers
|
| 14 |
-
metrics:
|
| 15 |
-
- map
|
| 16 |
-
- mrr@10
|
| 17 |
-
- ndcg@10
|
| 18 |
model-index:
|
| 19 |
- name: roberta-amharic-reranker-medium
|
| 20 |
results:
|
|
@@ -31,32 +33,32 @@ model-index:
|
|
| 31 |
- type: ndcg@10
|
| 32 |
value: 0.835
|
| 33 |
name: Ndcg@10
|
| 34 |
-
|
| 35 |
-
datasets:
|
| 36 |
-
- rasyosef/Amharic-Passage-Retrieval-Dataset-V2
|
| 37 |
---
|
| 38 |
|
| 39 |
# reranker-amharic-medium
|
| 40 |
|
| 41 |
This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [rasyosef/roberta-medium-amharic](https://huggingface.co/rasyosef/roberta-medium-amharic) using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
|
| 42 |
|
|
| 43 |
## Model Details
|
| 44 |
|
| 45 |
### Model Description
|
| 46 |
- **Model Type:** Cross Encoder
|
| 47 |
-
- **Base model:** [rasyosef/roberta-medium-amharic](https://huggingface.co/rasyosef/roberta-medium-amharic)
|
| 48 |
- **Maximum Sequence Length:** 510 tokens
|
| 49 |
- **Number of Output Labels:** 1 label
|
| 50 |
-
|
| 51 |
-
- **
|
| 52 |
-
- **License:** mit
|
| 53 |
|
| 54 |
### Model Sources
|
| 55 |
|
| 56 |
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
|
| 57 |
- **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
|
| 58 |
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
|
| 59 |
-
- **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)
|
| 60 |
|
| 61 |
## Usage
|
| 62 |
|
|
@@ -74,10 +76,11 @@ from sentence_transformers import CrossEncoder
|
|
| 74 |
|
| 75 |
# Download from the 🤗 Hub
|
| 76 |
model = CrossEncoder("rasyosef/reranker-amharic-medium")
|
|
|
|
| 77 |
# Get scores for pairs of texts
|
| 78 |
pairs = [
|
| 79 |
-
['ለውጭ ገበያ በሚቀርበው የኢትዮጵያ ቡና ላይ የተጋረጠው ፈተና', 'የኢትዮጵያ ዋነኛ የውጭ ምንዛሬ ምንጭ የሆነው ወደ ውጭ የሚላክ ቡና ዘርፍ በአሁኑ ጊዜ ከፍተኛ ውጥረት ውስጥ ገብቷል።
|
| 80 |
-
['ለውጭ ገበያ በሚቀርበው የኢትዮጵያ ቡና ላይ የተጋረጠው ፈተና', 'የቻይናው ፕሬዝዳንት ዚ ጂንፒንግ ከትራምፕ ጋር ባደረጉት ጉባኤ ትኩረታቸው በሁለቱ ሀገራት መካከል ለወራት ከተፈጠረ ውጥረት እና የንግድ ጦርነት በኋላ የተረገጋጋ ግንኙነትን ማስቀጠል ነበር።
|
| 81 |
]
|
| 82 |
scores = model.predict(pairs)
|
| 83 |
print(scores.shape)
|
|
@@ -87,37 +90,14 @@ print(scores.shape)
|
|
| 87 |
ranks = model.rank(
|
| 88 |
'ለውጭ ገበያ በሚቀርበው የኢትዮጵያ ቡና ላይ የተጋረጠው ፈተና',
|
| 89 |
[
|
| 90 |
-
'የኢትዮጵያ ዋነኛ የውጭ ምንዛሬ ምንጭ የሆነው ወደ ውጭ የሚላክ ቡና ዘርፍ በአሁኑ ጊዜ ከፍተኛ ውጥረት ውስጥ ገብቷል።
|
| 91 |
-
'የቻይናው ፕሬዝዳንት ዚ ጂንፒንግ ከትራምፕ ጋር ባደረጉት ጉባኤ ትኩረታቸው በሁለቱ ሀገራት መካከል ለወራት ከተፈጠረ ውጥረት እና የንግድ ጦርነት በኋላ የተረገጋጋ ግንኙነትን ማስቀጠል ነበር።
|
| 92 |
]
|
| 93 |
)
|
| 94 |
-
|
|
|
|
| 95 |
```
|
| 96 |
|
| 97 |
-
<!--
|
| 98 |
-
### Direct Usage (Transformers)
|
| 99 |
-
|
| 100 |
-
<details><summary>Click to see the direct usage in Transformers</summary>
|
| 101 |
-
|
| 102 |
-
</details>
|
| 103 |
-
-->
|
| 104 |
-
|
| 105 |
-
<!--
|
| 106 |
-
### Downstream Usage (Sentence Transformers)
|
| 107 |
-
|
| 108 |
-
You can finetune this model on your own dataset.
|
| 109 |
-
|
| 110 |
-
<details><summary>Click to expand</summary>
|
| 111 |
-
|
| 112 |
-
</details>
|
| 113 |
-
-->
|
| 114 |
-
|
| 115 |
-
<!--
|
| 116 |
-
### Out-of-Scope Use
|
| 117 |
-
|
| 118 |
-
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
|
| 119 |
-
-->
|
| 120 |
-
|
| 121 |
## Evaluation
|
| 122 |
|
| 123 |
### Metrics
|
|
@@ -137,39 +117,16 @@ You can finetune this model on your own dataset.
|
|
| 137 |
| mrr@10 | 0.805 |
|
| 138 |
| **ndcg@10** | **0.835** |
|
| 139 |
|
| 140 |
-
<!--
|
| 141 |
-
## Bias, Risks and Limitations
|
| 142 |
-
|
| 143 |
-
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
|
| 144 |
-
-->
|
| 145 |
-
|
| 146 |
-
<!--
|
| 147 |
-
### Recommendations
|
| 148 |
-
|
| 149 |
-
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
|
| 150 |
-
-->
|
| 151 |
-
|
| 152 |
## Training Details
|
| 153 |
|
| 154 |
<details>
|
| 155 |
|
| 156 |
### Training Dataset
|
| 157 |
|
| 158 |
-
####
|
| 159 |
|
| 160 |
* Size: 491,752 training samples
|
| 161 |
* Columns: <code>query</code>, <code>passage</code>, and <code>label</code>
|
| 162 |
-
* Approximate statistics based on the first 1000 samples:
|
| 163 |
-
| | query | passage | label |
|
| 164 |
-
|:--------|:-----------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------|:------------------------------------------------|
|
| 165 |
-
| type | string | string | int |
|
| 166 |
-
| details | <ul><li>min: 2 characters</li><li>mean: 49.94 characters</li><li>max: 283 characters</li></ul> | <ul><li>min: 126 characters</li><li>mean: 1418.88 characters</li><li>max: 8678 characters</li></ul> | <ul><li>0: ~87.40%</li><li>1: ~12.60%</li></ul> |
|
| 167 |
-
* Samples:
|
| 168 |
-
| query | passage | label |
|
| 169 |
-
|:------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
|
| 170 |
-
| <code>በባሌ፣ ቦረና እና ጉጂ ዞኖች የተከሰተውን የበርሃ አንበጣ ለመከላከል ተጨማሪ አውሮፕላኖች ወደ ስፍራው ይሰማራሉ</code> | <code>አዲስ አበባ ፣ ታህሳስ 27 ፣ 2012 (ኤፍ ቢ ሲ) የጃፓኑ ጠቅላይ ሚኒስትር ሺንዞ አቤ በመካከለኛው ምስራቅ ሃይል የማስፈር እቅድ እንዳላቸው በድጋሚ ገለጹ።ጠቅላይ ሚኒስትሩ በአካባቢው የሚንቀሳቀሱ የጃፓን መርከቦችን ደህንነት ለማረጋገጥ በስፍራው ሃይል የማስፈር እቅድ እንዳላቸው ገልጸዋል።ባለፈው ወር ጃፓን ወደ መካከለኛው ምስራቅ የጦር መርከቦችን እና ቃኝ አውሮፕላኖችን እንደምትልክ ገልጻ ነበር።የሃገሪቱ መከላከያ ሚኒስቴርም ቃኝ አውሮፕላኖቹ በተያዘው የፈረንጆቹ ጥር ወር ወደ ስፍራው እንደሚያቀኑ ገልጿል።የካቲት ወር ላይ ደግሞ የጦር መርከቦችን ወደ ስፍራው አንቀሳቅሳለሁ ብሏል።የአሁኑ የቶኪዮ እቅድ በመካከለኛው ምስራቅ የባህር ክልል የሚንቀሳቀሱ የጃፓን መርከቦችን ከጥቃት ለመከላከልና ደህንነታቸውን ለማረጋገጥ ያለመ ነው ተብሏል።አቤ በንግግራቸው በመካከለኛው ምስራቅ ያለው ወቅታዊ ሁኔታ እንዳሳሰባቸው ጠቅሰው፥ ሃገራትም አላስፈላጊ ውጥረትን እንዲያስወግዱ ጥሪ አቅርበዋል።አሜሪካ ባለፈው ዓርብ የኢራን ብሄራዊ አብዮት ዘብ ጠባቂ ሃይል አዛዥን በባግዳድ አውሮፕላን ማረፊያ ከገደለች በኋላ በመካከለኛው ምስራቅ ውጥረት ነግሷል።ኢራን ለአሜሪካ እርምጃ ከባድ አፀፋዊ ምላሽ እሰጣለሁ ስትል፥ የአሜሪካው ፕሬዚዳንት ዶናልድ ትራምፕም አሜሪካ የከፋ እርምጃ እንደምትወስድ አስጠንቅቀዋል።ምንጭ፦ ሬውተርስ</code> | <code>0</code> |
|
| 171 |
-
| <code>ወጣቱ ምንጫቸው ባልተረጋገጠ የማኅበራዊ ሚዲያ መረጃዎች ላይ በመጠመዱ የንባብ ባህሉ መቀነሱን የእንጅባራ ከተማ ነዋሪዎቸ ተናገሩ፡፡</code> | <code>ባሕር ዳር፡ ግንቦት 21/2012 ዓ.ም (አብመድ) የኮሮና ቫይረስ ወረርሽኝ የትምህርት ተቋማት ተማሪዎቻቸውን እንዲበትኑ አስገድዷቸዋል፡፡ተማሪዎቹን ከትምህርት ገበታቸው ማስተጓጎሉ አሉታዊ ተፅዕኖው የከፋ ቢሆንም ስለወረርሽኑ ግንዘቤ በመፍጠር ረገድ ወደ መልካም ዕድል እየቀየሩት ያሉ አሉ፡፡ወደ ሰሜን ሸዋ ዞን በረኸት ወረዳ ባቀናንበት ወቅት ያገኘናቸው ከተለያዩ የሀገሪቱ አቅጣጫዎች ወደ ቤተሰቦቻቸው የተመለሱ ተማሪዎች እጃቸውን አጣጥፈው አልተቀመጡም፡፡ ተማሪዎቹ ለኅብረተሰቡ ስለኮሮና ቫይረስ ወረርሽኝ የሚያወቁትን እያሳወቁ ነው፡፡ተማሪ ሄኖክ ወርቁ በወላይታ ሶዶ ዩኒቨርሲቲ የሦስተኛ ዓመት የጋዜጠኝነት እና ሥነ ተግባቦት ትምህርት ክፍል ተማሪ ነው፡፡ ሄኖክ ወደ ትውልድ ቀዬው ከተመለሰ ጊዜ ጀምሮ የተለያዩ የመገናኛ ዘዴዎችን በመጠቀም ስለኮሮና ቫይረስ ወረርሽኝ ቅድመ መከላከል ከመንግሥት እና ከጤና ባለሙያዎች የሚወጡ መልእክቶችን ለኅብረተሰቡ እያስገነዘበ ነው፡፡ የግንዛቤ ፈጠራውን በ‘ሚኒ ሚዲያ’፣ በገበያ እና ሰዎች በሚሰባሰቡባቸው ቦታዎች በመገኘት ከጓደኞቹ ጋር እንደሚሠሩም ተናግሯል፡፡ ከግንዛቤ ፈጠራ ጎን ለጎን ደግሞ የዚህ ዓመት ተመራቂ ተማሪ እንደመሆኑ መጠን ጥናታዊ ጽሑፉን እየሠራ ጊዜውን በአግባባቡ እየተጠቀመ እንደሚገኝ ገልጿል፡፡ሌላኛው ያነጋገርነው ተማሪ አብርሃም ገብረኪዳን በወላይታ ሶዶ ዩኒቨርሲቲ ሦስተኛ ዓመት የሕግ ተማሪ ነው፡፡ ኅብረተሰቡ ለኮሮና ቫይረስ ወረርሽኝ እንዳይጋለጥ ሰፈር ለሰፈር፣ በገበያ ቀን ከወረዳው መዲና መተህብላ ከተማ መግቢያና መውጫ አካባቢዎች እጅ እንዲታጠቡ ከማድረግ ጀምሮ የወረርሽኙን ቅድመ መከላከል መልእክቶች በድምጽ ማጉያ (ሞንታርቦ) ተጠቅመው እያስተላለፉ እንደሆነ ተናግሯል፡፡ ተማሪዎቹ በሚያደርጉት የቅስቀሳ ግንዛቤ ማስጨበጫ ሥ...</code> | <code>0</code> |
|
| 172 |
-
| <code>አዳማ ከተማ ከ ኢትዮጵያ ቡና – ቀጥታ የፅሁፍ ስርጭት</code> | <code>79′ አዲስ ግደይተጠናቀቀ!ጨዋታው በሲዳማ ቡና አሸናፊነት ተጠናቀቀ፡፡ ሲዳ በድቻ ላይ ያለውን የበላይነት ሲያከብር ዘንድሮ በሜዳው ያለውን 100% ሪኮርድም አስጠብቋል፡፡ተጨማሪ ደቂቃ – 4 ደቂቃቢጫ ካርድ88′ ዳግም በቀለ አዲስ ግደይ ላይ በሰራው ጥፋት ቢጫ ካርድ ተመልክቷል፡፡ በሁኔታውም ለአለም ብርሃኑ አላስፈላጊ ድርጊት በመፈፀሙ ቢጫ ተመልክቷል፡፡84′ ዳግም በቀለ ከማዕዘን የተሻማውን ኳስ በግንባሩ ገጭቶ ለጥቂት ወጣበት፡፡ የሚያስቆጭ አጋጣሚ !የተጫዋቸ ለውጥ – ሲዳማ ቡና81′ በረከት አዲሱ ወጥቶ ሙጃይድ መሃመድ ገብቷል፡፡የተጫዋች ለውጥ – ወላይታ ድቻ አናጋው ባደግ ወጥቶ አብዱልሰመድ አሊ ገብቷል፡፡ጎልልል!!! ሲዳማ ቡና79′ አዲስ ግደይ ከኤሪክ ሙራንዳ የተሻገረለትን ኳስ በግንባሩ ገጭቶ ወደ ግብነት በመቀየር ሲዳማን መሪ አድርጓል፡፡77′ በዛብህ መለዮ ከርቀት በግራ እግሩ መሬት ለመሬት አክርሮ የመታው ኳስ ለጥቂት ወጣ፡፡<br>የተጫዋች ለውጥ – ወላይታ ድቻ 71′ ቴዎድሮስ መንገሻ ወጥሆ ዳግም በቀለ ገብቷል፡፡<br>የተጫዋች ለውጥ – ሲዳማ ቡና71′ አንተነህ ተስፋዬ በጉዳት ወጥቶ ላኪም ሳኒ ገብቷል፡፡65′ በድጋሚ ከመስመር የተሻገረውን ኳስ ኤሪክ ሙራዳ በግንባሩ ገጭቶ የግቡ አግዳሚ መልሶበታል፡፡ ሲዳማ ቡና ጫና ፈጥሮ በማጥቃት ላይ ይገኛል፡፡63′ ከግራ መስመር ወሰኑ ማዜ ያሻማውን ኳስ አዲስ ግደይ በግንባሩ ገጭቶ የግቡን አግዳሚ ታኮ ወጥቷል፡፡የተጫዋች ለውጥ – ወላይታ ድቻ 60′ አማኑኤል ተሾመ ወጥቶ መሳይ አጪሶ ገብቷል፡፡53′ አናጋው ባደግ ከግራ መስመር ያሻገረውን ኳስ በዛብህ መለዮ አገባው ሲባል በግቡ አናት ሰደደው፡፡ የሚያስቆጭ አጋጣሚ!የተጫዋች ለውጥ – ሲዳማ<br>46′ ግሩም አሰፋ ወጥቶ ኤሪክ ሙራንዳ ገብቷል፡፡<br>ተጀመረ!<br>ሁለተኛው አጋማሽ የጨዋታ...</code> | <code>0</code> |
|
| 173 |
* Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
|
| 174 |
```json
|
| 175 |
{
|
|
@@ -193,134 +150,13 @@ You can finetune this model on your own dataset.
|
|
| 193 |
- `load_best_model_at_end`: True
|
| 194 |
- `batch_sampler`: no_duplicates
|
| 195 |
|
| 196 |
-
#### All Hyperparameters
|
| 197 |
-
<details><summary>Click to expand</summary>
|
| 198 |
-
|
| 199 |
-
- `overwrite_output_dir`: False
|
| 200 |
-
- `do_predict`: False
|
| 201 |
-
- `eval_strategy`: epoch
|
| 202 |
-
- `prediction_loss_only`: True
|
| 203 |
-
- `per_device_train_batch_size`: 64
|
| 204 |
-
- `per_device_eval_batch_size`: 64
|
| 205 |
-
- `per_gpu_train_batch_size`: None
|
| 206 |
-
- `per_gpu_eval_batch_size`: None
|
| 207 |
-
- `gradient_accumulation_steps`: 1
|
| 208 |
-
- `eval_accumulation_steps`: None
|
| 209 |
-
- `torch_empty_cache_steps`: None
|
| 210 |
-
- `learning_rate`: 4e-05
|
| 211 |
-
- `weight_decay`: 0.0
|
| 212 |
-
- `adam_beta1`: 0.9
|
| 213 |
-
- `adam_beta2`: 0.999
|
| 214 |
-
- `adam_epsilon`: 1e-08
|
| 215 |
-
- `max_grad_norm`: 1.0
|
| 216 |
-
- `num_train_epochs`: 4
|
| 217 |
-
- `max_steps`: -1
|
| 218 |
-
- `lr_scheduler_type`: cosine
|
| 219 |
-
- `lr_scheduler_kwargs`: {}
|
| 220 |
-
- `warmup_ratio`: 0.05
|
| 221 |
-
- `warmup_steps`: 0
|
| 222 |
-
- `log_level`: passive
|
| 223 |
-
- `log_level_replica`: warning
|
| 224 |
-
- `log_on_each_node`: True
|
| 225 |
-
- `logging_nan_inf_filter`: True
|
| 226 |
-
- `save_safetensors`: True
|
| 227 |
-
- `save_on_each_node`: False
|
| 228 |
-
- `save_only_model`: False
|
| 229 |
-
- `restore_callback_states_from_checkpoint`: False
|
| 230 |
-
- `no_cuda`: False
|
| 231 |
-
- `use_cpu`: False
|
| 232 |
-
- `use_mps_device`: False
|
| 233 |
-
- `seed`: 42
|
| 234 |
-
- `data_seed`: None
|
| 235 |
-
- `jit_mode_eval`: False
|
| 236 |
-
- `use_ipex`: False
|
| 237 |
-
- `bf16`: False
|
| 238 |
-
- `fp16`: True
|
| 239 |
-
- `fp16_opt_level`: O1
|
| 240 |
-
- `half_precision_backend`: auto
|
| 241 |
-
- `bf16_full_eval`: False
|
| 242 |
-
- `fp16_full_eval`: False
|
| 243 |
-
- `tf32`: None
|
| 244 |
-
- `local_rank`: 0
|
| 245 |
-
- `ddp_backend`: None
|
| 246 |
-
- `tpu_num_cores`: None
|
| 247 |
-
- `tpu_metrics_debug`: False
|
| 248 |
-
- `debug`: []
|
| 249 |
-
- `dataloader_drop_last`: False
|
| 250 |
-
- `dataloader_num_workers`: 2
|
| 251 |
-
- `dataloader_prefetch_factor`: None
|
| 252 |
-
- `past_index`: -1
|
| 253 |
-
- `disable_tqdm`: False
|
| 254 |
-
- `remove_unused_columns`: True
|
| 255 |
-
- `label_names`: None
|
| 256 |
-
- `load_best_model_at_end`: True
|
| 257 |
-
- `ignore_data_skip`: False
|
| 258 |
-
- `fsdp`: []
|
| 259 |
-
- `fsdp_min_num_params`: 0
|
| 260 |
-
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
|
| 261 |
-
- `fsdp_transformer_layer_cls_to_wrap`: None
|
| 262 |
-
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
|
| 263 |
-
- `deepspeed`: None
|
| 264 |
-
- `label_smoothing_factor`: 0.0
|
| 265 |
-
- `optim`: adamw_torch
|
| 266 |
-
- `optim_args`: None
|
| 267 |
-
- `adafactor`: False
|
| 268 |
-
- `group_by_length`: False
|
| 269 |
-
- `length_column_name`: length
|
| 270 |
-
- `ddp_find_unused_parameters`: None
|
| 271 |
-
- `ddp_bucket_cap_mb`: None
|
| 272 |
-
- `ddp_broadcast_buffers`: False
|
| 273 |
-
- `dataloader_pin_memory`: True
|
| 274 |
-
- `dataloader_persistent_workers`: False
|
| 275 |
-
- `skip_memory_metrics`: True
|
| 276 |
-
- `use_legacy_prediction_loop`: False
|
| 277 |
-
- `push_to_hub`: False
|
| 278 |
-
- `resume_from_checkpoint`: None
|
| 279 |
-
- `hub_model_id`: None
|
| 280 |
-
- `hub_strategy`: every_save
|
| 281 |
-
- `hub_private_repo`: None
|
| 282 |
-
- `hub_always_push`: False
|
| 283 |
-
- `gradient_checkpointing`: False
|
| 284 |
-
- `gradient_checkpointing_kwargs`: None
|
| 285 |
-
- `include_inputs_for_metrics`: False
|
| 286 |
-
- `include_for_metrics`: []
|
| 287 |
-
- `eval_do_concat_batches`: True
|
| 288 |
-
- `fp16_backend`: auto
|
| 289 |
-
- `push_to_hub_model_id`: None
|
| 290 |
-
- `push_to_hub_organization`: None
|
| 291 |
-
- `mp_parameters`:
|
| 292 |
-
- `auto_find_batch_size`: False
|
| 293 |
-
- `full_determinism`: False
|
| 294 |
-
- `torchdynamo`: None
|
| 295 |
-
- `ray_scope`: last
|
| 296 |
-
- `ddp_timeout`: 1800
|
| 297 |
-
- `torch_compile`: False
|
| 298 |
-
- `torch_compile_backend`: None
|
| 299 |
-
- `torch_compile_mode`: None
|
| 300 |
-
- `include_tokens_per_second`: False
|
| 301 |
-
- `include_num_input_tokens_seen`: False
|
| 302 |
-
- `neftune_noise_alpha`: None
|
| 303 |
-
- `optim_target_modules`: None
|
| 304 |
-
- `batch_eval_metrics`: False
|
| 305 |
-
- `eval_on_start`: False
|
| 306 |
-
- `use_liger_kernel`: False
|
| 307 |
-
- `eval_use_gather_object`: False
|
| 308 |
-
- `average_tokens_across_devices`: False
|
| 309 |
-
- `prompts`: None
|
| 310 |
-
- `batch_sampler`: no_duplicates
|
| 311 |
-
- `multi_dataset_batch_sampler`: proportional
|
| 312 |
-
|
| 313 |
-
</details>
|
| 314 |
-
|
| 315 |
### Training Logs
|
| 316 |
| Epoch | Step | Training Loss | amh-passage-retrieval-dev_ndcg@10 |
|
| 317 |
|:-------:|:---------:|:-------------:|:---------------------------------:|
|
| 318 |
-
| -1 | -1 | - | 0.0898 |
|
| 319 |
| 1.0 | 7684 | 0.4048 | 0.8289 |
|
| 320 |
| 2.0 | 15368 | 0.2366 | 0.8546 |
|
| 321 |
| 3.0 | 23052 | 0.1588 | 0.8353 |
|
| 322 |
| **4.0** | **30736** | **0.1024** | **0.8551** |
|
| 323 |
-
| -1 | -1 | - | 0.8579 |
|
| 324 |
|
| 325 |
* The bold row denotes the saved checkpoint.
|
| 326 |
|
|
@@ -337,21 +173,11 @@ You can finetune this model on your own dataset.
|
|
| 337 |
|
| 338 |
## Citation
|
| 339 |
|
| 340 |
-
|
| 341 |
-
|
| 342 |
-
|
| 343 |
-
|
| 344 |
-
|
| 345 |
-
|
| 346 |
-
|
| 347 |
-
|
| 348 |
-
## Model Card Authors
|
| 349 |
-
|
| 350 |
-
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
|
| 351 |
-
-->
|
| 352 |
-
|
| 353 |
-
<!--
|
| 354 |
-
## Model Card Contact
|
| 355 |
-
|
| 356 |
-
*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
|
| 357 |
-
-->
|
|
|
|
| 1 |
---
|
| 2 |
+
base_model: rasyosef/roberta-medium-amharic
|
| 3 |
+
datasets:
|
| 4 |
+
- rasyosef/Amharic-Passage-Retrieval-Dataset-V2
|
| 5 |
language:
|
| 6 |
- am
|
| 7 |
+
library_name: sentence-transformers
|
| 8 |
license: mit
|
| 9 |
+
metrics:
|
| 10 |
+
- map
|
| 11 |
+
- mrr@10
|
| 12 |
+
- ndcg@10
|
| 13 |
+
pipeline_tag: text-ranking
|
| 14 |
tags:
|
| 15 |
- sentence-transformers
|
| 16 |
- cross-encoder
|
| 17 |
- generated_from_trainer
|
| 18 |
- dataset_size:491752
|
| 19 |
- loss:BinaryCrossEntropyLoss
|
| 20 |
model-index:
|
| 21 |
- name: roberta-amharic-reranker-medium
|
| 22 |
results:
|
|
|
|
| 33 |
- type: ndcg@10
|
| 34 |
value: 0.835
|
| 35 |
name: Ndcg@10
|
| 36 |
---
|
| 37 |
|
| 38 |
# reranker-amharic-medium
|
| 39 |
|
| 40 |
This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [rasyosef/roberta-medium-amharic](https://huggingface.co/rasyosef/roberta-medium-amharic) using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
|
| 41 |
|
| 42 |
+
This model is part of the research presented in the paper **"The Multilingual Curse at the Retrieval Layer: Evidence from Amharic"**.
|
| 43 |
+
|
| 44 |
+
- **Paper:** [The Multilingual Curse at the Retrieval Layer: Evidence from Amharic](https://huggingface.co/papers/2605.24556)
|
| 45 |
+
- **Code:** [https://github.com/rasyosef/amharic-neural-ir](https://github.com/rasyosef/amharic-neural-ir)
|
| 46 |
+
|
| 47 |
## Model Details
|
| 48 |
|
| 49 |
### Model Description
|
| 50 |
- **Model Type:** Cross Encoder
|
| 51 |
+
- **Base model:** [rasyosef/roberta-medium-amharic](https://huggingface.co/rasyosef/roberta-medium-amharic)
|
| 52 |
- **Maximum Sequence Length:** 510 tokens
|
| 53 |
- **Number of Output Labels:** 1 label
|
| 54 |
+
- **Language:** Amharic (am)
|
| 55 |
+
- **License:** MIT
|
|
|
|
| 56 |
|
| 57 |
### Model Sources
|
| 58 |
|
| 59 |
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
|
| 60 |
- **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
|
| 61 |
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
|
|
|
|
| 62 |
|
| 63 |
## Usage
|
| 64 |
|
|
|
|
| 76 |
|
| 77 |
# Download from the 🤗 Hub
|
| 78 |
model = CrossEncoder("rasyosef/reranker-amharic-medium")
|
| 79 |
+
|
| 80 |
# Get scores for pairs of texts
|
| 81 |
pairs = [
|
| 82 |
+
['ለውጭ ገበያ በሚቀርበው የኢትዮጵያ ቡና ላይ የተጋረጠው ፈተና', 'የኢትዮጵያ ዋነኛ የውጭ ምንዛሬ ምንጭ የሆነው ወደ ውጭ የሚላክ ቡና ዘርፍ በአሁኑ ጊዜ ከፍተኛ ውጥረት ውስጥ ገብቷል።'],
|
| 83 |
+
['ለውጭ ገበያ በሚቀርበው የኢትዮጵያ ቡና ላይ የተጋረጠው ፈተና', 'የቻይናው ፕሬዝዳንት ዚ ጂንፒንግ ከትራምፕ ጋር ባደረጉት ጉባኤ ትኩረታቸው በሁለቱ ሀገራት መካከል ለወራት ከተፈጠረ ውጥረት እና የንግድ ጦርነት በኋላ የተረገጋጋ ግንኙነትን ማስቀጠል ነበር።']
|
| 84 |
]
|
| 85 |
scores = model.predict(pairs)
|
| 86 |
print(scores.shape)
|
|
|
|
| 90 |
ranks = model.rank(
|
| 91 |
'ለውጭ ገበያ በሚቀርበው የኢትዮጵያ ቡና ላይ የተጋረጠው ፈተና',
|
| 92 |
[
|
| 93 |
+
'የኢትዮጵያ ዋነኛ የውጭ ምንዛሬ ምንጭ የሆነው ወደ ውጭ የሚላክ ቡና ዘርፍ በአሁኑ ጊዜ ከፍተኛ ውጥረት ውስጥ ገብቷል።',
|
| 94 |
+
'የቻይናው ፕሬዝዳንት ዚ ጂንፒንግ ከትራምፕ ጋር ባደረጉት ጉባኤ ትኩረታቸው በሁለቱ ሀገራት መካከል ለወራት ከተፈጠረ ውጥረት እና የንግድ ጦርነት በኋላ የተረገጋጋ ግንኙነትን ማስቀጠል ነበር።',
|
| 95 |
]
|
| 96 |
)
|
| 97 |
+
print(ranks)
|
| 98 |
+
# [{'corpus_id': 0, 'score': ...}, {'corpus_id': 1, 'score': ...}]
|
| 99 |
```
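A cross-encoder such as this one is usually applied to a short candidate list produced by a faster first-stage retriever. The sketch below wraps `CrossEncoder.rank` in a small helper that returns the best candidates with their scores. The names `rerank` and `demo` are hypothetical and only for illustration, and the candidate passages are assumed to come from an upstream retriever that is not shown here.

```python
from typing import List, Tuple


def rerank(model, query: str, candidates: List[str], top_k: int = 3) -> List[Tuple[str, float]]:
    """Return the top_k candidates as (text, score) pairs, best first.

    `model` is any object exposing the CrossEncoder.rank interface.
    """
    # return_documents=True adds the passage text to each result dict.
    hits = model.rank(query, candidates, top_k=top_k, return_documents=True)
    return [(hit["text"], float(hit["score"])) for hit in hits]


def demo() -> None:
    """Hypothetical driver: loads the real model (requires a download)."""
    # Imported here so that `rerank` itself has no hard dependency.
    from sentence_transformers import CrossEncoder

    model = CrossEncoder("rasyosef/reranker-amharic-medium")
    query = 'ለውጭ ገበያ በሚቀርበው የኢትዮጵያ ቡና ላይ የተጋረጠው ፈተና'
    candidates = [
        'የኢትዮጵያ ዋነኛ የውጭ ምንዛሬ ምንጭ የሆነው ወደ ውጭ የሚላክ ቡና ዘርፍ በአሁኑ ጊዜ ከፍተኛ ውጥረት ውስጥ ገብቷል።',
        'የቻይናው ፕሬዝዳንት ዚ ጂንፒንግ ከትራምፕ ጋር ባደረጉት ጉባኤ ትኩረታቸው በሁለቱ ሀገራት መካከል ለወራት ከተፈጠረ ውጥረት እና የንግድ ጦርነት በኋላ የተረገጋጋ ግንኙነትን ማስቀጠል ነበር።',
    ]
    for text, score in rerank(model, query, candidates, top_k=2):
        print(f"{score:.4f}  {text}")
```

Calling `demo()` runs the helper against the published checkpoint. Keeping the model as a parameter of `rerank` makes it easy to substitute a stub in tests or a different reranker later.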
|
| 100 |
|
|
| 101 |
## Evaluation
|
| 102 |
|
| 103 |
### Metrics
|
|
|
|
| 117 |
| mrr@10 | 0.805 |
|
| 118 |
| **ndcg@10** | **0.835** |
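For readers unfamiliar with the reported metrics, the following is a minimal sketch of how reciprocal rank, average precision, and NDCG can be computed for a single query. It assumes binary relevance labels listed in ranked order; the function names are hypothetical, and the evaluator that produced the numbers above may differ in details such as tie handling or normalization. The reported values would correspond to averages of such per-query scores.

```python
import math
from typing import List


def reciprocal_rank_at_k(relevance: List[int], k: int = 10) -> float:
    """1 / rank of the first relevant item within the top k, else 0."""
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def average_precision(relevance: List[int]) -> float:
    """Mean of precision@rank taken at each relevant position."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def ndcg_at_k(relevance: List[int], k: int = 10) -> float:
    """DCG of the ranking divided by the DCG of the ideal ordering."""
    def dcg(rels: List[int]) -> float:
        return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(rels, start=1))

    ideal = dcg(sorted(relevance, reverse=True)[:k])
    return dcg(relevance[:k]) / ideal if ideal else 0.0


# Relevance of the passages in the order the reranker returned them.
ranked = [0, 1, 0, 1]
print(reciprocal_rank_at_k(ranked))   # 0.5, first relevant passage at rank 2
print(average_precision(ranked))      # 0.5, from (1/2 + 2/4) / 2
print(round(ndcg_at_k(ranked), 3))
```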
|
| 119 |
|
|
| 120 |
## Training Details
|
| 121 |
|
| 122 |
<details>
|
| 123 |
|
| 124 |
### Training Dataset
|
| 125 |
|
| 126 |
+
#### Amharic Passage Retrieval Dataset V2
|
| 127 |
|
| 128 |
* Size: 491,752 training samples
|
| 129 |
* Columns: <code>query</code>, <code>passage</code>, and <code>label</code>
|
| 130 |
* Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
|
| 131 |
```json
|
| 132 |
{
|
|
|
|
| 150 |
- `load_best_model_at_end`: True
|
| 151 |
- `batch_sampler`: no_duplicates
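
As a rough illustration of the `BinaryCrossEntropyLoss` objective named in the training dataset section, the sketch below computes binary cross-entropy for a single (query, passage) pair from the model's one output logit. It is plain Python, not the library implementation; the helper names are hypothetical, and it ignores any positive-class weighting because the loss parameters are not shown in this view.

```python
import math


def sigmoid(logit: float) -> float:
    """Map a raw logit to a relevance probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))


def bce_with_logits(logit: float, label: int) -> float:
    """Binary cross-entropy computed directly from the logit.

    Numerically stable form of -[y*log(p) + (1-y)*log(1-p)], p = sigmoid(logit).
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


# A confident positive prediction is cheap when the label is 1, costly when it is 0.
for logit, label in [(2.0, 1), (2.0, 0), (-3.0, 0)]:
    print(f"logit={logit:+.1f} p={sigmoid(logit):.3f} label={label} "
          f"loss={bce_with_logits(logit, label):.4f}")
```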
|
| 152 |
|
|
| 153 |
### Training Logs
|
| 154 |
| Epoch | Step | Training Loss | amh-passage-retrieval-dev_ndcg@10 |
|
| 155 |
|:-------:|:---------:|:-------------:|:---------------------------------:|
|
|
|
|
| 156 |
| 1.0 | 7684 | 0.4048 | 0.8289 |
|
| 157 |
| 2.0 | 15368 | 0.2366 | 0.8546 |
|
| 158 |
| 3.0 | 23052 | 0.1588 | 0.8353 |
|
| 159 |
| **4.0** | **30736** | **0.1024** | **0.8551** |
|
|
|
|
| 160 |
|
| 161 |
* The bold row denotes the saved checkpoint.
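
The step counts in the log can be sanity-checked against the dataset size. This sketch assumes a batch size of 64 on a single device with no gradient accumulation, values taken from the hyperparameter list that this PR removes, and it assumes the last partial batch is kept.

```python
import math

train_samples = 491_752   # size stated in the training dataset section
batch_size = 64           # assumed: per-device batch size on one device
epochs = 4                # the log has rows for epochs 1.0 through 4.0

# A final partial batch still counts as a step, hence the ceiling.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs

print(steps_per_epoch, total_steps)  # 7684 30736
```

Both values match the `Step` column for epoch 1.0 and epoch 4.0.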
|
| 162 |
|
|
|
|
| 173 |
|
| 174 |
## Citation
|
| 175 |
|
| 176 |
+
```bibtex
|
| 177 |
+
@inproceedings{alemneh2026amharicir,
|
| 178 |
+
title = {The Multilingual Curse at the Retrieval Layer: Evidence from Amharic},
|
| 179 |
+
author = {Alemneh, Yosef Worku and Mekonnen, Kidist Amde and de Rijke, Maarten},
|
| 180 |
+
booktitle = {Proceedings of the 1st Workshop on Multilinguality in the Era of Large Language Models (MeLLM), ACL 2026},
|
| 181 |
+
year = {2026},
|
| 182 |
+
}
|
| 183 |
+
```