Improve model card: add paper, code links and citation

#2
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +33 -207
README.md CHANGED
@@ -1,20 +1,22 @@
1
  ---
 
 
 
2
  language:
3
  - am
 
4
  license: mit
 
 
 
 
 
5
  tags:
6
  - sentence-transformers
7
  - cross-encoder
8
  - generated_from_trainer
9
  - dataset_size:491752
10
  - loss:BinaryCrossEntropyLoss
11
- base_model: rasyosef/roberta-medium-amharic
12
- pipeline_tag: text-ranking
13
- library_name: sentence-transformers
14
- metrics:
15
- - map
16
- - mrr@10
17
- - ndcg@10
18
  model-index:
19
  - name: roberta-amharic-reranker-medium
20
  results:
@@ -31,32 +33,32 @@ model-index:
31
  - type: ndcg@10
32
  value: 0.835
33
  name: Ndcg@10
34
-
35
- datasets:
36
- - rasyosef/Amharic-Passage-Retrieval-Dataset-V2
37
  ---
38
 
39
  # reranker-amharic-medium
40
 
41
  This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [rasyosef/roberta-medium-amharic](https://huggingface.co/rasyosef/roberta-medium-amharic) using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
42
 
43
  ## Model Details
44
 
45
  ### Model Description
46
  - **Model Type:** Cross Encoder
47
- - **Base model:** [rasyosef/roberta-medium-amharic](https://huggingface.co/rasyosef/roberta-medium-amharic) <!-- at revision 9d02d0281e64d6ca31bd06d322e14b0b7e60375b -->
48
  - **Maximum Sequence Length:** 510 tokens
49
  - **Number of Output Labels:** 1 label
50
- <!-- - **Training Dataset:** Unknown -->
51
- - **Language:** am
52
- - **License:** mit
53
 
54
  ### Model Sources
55
 
56
  - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
57
  - **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
58
  - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
59
- - **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)
60
 
61
  ## Usage
62
 
@@ -74,10 +76,11 @@ from sentence_transformers import CrossEncoder
74
 
75
  # Download from the 🤗 Hub
76
  model = CrossEncoder("rasyosef/reranker-amharic-medium")
 
77
  # Get scores for pairs of texts
78
  pairs = [
79
- ['ለውጭ ገበያ በሚቀርበው የኢትዮጵያ ቡና ላይ የተጋረጠው ፈተና', 'የኢትዮጵያ ዋነኛ የውጭ ምንዛሬ ምንጭ የሆነው ወደ ውጭ የሚላክ ቡና ዘርፍ በአሁኑ ጊዜ ከፍተኛ ውጥረት ውስጥ ገብቷል። በዚህ የተነሳም የኢትዮጵያ ቡናና ሻይ ባለሥልጣንን ጨምሮ የሚመላካታቸው ሁሉ ቡና ላኪዎችና አምራቾች ያከማቹትን ቡና በፍጥነት ወደ ዓለም ገበያ እንዲያወጡ ጥሪ እያቀረቡ ነው ።'],
80
- ['ለውጭ ገበያ በሚቀርበው የኢትዮጵያ ቡና ላይ የተጋረጠው ፈተና', 'የቻይናው ፕሬዝዳንት ዚ ጂንፒንግ ከትራምፕ ጋር ባደረጉት ጉባኤ ትኩረታቸው በሁለቱ ሀገራት መካከል ለወራት ከተፈጠረ ውጥረት እና የንግድ ጦርነት በኋላ የተረገጋጋ ግንኙነትን ማስቀጠል ነበር። ከፑቲን ጋር ደግሞ ዢ ለሁለቱ አገራት ስልታዊም ሆነ ኢኮኖሚያዊ ጠቀሜታ ረጅም ጊዜ የዘለቀውን አጋርነትን ይበልጥ ማጠናከር ላይ ነበር ትኩረታቸው።']
81
  ]
82
  scores = model.predict(pairs)
83
  print(scores.shape)
@@ -87,37 +90,14 @@ print(scores.shape)
87
  ranks = model.rank(
88
  'ለውጭ ገበያ በሚቀርበው የኢትዮጵያ ቡና ላይ የተጋረጠው ፈተና',
89
  [
90
- 'የኢትዮጵያ ዋነኛ የውጭ ምንዛሬ ምንጭ የሆነው ወደ ውጭ የሚላክ ቡና ዘርፍ በአሁኑ ጊዜ ከፍተኛ ውጥረት ውስጥ ገብቷል። በዚህ የተነሳም የኢትዮጵያ ቡናና ሻይ ባለሥልጣንን ጨምሮ የሚመላካታቸው ሁሉ ቡና ላኪዎችና አምራቾች ያከማቹትን ቡና በፍጥነት ወደ ዓለም ገበያ እንዲያወጡ ጥሪ እያቀረቡ ነው ።',
91
- 'የቻይናው ፕሬዝዳንት ዚ ጂንፒንግ ከትራምፕ ጋር ባደረጉት ጉባኤ ትኩረታቸው በሁለቱ ሀገራት መካከል ለወራት ከተፈጠረ ውጥረት እና የንግድ ጦርነት በኋላ የተረገጋጋ ግንኙነትን ማስቀጠል ነበር። ከፑቲን ጋር ደግሞ ዢ ለሁለቱ አገራት ስልታዊም ሆነ ኢኮኖሚያዊ ጠቀሜታ ረጅም ጊዜ የዘለቀውን አጋርነትን ይበልጥ ማጠናከር ላይ ነበር ትኩረታቸው።',
92
  ]
93
  )
94
- # [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
 
95
  ```
96
 
97
- <!--
98
- ### Direct Usage (Transformers)
99
-
100
- <details><summary>Click to see the direct usage in Transformers</summary>
101
-
102
- </details>
103
- -->
104
-
105
- <!--
106
- ### Downstream Usage (Sentence Transformers)
107
-
108
- You can finetune this model on your own dataset.
109
-
110
- <details><summary>Click to expand</summary>
111
-
112
- </details>
113
- -->
114
-
115
- <!--
116
- ### Out-of-Scope Use
117
-
118
- *List how the model may foreseeably be misused and address what users ought not to do with the model.*
119
- -->
120
-
121
  ## Evaluation
122
 
123
  ### Metrics
@@ -137,39 +117,16 @@ You can finetune this model on your own dataset.
137
  | mrr@10 | 0.805 |
138
  | **ndcg@10** | **0.835** |
139
 
140
- <!--
141
- ## Bias, Risks and Limitations
142
-
143
- *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
144
- -->
145
-
146
- <!--
147
- ### Recommendations
148
-
149
- *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
150
- -->
151
-
152
  ## Training Details
153
 
154
  <details>
155
 
156
  ### Training Dataset
157
 
158
- #### Unnamed Dataset
159
 
160
  * Size: 491,752 training samples
161
  * Columns: <code>query</code>, <code>passage</code>, and <code>label</code>
162
- * Approximate statistics based on the first 1000 samples:
163
- | | query | passage | label |
164
- |:--------|:-----------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------|:------------------------------------------------|
165
- | type | string | string | int |
166
- | details | <ul><li>min: 2 characters</li><li>mean: 49.94 characters</li><li>max: 283 characters</li></ul> | <ul><li>min: 126 characters</li><li>mean: 1418.88 characters</li><li>max: 8678 characters</li></ul> | <ul><li>0: ~87.40%</li><li>1: ~12.60%</li></ul> |
167
- * Samples:
168
- | query | passage | label |
169
- |:------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
170
- | <code>በባሌ፣ ቦረና እና ጉጂ ዞኖች የተከሰተውን የበርሃ አንበጣ ለመከላከል ተጨማሪ አውሮፕላኖች ወደ ስፍራው ይሰማራሉ</code> | <code>አዲስ አበባ ፣ ታህሳስ 27 ፣ 2012 (ኤፍ ቢ ሲ) የጃፓኑ ጠቅላይ ሚኒስትር ሺንዞ አቤ በመካከለኛው ምስራቅ ሃይል የማስፈር እቅድ እንዳላቸው በድጋሚ ገለጹ።ጠቅላይ ሚኒስትሩ በአካባቢው የሚንቀሳቀሱ የጃፓን መርከቦችን ደህንነት ለማረጋገጥ በስፍራው ሃይል የማስፈር እቅድ እንዳላቸው ገልጸዋል።ባለፈው ወር ጃፓን ወደ መካከለኛው ምስራቅ የጦር መርከቦችን እና ቃኝ አውሮፕላኖችን እንደምትልክ ገልጻ ነበር።የሃገሪቱ መከላከያ ሚኒስቴርም ቃኝ አውሮፕላኖቹ በተያዘው የፈረንጆቹ ጥር ወር ወደ ስፍራው እንደሚያቀኑ ገልጿል።የካቲት ወር ላይ ደግሞ የጦር መርከቦችን ወደ ስፍራው አንቀሳቅሳለሁ ብሏል።የአሁኑ የቶኪዮ እቅድ በመካከለኛው ምስራቅ የባህር ክልል የሚንቀሳቀሱ የጃፓን መርከቦችን ከጥቃት ለመከላከልና ደህንነታቸውን ለማረጋገጥ ያለመ ነው ተብሏል።አቤ በንግግራቸው በመካከለኛው ምስራቅ ያለው ወቅታዊ ሁኔታ እንዳሳሰባቸው ጠቅሰው፥ ሃገራትም አላስፈላጊ ውጥረትን እንዲያስወግዱ ጥሪ አቅርበዋል።አሜሪካ ባለፈው ዓርብ የኢራን ብሄራዊ አብዮት ዘብ ጠባቂ ሃይል አዛዥን በባግዳድ አውሮፕላን ማረፊያ ከገደለች በኋላ በመካከለኛው ምስራቅ ውጥረት ነግሷል።ኢራን ለአሜሪካ እርምጃ ከባድ አፀፋዊ ምላሽ እሰጣለሁ ስትል፥ የአሜሪካው ፕሬዚዳንት ዶናልድ ትራምፕም አሜሪካ የከፋ እርምጃ እንደምትወስድ አስጠንቅቀዋል።ምንጭ፦ ሬውተርስ</code> | <code>0</code> |
171
- | <code>ወጣቱ ምንጫቸው ባልተረጋገጠ የማኅበራዊ ሚዲያ መረጃዎች ላይ በመጠመዱ የንባብ ባህሉ መቀነሱን የእንጅባራ ከተማ ነዋሪዎቸ ተናገሩ፡፡</code> | <code>ባሕር ዳር፡ ግንቦት 21/2012 ዓ.ም (አብመድ) የኮሮና ቫይረስ ወረርሽኝ የትምህርት ተቋማት ተማሪዎቻቸውን እንዲበትኑ አስገድዷቸዋል፡፡ተማሪዎቹን ከትምህርት ገበታቸው ማስተጓጎሉ አሉታዊ ተፅዕኖው የከፋ ቢሆንም ስለወረርሽኑ ግንዘቤ በመፍጠር ረገድ ወደ መልካም ዕድል እየቀየሩት ያሉ አሉ፡፡ወደ ሰሜን ሸዋ ዞን በረኸት ወረዳ ባቀናንበት ወቅት ያገኘናቸው ከተለያዩ የሀገሪቱ አቅጣጫዎች ወደ ቤተሰቦቻቸው የተመለሱ ተማሪዎች እጃቸውን አጣጥፈው አልተቀመጡም፡፡ ተማሪዎቹ ለኅብረተሰቡ ስለኮሮና ቫይረስ ወረርሽኝ የሚያወቁትን እያሳወቁ ነው፡፡ተማሪ ሄኖክ ወርቁ በወላይታ ሶዶ ዩኒቨርሲቲ የሦስተኛ ዓመት የጋዜጠኝነት እና ሥነ ተግባቦት ትምህርት ክፍል ተማሪ ነው፡፡ ሄኖክ ወደ ትውልድ ቀዬው ከተመለሰ ጊዜ ጀምሮ የተለያዩ የመገናኛ ዘዴዎችን በመጠቀም ስለኮሮና ቫይረ��� ወረርሽኝ ቅድመ መከላከል ከመንግሥት እና ከጤና ባለሙያዎች የሚወጡ መልእክቶችን ለኅብረተሰቡ እያስገነዘበ ነው፡፡ የግንዛቤ ፈጠራውን በ‘ሚኒ ሚዲያ’፣ በገበያ እና ሰዎች በሚሰባሰቡባቸው ቦታዎች በመገኘት ከጓደኞቹ ጋር እንደሚሠሩም ተናግሯል፡፡ ከግንዛቤ ፈጠራ ጎን ለጎን ደግሞ የዚህ ዓመት ተመራቂ ተማሪ እንደመሆኑ መጠን ጥናታዊ ጽሑፉን እየሠራ ጊዜውን በአግባባቡ እየተጠቀመ እንደሚገኝ ገልጿል፡፡ሌላኛው ያነጋገርነው ተማሪ አብርሃም ገብረኪዳን በወላይታ ሶዶ ዩኒቨርሲቲ ሦስተኛ ዓመት የሕግ ተማሪ ነው፡፡ ኅብረተሰቡ ለኮሮና ቫይረስ ወረርሽኝ እንዳይጋለጥ ሰፈር ለሰፈር፣ በገበያ ቀን ከወረዳው መዲና መተህብላ ከተማ መግቢያና መውጫ አካባቢዎች እጅ እንዲታጠቡ ከማድረግ ጀምሮ የወረርሽኙን ቅድመ መከላከል መልእክቶች በድምጽ ማጉያ (ሞንታርቦ) ተጠቅመው እያስተላለፉ እንደሆነ ተናግሯል፡፡ ተማሪዎቹ በሚያደርጉት የቅስቀሳ ግንዛቤ ማስጨበጫ ሥ...</code> | <code>0</code> |
172
- | <code>አዳማ ከተማ ከ ኢትዮጵያ ቡና – ቀጥታ የፅሁፍ ስርጭት</code> | <code>​79′ አዲስ ግደይተጠናቀቀ!ጨዋታው በሲዳማ ቡና አሸናፊነት ተጠናቀቀ፡፡ ሲዳ በድቻ ላይ ያለውን የበላይነት ሲያከብር ዘንድሮ በሜዳው ያለውን 100% ሪኮርድም አስጠብቋል፡፡ተጨማሪ ደቂቃ – 4 ደቂቃቢጫ ካርድ88′ ዳግም በቀለ አዲስ ግደይ ላይ በሰራው  ጥፋት ቢጫ ካርድ ተመልክቷል፡፡ በሁኔታውም ለአለም ብርሃኑ አላስፈላጊ ድርጊት በመፈፀሙ ቢጫ ተመልክቷል፡፡84′ ዳግም በቀለ ከማዕዘን የተሻማውን ኳስ በግንባሩ ገጭቶ ለጥቂት ወጣበት፡፡ የሚያስቆጭ አጋጣሚ !የተጫዋቸ ለውጥ – ሲዳማ ቡና81′ በረከት አዲሱ ወጥቶ ሙጃይድ  መሃመድ ገብቷል፡፡የተጫዋች ለውጥ – ወላይታ ድቻ አናጋው ባደግ ወጥቶ አብዱልሰመድ አሊ ገብቷል፡፡ጎልልል!!! ሲዳማ ቡና79′ አዲስ ግደይ ከኤሪክ ሙራንዳ የተሻገረለትን ኳስ በግንባሩ ገጭቶ ወደ ግብነት በመቀየር ሲዳማን መሪ አድርጓል፡፡77′ በዛብህ መለዮ ከርቀት በግራ እግሩ መሬት ለመሬት አክርሮ የመታው ኳስ ለጥቂት ወጣ፡፡<br>የተጫዋች ለውጥ – ወላይታ ድቻ 71′ ቴዎድሮስ መንገሻ ወጥሆ ዳግም በቀለ ገብቷል፡፡<br>የተጫዋች ለውጥ – ሲዳማ ቡና71′ አንተነህ ተስፋዬ በጉዳት ወጥቶ ላኪም ሳኒ ገብቷል፡፡65′ በድጋሚ ከመስመር የተሻገረውን ኳስ ኤሪክ ሙራዳ በግንባሩ ገጭቶ የግቡ አግዳሚ መልሶበታል፡፡ ሲዳማ ቡና ጫና ፈጥሮ በማጥቃት ላይ ይገኛል፡፡63′ ከግራ መስመር ወሰኑ ማዜ ያሻማውን ኳስ አዲስ ግደይ በግንባሩ ገጭቶ የግቡን አግዳሚ ታኮ ወጥቷል፡፡የተጫዋች ለውጥ – ወላይታ ድቻ  60′ አማኑኤል ተሾመ ወጥቶ መሳይ አጪሶ ገብቷል፡፡53′ አናጋው ባደግ ከግራ መስመር ያሻገረውን ኳስ በዛብህ መለዮ አገባው ሲባል በግቡ አናት ሰደደው፡፡ የሚያስቆጭ አጋጣሚ!የተጫዋች ለውጥ – ሲዳማ<br>46′ ግሩም አሰፋ ወጥቶ ኤሪክ ሙራንዳ ገብቷል፡፡<br>ተጀመረ!<br>ሁለተኛው አጋማሽ የጨዋታ...</code> | <code>0</code> |
173
  * Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
174
  ```json
175
  {
@@ -193,134 +150,13 @@ You can finetune this model on your own dataset.
193
  - `load_best_model_at_end`: True
194
  - `batch_sampler`: no_duplicates
195
 
196
- #### All Hyperparameters
197
- <details><summary>Click to expand</summary>
198
-
199
- - `overwrite_output_dir`: False
200
- - `do_predict`: False
201
- - `eval_strategy`: epoch
202
- - `prediction_loss_only`: True
203
- - `per_device_train_batch_size`: 64
204
- - `per_device_eval_batch_size`: 64
205
- - `per_gpu_train_batch_size`: None
206
- - `per_gpu_eval_batch_size`: None
207
- - `gradient_accumulation_steps`: 1
208
- - `eval_accumulation_steps`: None
209
- - `torch_empty_cache_steps`: None
210
- - `learning_rate`: 4e-05
211
- - `weight_decay`: 0.0
212
- - `adam_beta1`: 0.9
213
- - `adam_beta2`: 0.999
214
- - `adam_epsilon`: 1e-08
215
- - `max_grad_norm`: 1.0
216
- - `num_train_epochs`: 4
217
- - `max_steps`: -1
218
- - `lr_scheduler_type`: cosine
219
- - `lr_scheduler_kwargs`: {}
220
- - `warmup_ratio`: 0.05
221
- - `warmup_steps`: 0
222
- - `log_level`: passive
223
- - `log_level_replica`: warning
224
- - `log_on_each_node`: True
225
- - `logging_nan_inf_filter`: True
226
- - `save_safetensors`: True
227
- - `save_on_each_node`: False
228
- - `save_only_model`: False
229
- - `restore_callback_states_from_checkpoint`: False
230
- - `no_cuda`: False
231
- - `use_cpu`: False
232
- - `use_mps_device`: False
233
- - `seed`: 42
234
- - `data_seed`: None
235
- - `jit_mode_eval`: False
236
- - `use_ipex`: False
237
- - `bf16`: False
238
- - `fp16`: True
239
- - `fp16_opt_level`: O1
240
- - `half_precision_backend`: auto
241
- - `bf16_full_eval`: False
242
- - `fp16_full_eval`: False
243
- - `tf32`: None
244
- - `local_rank`: 0
245
- - `ddp_backend`: None
246
- - `tpu_num_cores`: None
247
- - `tpu_metrics_debug`: False
248
- - `debug`: []
249
- - `dataloader_drop_last`: False
250
- - `dataloader_num_workers`: 2
251
- - `dataloader_prefetch_factor`: None
252
- - `past_index`: -1
253
- - `disable_tqdm`: False
254
- - `remove_unused_columns`: True
255
- - `label_names`: None
256
- - `load_best_model_at_end`: True
257
- - `ignore_data_skip`: False
258
- - `fsdp`: []
259
- - `fsdp_min_num_params`: 0
260
- - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
261
- - `fsdp_transformer_layer_cls_to_wrap`: None
262
- - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
263
- - `deepspeed`: None
264
- - `label_smoothing_factor`: 0.0
265
- - `optim`: adamw_torch
266
- - `optim_args`: None
267
- - `adafactor`: False
268
- - `group_by_length`: False
269
- - `length_column_name`: length
270
- - `ddp_find_unused_parameters`: None
271
- - `ddp_bucket_cap_mb`: None
272
- - `ddp_broadcast_buffers`: False
273
- - `dataloader_pin_memory`: True
274
- - `dataloader_persistent_workers`: False
275
- - `skip_memory_metrics`: True
276
- - `use_legacy_prediction_loop`: False
277
- - `push_to_hub`: False
278
- - `resume_from_checkpoint`: None
279
- - `hub_model_id`: None
280
- - `hub_strategy`: every_save
281
- - `hub_private_repo`: None
282
- - `hub_always_push`: False
283
- - `gradient_checkpointing`: False
284
- - `gradient_checkpointing_kwargs`: None
285
- - `include_inputs_for_metrics`: False
286
- - `include_for_metrics`: []
287
- - `eval_do_concat_batches`: True
288
- - `fp16_backend`: auto
289
- - `push_to_hub_model_id`: None
290
- - `push_to_hub_organization`: None
291
- - `mp_parameters`:
292
- - `auto_find_batch_size`: False
293
- - `full_determinism`: False
294
- - `torchdynamo`: None
295
- - `ray_scope`: last
296
- - `ddp_timeout`: 1800
297
- - `torch_compile`: False
298
- - `torch_compile_backend`: None
299
- - `torch_compile_mode`: None
300
- - `include_tokens_per_second`: False
301
- - `include_num_input_tokens_seen`: False
302
- - `neftune_noise_alpha`: None
303
- - `optim_target_modules`: None
304
- - `batch_eval_metrics`: False
305
- - `eval_on_start`: False
306
- - `use_liger_kernel`: False
307
- - `eval_use_gather_object`: False
308
- - `average_tokens_across_devices`: False
309
- - `prompts`: None
310
- - `batch_sampler`: no_duplicates
311
- - `multi_dataset_batch_sampler`: proportional
312
-
313
- </details>
314
-
315
  ### Training Logs
316
  | Epoch | Step | Training Loss | amh-passage-retrieval-dev_ndcg@10 |
317
  |:-------:|:---------:|:-------------:|:---------------------------------:|
318
- | -1 | -1 | - | 0.0898 |
319
  | 1.0 | 7684 | 0.4048 | 0.8289 |
320
  | 2.0 | 15368 | 0.2366 | 0.8546 |
321
  | 3.0 | 23052 | 0.1588 | 0.8353 |
322
  | **4.0** | **30736** | **0.1024** | **0.8551** |
323
- | -1 | -1 | - | 0.8579 |
324
 
325
  * The bold row denotes the saved checkpoint.
326
 
@@ -337,21 +173,11 @@ You can finetune this model on your own dataset.
337
 
338
  ## Citation
339
 
340
-
341
- <!--
342
- ## Glossary
343
-
344
- *Clearly define terms in order to be accessible across audiences.*
345
- -->
346
-
347
- <!--
348
- ## Model Card Authors
349
-
350
- *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
351
- -->
352
-
353
- <!--
354
- ## Model Card Contact
355
-
356
- *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
357
- -->
 
1
  ---
2
+ base_model: rasyosef/roberta-medium-amharic
3
+ datasets:
4
+ - rasyosef/Amharic-Passage-Retrieval-Dataset-V2
5
  language:
6
  - am
7
+ library_name: sentence-transformers
8
  license: mit
9
+ metrics:
10
+ - map
11
+ - mrr@10
12
+ - ndcg@10
13
+ pipeline_tag: text-ranking
14
  tags:
15
  - sentence-transformers
16
  - cross-encoder
17
  - generated_from_trainer
18
  - dataset_size:491752
19
  - loss:BinaryCrossEntropyLoss
20
  model-index:
21
  - name: roberta-amharic-reranker-medium
22
  results:
 
33
  - type: ndcg@10
34
  value: 0.835
35
  name: Ndcg@10
 
 
 
36
  ---
37
 
38
  # reranker-amharic-medium
39
 
40
  This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [rasyosef/roberta-medium-amharic](https://huggingface.co/rasyosef/roberta-medium-amharic) using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
41
 
42
+ This model is part of the research presented in the paper **"The Multilingual Curse at the Retrieval Layer: Evidence from Amharic"**.
43
+
44
+ - **Paper:** [The Multilingual Curse at the Retrieval Layer: Evidence from Amharic](https://huggingface.co/papers/2605.24556)
45
+ - **Code:** [https://github.com/rasyosef/amharic-neural-ir](https://github.com/rasyosef/amharic-neural-ir)
46
+
47
  ## Model Details
48
 
49
  ### Model Description
50
  - **Model Type:** Cross Encoder
51
+ - **Base model:** [rasyosef/roberta-medium-amharic](https://huggingface.co/rasyosef/roberta-medium-amharic)
52
  - **Maximum Sequence Length:** 510 tokens
53
  - **Number of Output Labels:** 1 label
54
+ - **Language:** Amharic (am)
55
+ - **License:** MIT
 
56
 
57
  ### Model Sources
58
 
59
  - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
60
  - **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
61
  - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
 
62
 
63
  ## Usage
64
 
 
76
 
77
  # Download from the 🤗 Hub
78
  model = CrossEncoder("rasyosef/reranker-amharic-medium")
79
+
80
  # Get scores for pairs of texts
81
  pairs = [
82
+ ['ለውጭ ገበያ በሚቀርበው የኢትዮጵያ ቡና ላይ የተጋረጠው ፈተና', 'የኢትዮጵያ ዋነኛ የውጭ ምንዛሬ ምንጭ የሆነው ወደ ውጭ የሚላክ ቡና ዘርፍ በአሁኑ ጊዜ ከፍተኛ ውጥረት ውስጥ ገብቷል።'],
83
+ ['ለውጭ ገበያ በሚቀርበው የኢትዮጵያ ቡና ላይ የተጋረጠው ፈተና', 'የቻይናው ፕሬዝዳንት ዚ ጂንፒንግ ከትራምፕ ጋር ባደረጉት ጉባኤ ትኩረታቸው በሁለቱ ሀገራት መካከል ለወራት ከተፈጠረ ውጥረት እና የንግድ ጦርነት በኋላ የተረገጋጋ ግንኙነትን ማስቀጠል ነበር።']
84
  ]
85
  scores = model.predict(pairs)
86
  print(scores.shape)
 
90
  ranks = model.rank(
91
  'ለውጭ ገበያ በሚቀርበው የኢትዮጵያ ቡና ላይ የተጋረጠው ፈተና',
92
  [
93
+ 'የኢትዮጵያ ዋነኛ የውጭ ምንዛሬ ምንጭ የሆነው ወደ ውጭ የሚላክ ቡና ዘርፍ በአሁኑ ጊዜ ከፍተኛ ውጥረት ውስጥ ገብቷል።',
94
+ 'የቻይናው ፕሬዝዳንት ዚ ጂንፒንግ ከትራምፕ ጋር ባደረጉት ጉባኤ ትኩረታቸው በሁለቱ ሀገራት መካከል ለወራት ከተፈጠረ ውጥረት እና የንግድ ጦርነት በኋላ የተረገጋጋ ግንኙነትን ማስቀጠል ነበር።',
95
  ]
96
  )
97
+ print(ranks)
98
+ # [{'corpus_id': 0, 'score': ...}, {'corpus_id': 1, 'score': ...}]
99
  ```
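The relationship between the per-pair scores and the ranked output in the usage snippet above can be sketched without downloading the model. This is a minimal illustration, not the library's implementation: `rank_from_scores` is a hypothetical helper, and the scores are made-up stand-ins for what `model.predict(pairs)` would return. It only shows that ranking amounts to sorting passage indices by descending score, in the `corpus_id` / `score` shape shown in the comment above.

```python
from typing import Dict, List, Sequence, Union


def rank_from_scores(scores: Sequence[float]) -> List[Dict[str, Union[int, float]]]:
    """Return passage indices sorted by descending score (hypothetical helper)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [{"corpus_id": i, "score": float(scores[i])} for i in order]


# Made-up scores for three passages; real values would come from model.predict(pairs).
example_scores = [0.12, 0.93, 0.47]
ranked = rank_from_scores(example_scores)
print(ranked)
```

In a reranking pipeline, the passages would then be reordered by `corpus_id` before being shown to the user or passed to a downstream component.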
100
 
101
  ## Evaluation
102
 
103
  ### Metrics
 
117
  | mrr@10 | 0.805 |
118
  | **ndcg@10** | **0.835** |
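
For readers unfamiliar with the metrics in the table above, the following sketch computes MRR@10 and NDCG@10 for a single query with binary relevance labels. It uses the standard textbook definitions and is an assumption about how the numbers are defined, not the evaluator code used for this model; the function names and the example labels are hypothetical. Reported values such as these are typically averages over all evaluation queries.

```python
import math
from typing import Sequence


def mrr_at_k(relevance: Sequence[int], k: int = 10) -> float:
    """Reciprocal rank of the first relevant item within the top k, else 0."""
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(relevance: Sequence[int], k: int = 10) -> float:
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    def dcg(labels: Sequence[int]) -> float:
        return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(labels, start=1))

    ideal = sorted(relevance, reverse=True)[:k]
    idcg = dcg(ideal)
    return dcg(relevance[:k]) / idcg if idcg > 0 else 0.0


# Hypothetical relevance labels in ranked order: the only relevant passage is at rank 2.
ranked_labels = [0, 1, 0, 0]
print(mrr_at_k(ranked_labels))   # 0.5
print(ndcg_at_k(ranked_labels))  # 1 / log2(3), roughly 0.631
```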
119
 
120
  ## Training Details
121
 
122
  <details>
123
 
124
  ### Training Dataset
125
 
126
+ #### Amharic Passage Retrieval Dataset V2
127
 
128
  * Size: 491,752 training samples
129
  * Columns: <code>query</code>, <code>passage</code>, and <code>label</code>
130
  * Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
131
  ```json
132
  {
 
150
  - `load_best_model_at_end`: True
151
  - `batch_sampler`: no_duplicates
152
 
153
  ### Training Logs
154
  | Epoch | Step | Training Loss | amh-passage-retrieval-dev_ndcg@10 |
155
  |:-------:|:---------:|:-------------:|:---------------------------------:|
 
156
  | 1.0 | 7684 | 0.4048 | 0.8289 |
157
  | 2.0 | 15368 | 0.2366 | 0.8546 |
158
  | 3.0 | 23052 | 0.1588 | 0.8353 |
159
  | **4.0** | **30736** | **0.1024** | **0.8551** |
 
160
 
161
  * The bold row denotes the saved checkpoint.
162
 
 
173
 
174
  ## Citation
175
 
176
+ ```bibtex
177
+ @inproceedings{alemneh2026amharicir,
178
+ title = {The Multilingual Curse at the Retrieval Layer: Evidence from Amharic},
179
+ author = {Alemneh, Yosef Worku and Mekonnen, Kidist Amde and de Rijke, Maarten},
180
+ booktitle = {Proceedings of the 1st Workshop on Multilinguality in the Era of Large Language Models (MeLLM), ACL 2026},
181
+ year = {2026},
182
+ }
183
+ ```