Text Ranking
sentence-transformers
Safetensors
Amharic
xlm-roberta
cross-encoder
Generated from Trainer
dataset_size:491752
loss:BinaryCrossEntropyLoss
Eval Results (legacy)
text-embeddings-inference
Instructions to use rasyosef/reranker-amharic-base with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use rasyosef/reranker-amharic-base with sentence-transformers:

    from sentence_transformers import CrossEncoder

    model = CrossEncoder("rasyosef/reranker-amharic-base")

    query = "Which planet is known as the Red Planet?"
    passages = [
        "Venus is often called Earth's twin because of its similar size and proximity.",
        "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
        "Jupiter, the largest planet in our solar system, has a prominent red spot.",
        "Saturn, famous for its rings, is sometimes mistaken for the Red Planet."
    ]

    # One relevance score per (query, passage) pair
    scores = model.predict([(query, passage) for passage in passages])
    print(scores)

- Notebooks
- Google Colab
- Kaggle
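
The `predict` call in the usage snippet returns one relevance score per (query, passage) pair. A minimal sketch of turning such scores into a ranking follows. The scores are made-up placeholders standing in for real model output, the passages are shortened, and the helper name `rank_passages` is a hypothetical one introduced only for illustration.

```python
from typing import List, Sequence, Tuple


def rank_passages(
    passages: Sequence[str], scores: Sequence[float]
) -> List[Tuple[str, float]]:
    """Pair each passage with its score and sort by score, highest first."""
    if len(passages) != len(scores):
        raise ValueError("passages and scores must have the same length")
    return sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)


# Hypothetical scores standing in for model.predict(...) output.
passages = [
    "Venus is often called Earth's twin.",
    "Mars is often referred to as the Red Planet.",
    "Jupiter has a prominent red spot.",
    "Saturn is famous for its rings.",
]
scores = [0.02, 0.97, 0.11, 0.05]

ranked = rank_passages(passages, scores)
for passage, score in ranked:
    print(f"{score:.2f}  {passage}")
```

sentence-transformers also exposes `CrossEncoder.rank`, which may be a more direct option when only the ordering is needed.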
Update README.md
README.md
CHANGED
@@ -25,24 +25,14 @@ model-index:
       name: amh passage retrieval dev
       type: amh-passage-retrieval-dev
     metrics:
-    - type: map
-      value: 0.8534696098981692
-      name: Map
     - type: mrr@10
-      value: 0.
+      value: 0.83
       name: Mrr@10
     - type: ndcg@10
-      value: 0.
-      name: Ndcg@10
-    - type: map
-      value: 0.8531100005859653
-      name: Map
-    - type: mrr@10
-      value: 0.8513254037953979
-      name: Mrr@10
-    - type: ndcg@10
-      value: 0.8802377937215004
+      value: 0.856
       name: Ndcg@10
+datasets:
+- rasyosef/Amharic-Passage-Retrieval-Dataset-V2
 ---
 
 # roberta-amharic-reranker-base
@@ -82,7 +72,7 @@ Then you can load this model and run inference.
 from sentence_transformers import CrossEncoder
 
 # Download from the 🤗 Hub
-model = CrossEncoder("rasyosef/
+model = CrossEncoder("rasyosef/reranker-amharic-base")
 # Get scores for pairs of texts
 
 pairs = [
@@ -150,25 +140,8 @@ You can finetune this model on your own dataset.
 
 | Metric | Value |
 |:------------|:-----------|
-
-
-| **ndcg@10** | **0.8815** |
-
-#### Cross Encoder Reranking
-
-* Dataset: `amh-passage-retrieval-dev`
-* Evaluated with [<code>CrossEncoderRerankingEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderRerankingEvaluator) with these parameters:
-```json
-{
-"at_k": 10
-}
-```
-
-| Metric | Value |
-|:------------|:-----------|
-| map | 0.8531 |
-| mrr@10 | 0.8513 |
-| **ndcg@10** | **0.8802** |
+| mrr@10 | 0.830 |
+| **ndcg@10** | **0.856** |
 
 <!--
 ## Bias, Risks and Limitations
@@ -184,6 +157,8 @@ You can finetune this model on your own dataset.
 
 ## Training Details
 
+<details>
+
 ### Training Dataset
 
 #### Unnamed Dataset
@@ -363,22 +338,9 @@ You can finetune this model on your own dataset.
 - Datasets: 3.6.0
 - Tokenizers: 0.21.1
 
-
+</details>
 
-##
-
-#### Sentence Transformers
-```bibtex
-@inproceedings{reimers-2019-sentence-bert,
-title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
-author = "Reimers, Nils and Gurevych, Iryna",
-booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
-month = "11",
-year = "2019",
-publisher = "Association for Computational Linguistics",
-url = "https://arxiv.org/abs/1908.10084",
-}
-```
+## Citation
 
 <!--
 ## Glossary
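
The `mrr@10` and `ndcg@10` figures touched by this change are standard ranking metrics. As a rough illustration of what they measure, here is a minimal sketch using the textbook definitions with binary relevance labels. It is not the code behind the reported numbers, which may differ in details, and the function names and toy relevance lists are assumptions made up for this example.

```python
import math
from typing import Sequence


def reciprocal_rank_at_k(relevance: Sequence[int], k: int = 10) -> float:
    """1 / rank of the first relevant item within the top k, or 0 if none."""
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def dcg_at_k(relevance: Sequence[int], k: int = 10) -> float:
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevance[:k], start=1)
    )


def ndcg_at_k(relevance: Sequence[int], k: int = 10) -> float:
    """DCG of the given order divided by the DCG of the ideal order."""
    ideal = dcg_at_k(sorted(relevance, reverse=True), k)
    return dcg_at_k(relevance, k) / ideal if ideal > 0 else 0.0


# Toy data: one list per query, in the order the reranker returned the
# passages; 1 marks a relevant passage, 0 an irrelevant one.
per_query_relevance = [
    [0, 1, 0, 0],  # relevant passage ranked second
    [1, 0, 0, 0],  # relevant passage ranked first
]

mrr_at_10 = sum(reciprocal_rank_at_k(r) for r in per_query_relevance) / len(per_query_relevance)
ndcg_at_10 = sum(ndcg_at_k(r) for r in per_query_relevance) / len(per_query_relevance)
print(f"mrr@10={mrr_at_10:.3f} ndcg@10={ndcg_at_10:.3f}")
```

Both metrics are averaged over queries, so a single number such as 0.856 summarizes how high the relevant passages tend to be ranked across the whole dev set.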