Text Ranking
sentence-transformers
Safetensors
deberta-v2
cross-encoder
reranker
Generated from Trainer
dataset_size:7419
loss:BinaryCrossEntropyLoss
Eval Results (legacy)
text-embeddings-inference
How to use ColeH0415/comp90042-crossencoder-factcheck with sentence-transformers:

    from sentence_transformers import CrossEncoder

    model = CrossEncoder("ColeH0415/comp90042-crossencoder-factcheck")

    query = "Which planet is known as the Red Planet?"
    passages = [
        "Venus is often called Earth's twin because of its similar size and proximity.",
        "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
        "Jupiter, the largest planet in our solar system, has a prominent red spot.",
        "Saturn, famous for its rings, is sometimes mistaken for the Red Planet."
    ]

    scores = model.predict([(query, passage) for passage in passages])
    print(scores)
CE fine-tuned epoch 1/3 best_val=0.5285
Files changed:
- README.md +38 -38
- model.safetensors +1 -1
README.md
CHANGED
|
@@ -21,32 +21,32 @@ model-index:
|
|
| 21 |
- name: CrossEncoder based on cross-encoder/ms-marco-MiniLM-L6-v2
|
| 22 |
results:
|
| 23 |
- task:
|
| 24 |
-
type: cross-encoder-
|
| 25 |
-
name: Cross Encoder
|
| 26 |
dataset:
|
| 27 |
name: ce val
|
| 28 |
type: ce-val
|
| 29 |
metrics:
|
| 30 |
- type: accuracy
|
| 31 |
-
value: 0.
|
| 32 |
name: Accuracy
|
| 33 |
- type: accuracy_threshold
|
| 34 |
-
value:
|
| 35 |
name: Accuracy Threshold
|
| 36 |
- type: f1
|
| 37 |
-
value: 0.
|
| 38 |
name: F1
|
| 39 |
- type: f1_threshold
|
| 40 |
-
value: -
|
| 41 |
name: F1 Threshold
|
| 42 |
- type: precision
|
| 43 |
-
value: 0.
|
| 44 |
name: Precision
|
| 45 |
- type: recall
|
| 46 |
-
value: 0.
|
| 47 |
name: Recall
|
| 48 |
- type: average_precision
|
| 49 |
-
value: 0.
|
| 50 |
name: Average Precision
|
| 51 |
---
|
| 52 |
|
|
@@ -99,25 +99,25 @@ from sentence_transformers import CrossEncoder
|
|
| 99 |
model = CrossEncoder("cross_encoder_model_id")
|
| 100 |
# Get scores for pairs of inputs
|
| 101 |
pairs = [
|
| 102 |
-
['
|
| 103 |
-
['
|
| 104 |
-
['
|
| 105 |
-
['
|
| 106 |
-
['
|
| 107 |
]
|
| 108 |
scores = model.predict(pairs)
|
| 109 |
print(scores)
|
| 110 |
-
# [
|
| 111 |
|
| 112 |
# Or rank different texts based on similarity to a single text
|
| 113 |
ranks = model.rank(
|
| 114 |
-
'
|
| 115 |
[
|
| 116 |
-
'
|
| 117 |
-
'
|
| 118 |
-
|
| 119 |
-
'
|
| 120 |
-
'
|
| 121 |
]
|
| 122 |
)
|
| 123 |
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
|
|
@@ -151,20 +151,20 @@ You can finetune this model on your own dataset.
|
|
| 151 |
|
| 152 |
### Metrics
|
| 153 |
|
| 154 |
-
#### Cross Encoder
|
| 155 |
|
| 156 |
* Dataset: `ce-val`
|
| 157 |
-
* Evaluated with [<code>
|
| 158 |
|
| 159 |
| Metric | Value |
|
| 160 |
|:----------------------|:-----------|
|
| 161 |
-
| accuracy | 0.
|
| 162 |
-
| accuracy_threshold |
|
| 163 |
-
| f1 | 0.
|
| 164 |
-
| f1_threshold | -
|
| 165 |
-
| precision | 0.
|
| 166 |
-
| recall | 0.
|
| 167 |
-
| **average_precision** | **0.
|
| 168 |
|
| 169 |
<!--
|
| 170 |
## Bias, Risks and Limitations
|
|
@@ -190,13 +190,13 @@ You can finetune this model on your own dataset.
|
|
| 190 |
| | sentence_0 | sentence_1 | label |
|
| 191 |
|:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:---------------------------------------------------------------|
|
| 192 |
| type | string | string | float |
|
| 193 |
-
| details | <ul><li>min: 7 tokens</li><li>mean: 27.
|
| 194 |
* Samples:
|
| 195 |
-
| sentence_0
|
| 196 |
-
|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
| 197 |
-
| <code>
|
| 198 |
-
| <code>
|
| 199 |
-
| <code>
|
| 200 |
* Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
|
| 201 |
```json
|
| 202 |
{
|
|
@@ -317,11 +317,11 @@ You can finetune this model on your own dataset.
|
|
| 317 |
### Training Logs
|
| 318 |
| Epoch | Step | ce-val_average_precision |
|
| 319 |
|:-----:|:----:|:------------------------:|
|
| 320 |
-
| -1 | -1 | 0.
|
| 321 |
|
| 322 |
|
| 323 |
### Training Time
|
| 324 |
-
- **Training**:
|
| 325 |
|
| 326 |
### Framework Versions
|
| 327 |
- Python: 3.12.13
|
|
|
|
| 21 |
- name: CrossEncoder based on cross-encoder/ms-marco-MiniLM-L6-v2
|
| 22 |
results:
|
| 23 |
- task:
|
| 24 |
+
type: cross-encoder-classification
|
| 25 |
+
name: Cross Encoder Classification
|
| 26 |
dataset:
|
| 27 |
name: ce val
|
| 28 |
type: ce-val
|
| 29 |
metrics:
|
| 30 |
- type: accuracy
|
| 31 |
+
value: 0.5284848484848484
|
| 32 |
name: Accuracy
|
| 33 |
- type: accuracy_threshold
|
| 34 |
+
value: 3.9093470573425293
|
| 35 |
name: Accuracy Threshold
|
| 36 |
- type: f1
|
| 37 |
+
value: 0.6677471636952999
|
| 38 |
name: F1
|
| 39 |
- type: f1_threshold
|
| 40 |
+
value: -10.441707611083984
|
| 41 |
name: F1 Threshold
|
| 42 |
- type: precision
|
| 43 |
+
value: 0.5018270401948843
|
| 44 |
name: Precision
|
| 45 |
- type: recall
|
| 46 |
+
value: 0.9975786924939467
|
| 47 |
name: Recall
|
| 48 |
- type: average_precision
|
| 49 |
+
value: 0.5295137880648401
|
| 50 |
name: Average Precision
|
| 51 |
---
|
| 52 |
|
|
|
|
| 99 |
model = CrossEncoder("cross_encoder_model_id")
|
| 100 |
# Get scores for pairs of inputs
|
| 101 |
pairs = [
|
| 102 |
+
['They (Clinton and Obama) have never to my knowledge been involved in legislation nor hearings nor engagement on this issue (climate change).', 'Gore has been involved with environmental issues since 1976, when as a freshman congressman, he held the "first congressional hearings on the climate change, and co-sponsor[ed] hearings on toxic waste and global warming."'],
|
| 103 |
+
['This increase is the result of humans emitting more carbon dioxide into the atmosphere and hence more being absorbed into the oceans.', 'Humans have a substantial influence on the rise of sea level because we emit increasing levels of carbon dioxide into the atmosphere through automobile use and industry.'],
|
| 104 |
+
["Venus doesn't have a runaway greenhouse effect", "More recent studies have suggested that several billion years ago, Venus's atmosphere was much more like Earth's than it is now and that there were probably substantial quantities of liquid water on the surface, but a runaway greenhouse effect was caused by the evaporation of that original water, which generated a critical level of greenhouse gases in its atmosphere."],
|
| 105 |
+
['At four degrees, the deadly European heat wave of 2003, which killed as many as 2,000 people a day, will be a normal summer.', 'For comparison, the 2003 European heat wave killed an estimated 35,000–70,000 people, with temperatures slightly less than in India and Pakistan.'],
|
| 106 |
+
['Under the most ambitious scenarios, they found a strong likelihood that Antarctica would remain fairly stable.”', 'Its remains have been found in Africa, Antarctica, Europe, and North America.'],
|
| 107 |
]
|
| 108 |
scores = model.predict(pairs)
|
| 109 |
print(scores)
|
| 110 |
+
# [ 2.0512 5.9885 7.7773 5.7437 -3.7705]
|
| 111 |
|
| 112 |
# Or rank different texts based on similarity to a single text
|
| 113 |
ranks = model.rank(
|
| 114 |
+
'They (Clinton and Obama) have never to my knowledge been involved in legislation nor hearings nor engagement on this issue (climate change).',
|
| 115 |
[
|
| 116 |
+
'Gore has been involved with environmental issues since 1976, when as a freshman congressman, he held the "first congressional hearings on the climate change, and co-sponsor[ed] hearings on toxic waste and global warming."',
|
| 117 |
+
'Humans have a substantial influence on the rise of sea level because we emit increasing levels of carbon dioxide into the atmosphere through automobile use and industry.',
|
| 118 |
+
"More recent studies have suggested that several billion years ago, Venus's atmosphere was much more like Earth's than it is now and that there were probably substantial quantities of liquid water on the surface, but a runaway greenhouse effect was caused by the evaporation of that original water, which generated a critical level of greenhouse gases in its atmosphere.",
|
| 119 |
+
'For comparison, the 2003 European heat wave killed an estimated 35,000–70,000 people, with temperatures slightly less than in India and Pakistan.',
|
| 120 |
+
'Its remains have been found in Africa, Antarctica, Europe, and North America.',
|
| 121 |
]
|
| 122 |
)
|
| 123 |
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
|
|
|
|
| 151 |
|
| 152 |
### Metrics
|
| 153 |
|
| 154 |
+
#### Cross Encoder Classification
|
| 155 |
|
| 156 |
* Dataset: `ce-val`
|
| 157 |
+
* Evaluated with [<code>CrossEncoderClassificationEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderClassificationEvaluator)
|
| 158 |
|
| 159 |
| Metric | Value |
|
| 160 |
|:----------------------|:-----------|
|
| 161 |
+
| accuracy | 0.5285 |
|
| 162 |
+
| accuracy_threshold | 3.9093 |
|
| 163 |
+
| f1 | 0.6677 |
|
| 164 |
+
| f1_threshold | -10.4417 |
|
| 165 |
+
| precision | 0.5018 |
|
| 166 |
+
| recall | 0.9976 |
|
| 167 |
+
| **average_precision** | **0.5295** |
|
| 168 |
|
| 169 |
<!--
|
| 170 |
## Bias, Risks and Limitations
|
|
|
|
| 190 |
| | sentence_0 | sentence_1 | label |
|
| 191 |
|:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:---------------------------------------------------------------|
|
| 192 |
| type | string | string | float |
|
| 193 |
+
| details | <ul><li>min: 7 tokens</li><li>mean: 27.81 tokens</li><li>max: 82 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 32.86 tokens</li><li>max: 247 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.49</li><li>max: 1.0</li></ul> |
|
| 194 |
* Samples:
|
| 195 |
+
| sentence_0 | sentence_1 | label |
|
| 196 |
+
|:----------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
|
| 197 |
+
| <code>They (Clinton and Obama) have never to my knowledge been involved in legislation nor hearings nor engagement on this issue (climate change).</code> | <code>Gore has been involved with environmental issues since 1976, when as a freshman congressman, he held the "first congressional hearings on the climate change, and co-sponsor[ed] hearings on toxic waste and global warming."</code> | <code>1.0</code> |
|
| 198 |
+
| <code>This increase is the result of humans emitting more carbon dioxide into the atmosphere and hence more being absorbed into the oceans.</code> | <code>Humans have a substantial influence on the rise of sea level because we emit increasing levels of carbon dioxide into the atmosphere through automobile use and industry.</code> | <code>0.0</code> |
|
| 199 |
+
| <code>Venus doesn't have a runaway greenhouse effect</code> | <code>More recent studies have suggested that several billion years ago, Venus's atmosphere was much more like Earth's than it is now and that there were probably substantial quantities of liquid water on the surface, but a runaway greenhouse effect was caused by the evaporation of that original water, which generated a critical level of greenhouse gases in its atmosphere.</code> | <code>0.0</code> |
|
| 200 |
* Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
|
| 201 |
```json
|
| 202 |
{
|
|
|
|
| 317 |
### Training Logs
|
| 318 |
| Epoch | Step | ce-val_average_precision |
|
| 319 |
|:-----:|:----:|:------------------------:|
|
| 320 |
+
| -1 | -1 | 0.5295 |
|
| 321 |
|
| 322 |
|
| 323 |
### Training Time
|
| 324 |
+
- **Training**: 34.4 seconds
|
| 325 |
|
| 326 |
### Framework Versions
|
| 327 |
- Python: 3.12.13
|
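The two thresholds in the updated metrics table apply to raw `model.predict` scores. As a rough illustration, not part of the commit itself, the sketch below applies both reported thresholds to the five example scores printed in the README and checks that the reported F1 equals the harmonic mean of the reported precision and recall. It assumes a pair counts as positive when its score is strictly above the threshold; the helper names `classify` and `f1_from` are hypothetical. The three labels come from the README "Samples" table, whose rows match the first three example pairs.

```python
# Values copied from the updated README in this commit.
ACCURACY_THRESHOLD = 3.9093470573425293
F1_THRESHOLD = -10.441707611083984
PRECISION = 0.5018270401948843
RECALL = 0.9975786924939467
REPORTED_F1 = 0.6677471636952999

# Example scores printed in the README usage snippet.
example_scores = [2.0512, 5.9885, 7.7773, 5.7437, -3.7705]
# Labels of the first three pairs, taken from the README "Samples" table.
sample_labels = [1, 0, 0]


def classify(scores, threshold):
    """Assumption: a score strictly above the threshold counts as positive."""
    return [1 if s > threshold else 0 for s in scores]


def f1_from(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


at_accuracy_threshold = classify(example_scores, ACCURACY_THRESHOLD)
at_f1_threshold = classify(example_scores, F1_THRESHOLD)
# zip stops at the shorter list, so only the three labeled pairs are compared.
matches = sum(p == y for p, y in zip(at_accuracy_threshold, sample_labels))

print(at_accuracy_threshold)  # [0, 1, 1, 1, 0]
print(at_f1_threshold)        # [1, 1, 1, 1, 1]
print(matches)                # 0
print(round(f1_from(PRECISION, RECALL), 4))  # 0.6677
```

At the F1 threshold of about -10.44, every example pair is labeled positive, which fits the reported recall of 0.9976 alongside a precision of 0.5018. At the accuracy threshold, none of the three labeled example pairs is classified correctly, in keeping with a validation accuracy of 0.5285.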
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:c50a517d1275313dcf63127bb3c7e7282772be50cbbe9f885894b98627ea0d19
 size 90866404
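The `model.safetensors` entry tracked in the repository is a Git LFS pointer file rather than the weights themselves: three `key value` lines giving the spec version, the object ID, and the size in bytes. A minimal sketch of reading those fields from the new pointer text in this commit follows; `parse_lfs_pointer` is a hypothetical helper written for illustration, not part of any library.

```python
# New pointer contents, copied from the diff above.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:c50a517d1275313dcf63127bb3c7e7282772be50cbbe9f885894b98627ea0d19
size 90866404
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a Git LFS pointer into named fields."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["hash_algorithm"], len(pointer["digest"]))  # sha256 64
print(f"{pointer['size_bytes'] / 1e6:.1f} MB")  # 90.9 MB
```

The size is the same on both sides of the diff (90866404 bytes) and only the object ID changes, which suggests the weight values were updated while the file layout presumably stayed the same.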