Text Ranking
sentence-transformers
Safetensors
deberta-v2
cross-encoder
reranker
Generated from Trainer
dataset_size:7419
loss:BinaryCrossEntropyLoss
Eval Results (legacy)
text-embeddings-inference
How to use ColeH0415/comp90042-crossencoder-factcheck with sentence-transformers:

    from sentence_transformers import CrossEncoder

    model = CrossEncoder("ColeH0415/comp90042-crossencoder-factcheck")

    query = "Which planet is known as the Red Planet?"
    passages = [
        "Venus is often called Earth's twin because of its similar size and proximity.",
        "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
        "Jupiter, the largest planet in our solar system, has a prominent red spot.",
        "Saturn, famous for its rings, is sometimes mistaken for the Red Planet.",
    ]

    scores = model.predict([(query, passage) for passage in passages])
    print(scores)
CE fine-tuned epoch 2/3 best_val=0.6352
Files changed:
- README.md +38 -34
- model.safetensors +1 -1
README.md
CHANGED
|
@@ -28,25 +28,25 @@ model-index:
|
|
| 28 |
type: ce-val
|
| 29 |
metrics:
|
| 30 |
- type: accuracy
|
| 31 |
-
value: 0.
|
| 32 |
name: Accuracy
|
| 33 |
- type: accuracy_threshold
|
| 34 |
-
value: 0.
|
| 35 |
name: Accuracy Threshold
|
| 36 |
- type: f1
|
| 37 |
-
value: 0.
|
| 38 |
name: F1
|
| 39 |
- type: f1_threshold
|
| 40 |
-
value: 0.
|
| 41 |
name: F1 Threshold
|
| 42 |
- type: precision
|
| 43 |
-
value: 0.
|
| 44 |
name: Precision
|
| 45 |
- type: recall
|
| 46 |
-
value: 0.
|
| 47 |
name: Recall
|
| 48 |
- type: average_precision
|
| 49 |
-
value: 0.
|
| 50 |
name: Average Precision
|
| 51 |
---
|
| 52 |
|
|
@@ -99,25 +99,25 @@ from sentence_transformers import CrossEncoder
|
|
| 99 |
model = CrossEncoder("cross_encoder_model_id")
|
| 100 |
# Get scores for pairs of inputs
|
| 101 |
pairs = [
|
| 102 |
-
['
|
| 103 |
-
['
|
| 104 |
-
['
|
| 105 |
-
['
|
| 106 |
-
['
|
| 107 |
]
|
| 108 |
scores = model.predict(pairs)
|
| 109 |
print(scores)
|
| 110 |
-
# [0.
|
| 111 |
|
| 112 |
# Or rank different texts based on similarity to a single text
|
| 113 |
ranks = model.rank(
|
| 114 |
-
'
|
| 115 |
[
|
| 116 |
-
'
|
| 117 |
-
'
|
| 118 |
-
'
|
| 119 |
-
'
|
| 120 |
-
'
|
| 121 |
]
|
| 122 |
)
|
| 123 |
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
|
|
@@ -156,15 +156,15 @@ You can finetune this model on your own dataset.
|
|
| 156 |
* Dataset: `ce-val`
|
| 157 |
* Evaluated with [<code>CrossEncoderClassificationEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderClassificationEvaluator)
|
| 158 |
|
| 159 |
-
| Metric | Value
|
| 160 |
-
|:----------------------|:----------|
|
| 161 |
-
| accuracy | 0.
|
| 162 |
-
| accuracy_threshold | 0.
|
| 163 |
-
| f1 | 0.
|
| 164 |
-
| f1_threshold | 0.
|
| 165 |
-
| precision | 0.
|
| 166 |
-
| recall | 0.
|
| 167 |
-
| **average_precision** | **0.
|
| 168 |
|
| 169 |
<!--
|
| 170 |
## Bias, Risks and Limitations
|
|
@@ -190,13 +190,13 @@ You can finetune this model on your own dataset.
|
|
| 190 |
| | sentence_0 | sentence_1 | label |
|
| 191 |
|:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:---------------------------------------------------------------|
|
| 192 |
| type | string | string | float |
|
| 193 |
-
| details | <ul><li>min: 7 tokens</li><li>mean:
|
| 194 |
* Samples:
|
| 195 |
-
| sentence_0
|
| 196 |
-
|:----------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
|
| 197 |
-
| <code>
|
| 198 |
-
| <code>
|
| 199 |
-
| <code>
|
| 200 |
* Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
|
| 201 |
```json
|
| 202 |
{
|
|
@@ -322,6 +322,10 @@ You can finetune this model on your own dataset.
|
|
| 322 |
| 0.5391 | 1000 | 0.6883 | - |
|
| 323 |
| 0.8086 | 1500 | 0.6841 | - |
|
| 324 |
| -1 | -1 | - | 0.5670 |
|
|
|
|
|
|
|
|
|
|
|
|
|
| 325 |
|
| 326 |
|
| 327 |
### Training Time
|
|
|
|
| 28 |
type: ce-val
|
| 29 |
metrics:
|
| 30 |
- type: accuracy
|
| 31 |
+
value: 0.6351515151515151
|
| 32 |
name: Accuracy
|
| 33 |
- type: accuracy_threshold
|
| 34 |
+
value: 0.5755879878997803
|
| 35 |
name: Accuracy Threshold
|
| 36 |
- type: f1
|
| 37 |
+
value: 0.6981132075471698
|
| 38 |
name: F1
|
| 39 |
- type: f1_threshold
|
| 40 |
+
value: 0.41324031352996826
|
| 41 |
name: F1 Threshold
|
| 42 |
- type: precision
|
| 43 |
+
value: 0.5405046480743692
|
| 44 |
name: Precision
|
| 45 |
- type: recall
|
| 46 |
+
value: 0.9854721549636803
|
| 47 |
name: Recall
|
| 48 |
- type: average_precision
|
| 49 |
+
value: 0.6568676757267966
|
| 50 |
name: Average Precision
|
| 51 |
---
|
| 52 |
|
|
|
|
| 99 |
model = CrossEncoder("cross_encoder_model_id")
|
| 100 |
# Get scores for pairs of inputs
|
| 101 |
pairs = [
|
| 102 |
+
['The last time the planet was even four degrees warmer, Peter Brannen points out in The Ends of the World, his new history of the planet’s major extinction events, the oceans were hundreds of feet higher.', 'Almost all scientists acknowledge that the rate of species loss is greater now than at any time in human history, with extinctions occurring at rates hundreds of times higher than background extinction rates.'],
|
| 103 |
+
['[S]unspot activity on the surface of our star has dropped to a new low.', 'This surface activity produces starspots, which are regions of strong magnetic fields and lower than normal surface temperatures.'],
|
| 104 |
+
['More money is dedicated within the Department of Homeland Security to climate change than what\'s spent combating "Islamist terrorists radicalizing over the Internet in the United States of America."', "The center works on the Internet's routing infrastructure (the SPRI program) and Domain Name System (DNSSEC), identity theft and other online criminal activity (ITTC), Internet traffic and networks research (PREDICT datasets and the DETER testbed), Department of Defense and HSARPA exercises (Livewire and Determined Promise), and wireless security in cooperation with Canada."],
|
| 105 |
+
['Worst-case global heating scenarios may need to be revised upwards in light of a better understanding of the role of clouds, scientists have said.', 'Climate model projections summarized in the report indicated that during the 21st century the global surface temperature is likely to rise a further 0.3 to 1.7\xa0°C (0.5 to 3.1\xa0°F) in a moderate scenario, or as much as 2.6 to 4.8\xa0°C (4.7 to 8.6\xa0°F) in an extreme scenario, depending on the rate of future greenhouse gas emissions and on climate feedback effects.'],
|
| 106 |
+
['Prof Adam Scaife, a climate modelling expert at the UK’s Met Office, said the evidence for a link to shrinking Arctic ice was now good: ‘The consensus points towards that being a real effect.’”', 'Some models of modern climate exhibit Arctic amplification without changes in snow and ice cover.'],
|
| 107 |
]
|
| 108 |
scores = model.predict(pairs)
|
| 109 |
print(scores)
|
| 110 |
+
# [0.6498 0.5873 0.6027 0.6833 0.4922]
|
| 111 |
|
| 112 |
# Or rank different texts based on similarity to a single text
|
| 113 |
ranks = model.rank(
|
| 114 |
+
'The last time the planet was even four degrees warmer, Peter Brannen points out in The Ends of the World, his new history of the planet’s major extinction events, the oceans were hundreds of feet higher.',
|
| 115 |
[
|
| 116 |
+
'Almost all scientists acknowledge that the rate of species loss is greater now than at any time in human history, with extinctions occurring at rates hundreds of times higher than background extinction rates.',
|
| 117 |
+
'This surface activity produces starspots, which are regions of strong magnetic fields and lower than normal surface temperatures.',
|
| 118 |
+
"The center works on the Internet's routing infrastructure (the SPRI program) and Domain Name System (DNSSEC), identity theft and other online criminal activity (ITTC), Internet traffic and networks research (PREDICT datasets and the DETER testbed), Department of Defense and HSARPA exercises (Livewire and Determined Promise), and wireless security in cooperation with Canada.",
|
| 119 |
+
'Climate model projections summarized in the report indicated that during the 21st century the global surface temperature is likely to rise a further 0.3 to 1.7\xa0°C (0.5 to 3.1\xa0°F) in a moderate scenario, or as much as 2.6 to 4.8\xa0°C (4.7 to 8.6\xa0°F) in an extreme scenario, depending on the rate of future greenhouse gas emissions and on climate feedback effects.',
|
| 120 |
+
'Some models of modern climate exhibit Arctic amplification without changes in snow and ice cover.',
|
| 121 |
]
|
| 122 |
)
|
| 123 |
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
|
|
|
|
| 156 |
* Dataset: `ce-val`
|
| 157 |
* Evaluated with [<code>CrossEncoderClassificationEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderClassificationEvaluator)
|
| 158 |
|
| 159 |
+
| Metric | Value |
|
| 160 |
+
|:----------------------|:-----------|
|
| 161 |
+
| accuracy | 0.6352 |
|
| 162 |
+
| accuracy_threshold | 0.5756 |
|
| 163 |
+
| f1 | 0.6981 |
|
| 164 |
+
| f1_threshold | 0.4132 |
|
| 165 |
+
| precision | 0.5405 |
|
| 166 |
+
| recall | 0.9855 |
|
| 167 |
+
| **average_precision** | **0.6569** |
|
| 168 |
|
| 169 |
<!--
|
| 170 |
## Bias, Risks and Limitations
|
|
|
|
| 190 |
| | sentence_0 | sentence_1 | label |
|
| 191 |
|:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:---------------------------------------------------------------|
|
| 192 |
| type | string | string | float |
|
| 193 |
+
| details | <ul><li>min: 7 tokens</li><li>mean: 26.66 tokens</li><li>max: 80 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 31.75 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.52</li><li>max: 1.0</li></ul> |
|
| 194 |
* Samples:
|
| 195 |
+
| sentence_0 | sentence_1 | label |
|
| 196 |
+
|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
|
| 197 |
+
| <code>The last time the planet was even four degrees warmer, Peter Brannen points out in The Ends of the World, his new history of the planet’s major extinction events, the oceans were hundreds of feet higher.</code> | <code>Almost all scientists acknowledge that the rate of species loss is greater now than at any time in human history, with extinctions occurring at rates hundreds of times higher than background extinction rates.</code> | <code>0.0</code> |
|
| 198 |
+
| <code>[S]unspot activity on the surface of our star has dropped to a new low.</code> | <code>This surface activity produces starspots, which are regions of strong magnetic fields and lower than normal surface temperatures.</code> | <code>1.0</code> |
|
| 199 |
+
| <code>More money is dedicated within the Department of Homeland Security to climate change than what's spent combating "Islamist terrorists radicalizing over the Internet in the United States of America."</code> | <code>The center works on the Internet's routing infrastructure (the SPRI program) and Domain Name System (DNSSEC), identity theft and other online criminal activity (ITTC), Internet traffic and networks research (PREDICT datasets and the DETER testbed), Department of Defense and HSARPA exercises (Livewire and Determined Promise), and wireless security in cooperation with Canada.</code> | <code>1.0</code> |
|
| 200 |
* Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
|
| 201 |
```json
|
| 202 |
{
|
|
|
|
| 322 |
| 0.5391 | 1000 | 0.6883 | - |
|
| 323 |
| 0.8086 | 1500 | 0.6841 | - |
|
| 324 |
| -1 | -1 | - | 0.5670 |
|
| 325 |
+
| 0.2695 | 500 | 0.6741 | - |
|
| 326 |
+
| 0.5391 | 1000 | 0.6662 | - |
|
| 327 |
+
| 0.8086 | 1500 | 0.6504 | - |
|
| 328 |
+
| -1 | -1 | - | 0.6569 |
|
| 329 |
|
| 330 |
|
| 331 |
### Training Time
|
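The evaluation table in this diff reports two different decision thresholds (`accuracy_threshold` and `f1_threshold`), and the usage example prints raw scores. As a minimal sketch of how those two pieces relate, the snippet below applies both thresholds to the five example scores. The `classify` helper is hypothetical, and the "strictly greater than" convention is an assumption; the evaluator's exact tie-breaking is not shown in the diff.

```python
# Scores printed by the README usage example in this commit.
scores = [0.6498, 0.5873, 0.6027, 0.6833, 0.4922]

# Thresholds reported on `ce-val` in the metrics table (rounded values).
ACCURACY_THRESHOLD = 0.5756
F1_THRESHOLD = 0.4132


def classify(values, threshold):
    """Hypothetical helper: label 1 when a score is strictly above the threshold."""
    return [int(v > threshold) for v in values]


at_accuracy = classify(scores, ACCURACY_THRESHOLD)
at_f1 = classify(scores, F1_THRESHOLD)

print(at_accuracy)  # [1, 1, 1, 1, 0]
print(at_f1)        # [1, 1, 1, 1, 1]
```

With the lower F1 threshold every example pair is labelled positive, which is consistent with the reported combination of very high recall (0.9855) and modest precision (0.5405).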
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:2c1672b14f00ee02f13738af219d2591d8cd95fbc93e474a2647541c519f7e23
 size 737716172
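The `model.safetensors` change only touches a Git LFS pointer, which records the SHA-256 digest and byte size of the real weights file. A rough sketch of checking a downloaded file against such a pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the check runs on a throwaway file because the actual weights are not available here.

```python
import hashlib
import os
import tempfile

# New pointer contents as shown in this commit.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:2c1672b14f00ee02f13738af219d2591d8cd95fbc93e474a2647541c519f7e23
size 737716172
"""


def parse_lfs_pointer(text):
    """Hypothetical helper: split each 'key value' pointer line into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path, pointer):
    """Hypothetical helper: compare a file's size and SHA-256 to the pointer."""
    expected_oid = pointer["oid"].split(":", 1)[1]
    if os.path.getsize(path) != int(pointer["size"]):
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so large weight files do not need to fit in memory.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_oid


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["oid"], pointer["size"])

# Self-check on a small temporary file with a pointer built for it.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.bin")
    with open(demo_path, "wb") as fh:
        fh.write(b"hello")
    demo_pointer = {
        "oid": "sha256:" + hashlib.sha256(b"hello").hexdigest(),
        "size": "5",
    }
    demo_ok = matches_pointer(demo_path, demo_pointer)

print(demo_ok)
```

To check the real weights, one would call `matches_pointer` with the path of the downloaded `model.safetensors` and the parsed `pointer`. Since the size is unchanged between the two versions (737716172 bytes), only the digest distinguishes the new checkpoint from the old one.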