Text Ranking
sentence-transformers
Safetensors
deberta-v2
cross-encoder
reranker
Generated from Trainer
dataset_size:7419
loss:BinaryCrossEntropyLoss
Eval Results (legacy)
text-embeddings-inference
How to use ColeH0415/comp90042-crossencoder-factcheck with sentence-transformers:
from sentence_transformers import CrossEncoder

model = CrossEncoder("ColeH0415/comp90042-crossencoder-factcheck")

query = "Which planet is known as the Red Planet?"
passages = [
    "Venus is often called Earth's twin because of its similar size and proximity.",
    "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
    "Jupiter, the largest planet in our solar system, has a prominent red spot.",
    "Saturn, famous for its rings, is sometimes mistaken for the Red Planet."
]

scores = model.predict([(query, passage) for passage in passages])
print(scores)
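`predict` returns one score per (query, passage) pair, so turning the scores into a ranking is a sort by descending score. The sketch below is a minimal illustration of that step. The helper name `rank_by_score` and the score values are hypothetical placeholders, not output from this model. The `corpus_id` and `score` keys mirror the result shape shown in the README usage example.

```python
def rank_by_score(passages, scores):
    """Order passages from highest to lowest score.

    Hypothetical helper for illustration; it only sorts precomputed scores.
    """
    order = sorted(range(len(passages)), key=lambda i: scores[i], reverse=True)
    return [
        {"corpus_id": i, "score": float(scores[i]), "text": passages[i]}
        for i in order
    ]


passages = [
    "Venus is often called Earth's twin because of its similar size and proximity.",
    "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
    "Jupiter, the largest planet in our solar system, has a prominent red spot.",
    "Saturn, famous for its rings, is sometimes mistaken for the Red Planet.",
]

# Placeholder scores for illustration only; NOT produced by the model.
placeholder_scores = [0.12, 0.91, 0.35, 0.20]

ranked = rank_by_score(passages, placeholder_scores)
for hit in ranked:
    print(hit["corpus_id"], round(hit["score"], 2), hit["text"][:40])
```

In practice the scores would come from `model.predict(...)` as in the snippet above.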
CE fine-tuned epoch 1/3 best_val=0.5782
- README.md +39 -47
- model.safetensors +1 -1
README.md
CHANGED
@@ -28,25 +28,25 @@ model-index:
       type: ce-val
     metrics:
     - type: accuracy
-      value: 0.
+      value: 0.5781818181818181
       name: Accuracy
     - type: accuracy_threshold
-      value: 0.
+      value: 0.5230777859687805
       name: Accuracy Threshold
     - type: f1
-      value: 0.
+      value: 0.6700942587832047
       name: F1
     - type: f1_threshold
-      value: 0.
+      value: 0.4445345997810364
       name: F1 Threshold
     - type: precision
-      value: 0.
+      value: 0.5185676392572944
       name: Precision
     - type: recall
-      value: 0.
+      value: 0.9467312348668281
       name: Recall
     - type: average_precision
-      value: 0.
+      value: 0.5669943455221101
       name: Average Precision
 ---
 
@@ -99,25 +99,25 @@ from sentence_transformers import CrossEncoder
 model = CrossEncoder("cross_encoder_model_id")
 # Get scores for pairs of inputs
 pairs = [
-    ['
-    ['
-    ['
-    ['
-    ['
+    ['If every house in Florida had a solar-heated water tank, that would eliminate consumption by 17 percent.', 'Solar water heating (SWH) is the conversion of sunlight into heat for water heating using a solar thermal collector.'],
+    ['Modellers assume carbon dioxide drives climate change', 'They absorb a huge amount of carbon dioxide, combating climate change.'],
+    ['Some, however, bristle at the belief that because floods and storms have always occurred, they should not be linked to climate change”', 'Although some studies have reported an increase in frequency and intensity of extremes in rainfall during the past 40–50 years, their attribution to global warming is not established."'],
+    ['The tax-payer funded National Oceanic and Atmospheric Administration (NOAA) has become mired in fresh global warming data scandal involving numbers for the Great Lakes region that substantially ramp up averages."', 'Feds close 600 weather stations amid criticism they\'re situated to report warming".'],
+    ['The acceleration is making some scientists fear that Antarctica’s ice sheet may have entered the early stages of an unstoppable disintegration.', 'Scientists have found that the flow of these ice streams has accelerated in recent years, and suggested that if they were to melt, global sea levels would rise by 1 to 2\xa0m (3\xa0ft 3\xa0in to 6\xa0ft 7\xa0in), destabilising the entire West Antarctic Ice Sheet and perhaps sections of the East Antarctic Ice Sheet.'],
 ]
 scores = model.predict(pairs)
 print(scores)
-# [0.
+# [0.4903 0.4453 0.5899 0.4856 0.5753]
 
 # Or rank different texts based on similarity to a single text
 ranks = model.rank(
-    '
+    'If every house in Florida had a solar-heated water tank, that would eliminate consumption by 17 percent.',
     [
-        '
-        '
-        '
-        '
-        '
+        'Solar water heating (SWH) is the conversion of sunlight into heat for water heating using a solar thermal collector.',
+        'They absorb a huge amount of carbon dioxide, combating climate change.',
+        'Although some studies have reported an increase in frequency and intensity of extremes in rainfall during the past 40–50 years, their attribution to global warming is not established."',
+        'Feds close 600 weather stations amid criticism they\'re situated to report warming".',
+        'Scientists have found that the flow of these ice streams has accelerated in recent years, and suggested that if they were to melt, global sea levels would rise by 1 to 2\xa0m (3\xa0ft 3\xa0in to 6\xa0ft 7\xa0in), destabilising the entire West Antarctic Ice Sheet and perhaps sections of the East Antarctic Ice Sheet.',
     ]
 )
 # [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
@@ -156,15 +156,15 @@ You can finetune this model on your own dataset.
 * Dataset: `ce-val`
 * Evaluated with [<code>CrossEncoderClassificationEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderClassificationEvaluator)
 
-| Metric | Value
-|:----------------------|:----------
-| accuracy | 0.
-| accuracy_threshold | 0.
-| f1 | 0.
-| f1_threshold | 0.
-| precision | 0.
-| recall | 0.
-| **average_precision** | **0.
+| Metric                | Value     |
+|:----------------------|:----------|
+| accuracy              | 0.5782    |
+| accuracy_threshold    | 0.5231    |
+| f1                    | 0.6701    |
+| f1_threshold          | 0.4445    |
+| precision             | 0.5186    |
+| recall                | 0.9467    |
+| **average_precision** | **0.567** |
 
 <!--
 ## Bias, Risks and Limitations
@@ -190,13 +190,13 @@ You can finetune this model on your own dataset.
   | | sentence_0 | sentence_1 | label |
   |:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:---------------------------------------------------------------|
   | type | string | string | float |
-  | details | <ul><li>min: 7 tokens</li><li>mean:
+  | details | <ul><li>min: 7 tokens</li><li>mean: 25.97 tokens</li><li>max: 80 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 31.89 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.47</li><li>max: 1.0</li></ul> |
 * Samples:
-  | sentence_0
-  |:----------------------------------------------------------------------------------------------------------------------------------------------------
-  | <code>
-  | <code>
-  | <code>
+  | sentence_0 | sentence_1 | label |
+  |:----------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
+  | <code>If every house in Florida had a solar-heated water tank, that would eliminate consumption by 17 percent.</code> | <code>Solar water heating (SWH) is the conversion of sunlight into heat for water heating using a solar thermal collector.</code> | <code>0.0</code> |
+  | <code>Modellers assume carbon dioxide drives climate change</code> | <code>They absorb a huge amount of carbon dioxide, combating climate change.</code> | <code>0.0</code> |
+  | <code>Some, however, bristle at the belief that because floods and storms have always occurred, they should not be linked to climate change”</code> | <code>Although some studies have reported an increase in frequency and intensity of extremes in rainfall during the past 40–50 years, their attribution to global warming is not established."</code> | <code>1.0</code> |
 * Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
   ```json
   {
@@ -318,22 +318,14 @@ You can finetune this model on your own dataset.
 ### Training Logs
 | Epoch  | Step | Training Loss | ce-val_average_precision |
 |:------:|:----:|:-------------:|:------------------------:|
-| 0.2695 | 500 | 0.
-| 0.5391 | 1000 | 0.
-| 0.8086 | 1500 | 0.
-| -1 | -1 | - | 0.
-| 0.2695 | 500  | 0.6835        | -                        |
-| 0.5391 | 1000 | 0.6840        | -                        |
-| 0.8086 | 1500 | 0.6848        | -                        |
-| -1     | -1   | -             | 0.6083                   |
-| 0.2695 | 500  | 0.6379        | -                        |
-| 0.5391 | 1000 | 0.6424        | -                        |
-| 0.8086 | 1500 | 0.6501        | -                        |
-| -1     | -1   | -             | 0.6675                   |
+| 0.2695 | 500  | 0.6958        | -                        |
+| 0.5391 | 1000 | 0.6883        | -                        |
+| 0.8086 | 1500 | 0.6841        | -                        |
+| -1     | -1   | -             | 0.5670                   |
 
 
 ### Training Time
-- **Training**:
+- **Training**: 6.8 minutes
 
 ### Framework Versions
 - Python: 3.12.13
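The `ce-val` numbers added in the README.md diff can be sanity-checked with a few lines of arithmetic. The sketch below recomputes F1 as the harmonic mean of the reported precision and recall, then applies the two reported thresholds to the five example scores printed in the usage snippet. One assumption is labeled in the code: a score at or above the threshold is treated as the positive class, which may not match the evaluator's exact comparison. The helper name `classify` is hypothetical.

```python
# Values copied from the ce-val metrics in the README.md diff.
precision = 0.5185676392572944
recall = 0.9467312348668281
reported_f1 = 0.6700942587832047
f1_threshold = 0.4445345997810364
accuracy_threshold = 0.5230777859687805

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(f"recomputed F1 = {f1:.4f}, reported F1 = {reported_f1:.4f}")


def classify(scores, threshold):
    """Map scores to 0/1 labels.

    Assumption: score >= threshold counts as positive; the evaluator's
    own tie-breaking rule is not stated in the README.
    """
    return [1 if s >= threshold else 0 for s in scores]


# Example scores from the README usage snippet.
example_scores = [0.4903, 0.4453, 0.5899, 0.4856, 0.5753]

at_f1 = classify(example_scores, f1_threshold)
at_acc = classify(example_scores, accuracy_threshold)
print("labels at f1_threshold:      ", at_f1)
print("labels at accuracy_threshold:", at_acc)
```

The low `f1_threshold` marks every example pair as positive, which fits the reported pattern of high recall (0.9467) and near-chance precision (0.5186).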
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:17168fd353619c37e372ebd2c91567ea4eb245987472a8976943c375e76b1e71
 size 737716172
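The `model.safetensors` entry in the repository is a Git LFS pointer: three `key value` lines giving the spec version, the SHA-256 of the real file, and its size in bytes. A downloaded copy of the weights can be checked against that pointer. The sketch below is one way to do it with the standard library only. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the self-check runs on a small temporary file rather than the actual 737 MB weights.

```python
import hashlib
import os
import tempfile


def parse_lfs_pointer(text):
    """Parse the `key value` lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Read in chunks so large weight files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents as they appear on the new side of the diff.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:17168fd353619c37e372ebd2c91567ea4eb245987472a8976943c375e76b1e71\n"
    "size 737716172\n"
)
print(pointer["algo"], pointer["size"])

# Self-check on a tiny temporary file (stand-in for the real weights).
payload = b"demo"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
demo_pointer = {
    "algo": "sha256",
    "digest": hashlib.sha256(payload).hexdigest(),
    "size": len(payload),
}
demo_ok = matches_pointer(tmp.name, demo_pointer)
os.remove(tmp.name)
print(demo_ok)
```

To verify the real weights, pass the local path of the downloaded `model.safetensors` together with `pointer` to `matches_pointer`.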