Text Ranking
sentence-transformers
Safetensors
deberta-v2
cross-encoder
reranker
Generated from Trainer
dataset_size:7419
loss:BinaryCrossEntropyLoss
Eval Results (legacy)
text-embeddings-inference
Instructions to use ColeH0415/comp90042-crossencoder-factcheck with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use ColeH0415/comp90042-crossencoder-factcheck with sentence-transformers:
```python
from sentence_transformers import CrossEncoder

model = CrossEncoder("ColeH0415/comp90042-crossencoder-factcheck")

query = "Which planet is known as the Red Planet?"
passages = [
    "Venus is often called Earth's twin because of its similar size and proximity.",
    "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
    "Jupiter, the largest planet in our solar system, has a prominent red spot.",
    "Saturn, famous for its rings, is sometimes mistaken for the Red Planet."
]

scores = model.predict([(query, passage) for passage in passages])
print(scores)
```
- Notebooks
- Google Colab
- Kaggle
CE fine-tuned epoch 2/3
- README.md +37 -37
- eval/CrossEncoderClassificationEvaluator_ce-val_results.csv +1 -0
- model.safetensors +1 -1
README.md
CHANGED
@@ -28,25 +28,25 @@ model-index:
       type: ce-val
     metrics:
     - type: accuracy
-      value: 0.
+      value: 0.8299065420560747
       name: Accuracy
     - type: accuracy_threshold
-      value: -0.
+      value: -0.3896324336528778
       name: Accuracy Threshold
     - type: f1
-      value: 0.
+      value: 0.8914285714285715
       name: F1
     - type: f1_threshold
-      value: -
+      value: -1.0047539472579956
       name: F1 Threshold
     - type: precision
-      value: 0.
+      value: 0.8351177730192719
       name: Precision
     - type: recall
-      value: 0.
+      value: 0.9558823529411765
       name: Recall
     - type: average_precision
-      value: 0.
+      value: 0.9535681876950206
       name: Average Precision
 ---
 
@@ -99,25 +99,25 @@ from sentence_transformers import CrossEncoder
 model = CrossEncoder("cross_encoder_model_id")
 # Get scores for pairs of inputs
 pairs = [
-    ['
-    ['
-    ['
-    ['
-    ['
+    ['Climate change is a hoax invented by the Chinese.', '"The concept of global warming was created by and for the Chinese in order to make U.S. manufacturing non-competitive" (Tweet).'],
+    ['“The oceans, which absorb more than 90% of the extra CO2 pumped into the atmosphere“', 'Most of the CO 2 taken up by the ocean, which is about 30% of the total released into the atmosphere, forms carbonic acid in equilibrium with bicarbonate.'],
+    ['“The jet stream forms a boundary between the cold north and the warmer south, but the lower temperature difference means the winds are now weaker.', 'Therefore, the strong eastward moving jet streams are in part a simple consequence of the fact that the Equator is warmer than the North and South poles.'],
+    ['climate models predict too much warming in the troposphere', 'While the satellite data now show global warming, there is still some difference between what climate models predict and what the satellite data show for warming of the lower troposphere, with the climate models predicting slightly more warming than what the satellites measure.'],
+    ['Nine years into that 11-year hurricane drought, a NASA scientist computed it as a 1-in-177-year event.', 'It is approximately 177 light years from the Earth.'],
 ]
 scores = model.predict(pairs)
 print(scores)
-# [ 3.
+# [ 3.6842 2.0192 1.9189 -0.9618 0.3676]
 
 # Or rank different texts based on similarity to a single text
 ranks = model.rank(
-    '
+    'Climate change is a hoax invented by the Chinese.',
     [
-        "
-        '
-        '
-        '
-        '
+        '"The concept of global warming was created by and for the Chinese in order to make U.S. manufacturing non-competitive" (Tweet).',
+        'Most of the CO 2 taken up by the ocean, which is about 30% of the total released into the atmosphere, forms carbonic acid in equilibrium with bicarbonate.',
+        'Therefore, the strong eastward moving jet streams are in part a simple consequence of the fact that the Equator is warmer than the North and South poles.',
+        'While the satellite data now show global warming, there is still some difference between what climate models predict and what the satellite data show for warming of the lower troposphere, with the climate models predicting slightly more warming than what the satellites measure.',
+        'It is approximately 177 light years from the Earth.',
     ]
 )
 # [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
@@ -158,13 +158,13 @@ You can finetune this model on your own dataset.
 
 | Metric | Value |
 |:----------------------|:-----------|
-| accuracy | 0.
-| accuracy_threshold | -0.
-| f1 | 0.
-| f1_threshold | -
-| precision | 0.
-| recall | 0.
-| **average_precision** | **0.
+| accuracy | 0.8299 |
+| accuracy_threshold | -0.3896 |
+| f1 | 0.8914 |
+| f1_threshold | -1.0048 |
+| precision | 0.8351 |
+| recall | 0.9559 |
+| **average_precision** | **0.9536** |
 
 <!--
 ## Bias, Risks and Limitations
@@ -187,16 +187,16 @@ You can finetune this model on your own dataset.
 * Size: 4,815 training samples
 * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
 * Approximate statistics based on the first 1000 samples:
-| | sentence_0
-|:--------|:---------------------------------------------------------------------------------|:----------------------------------------------------------------------------------
-| type | string
-| details | <ul><li>min: 7 tokens</li><li>mean:
+| | sentence_0 | sentence_1 | label |
+|:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
+| type | string | string | float |
+| details | <ul><li>min: 7 tokens</li><li>mean: 27.21 tokens</li><li>max: 73 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 35.1 tokens</li><li>max: 333 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.78</li><li>max: 1.0</li></ul> |
 * Samples:
-| sentence_0
-|:----------------------------------------------------------------------------------------------------------------------------------------------------------------
-| <code>
-| <code>
-| <code>
+| sentence_0 | sentence_1 | label |
+|:----------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
+| <code>Climate change is a hoax invented by the Chinese.</code> | <code>"The concept of global warming was created by and for the Chinese in order to make U.S. manufacturing non-competitive" (Tweet).</code> | <code>1.0</code> |
+| <code>“The oceans, which absorb more than 90% of the extra CO2 pumped into the atmosphere“</code> | <code>Most of the CO 2 taken up by the ocean, which is about 30% of the total released into the atmosphere, forms carbonic acid in equilibrium with bicarbonate.</code> | <code>1.0</code> |
+| <code>“The jet stream forms a boundary between the cold north and the warmer south, but the lower temperature difference means the winds are now weaker.</code> | <code>Therefore, the strong eastward moving jet streams are in part a simple consequence of the fact that the Equator is warmer than the North and South poles.</code> | <code>1.0</code> |
 * Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
 ```json
 {
@@ -317,11 +317,11 @@ You can finetune this model on your own dataset.
 ### Training Logs
 | Epoch | Step | ce-val_average_precision |
 |:-----:|:----:|:------------------------:|
-| 1.0 | 301 | 0.
+| 1.0 | 301 | 0.9536 |
 
 
 ### Training Time
-- **Training**:
+- **Training**: 27.9 seconds
 
 ### Framework Versions
 - Python: 3.12.13
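The updated metrics list two different decision thresholds, and the usage example prints five scores. As a minimal sketch, the snippet below applies both thresholds to those printed scores. It assumes the thresholds live on the same scale as the printed scores (both include negative values, which suggests raw logits rather than probabilities) and that a pair counts as positive when its score is strictly above the threshold. The `classify` helper is hypothetical and is not part of sentence-transformers.

```python
# Scores copied from the usage example in the README diff.
scores = [3.6842, 2.0192, 1.9189, -0.9618, 0.3676]

# Thresholds copied from the updated metrics table.
ACCURACY_THRESHOLD = -0.3896
F1_THRESHOLD = -1.0048


def classify(values, threshold):
    """Hypothetical helper: label a pair 1 when its score exceeds the threshold."""
    return [1 if value > threshold else 0 for value in values]


by_accuracy = classify(scores, ACCURACY_THRESHOLD)
by_f1 = classify(scores, F1_THRESHOLD)

print("accuracy threshold:", by_accuracy)
print("f1 threshold:      ", by_f1)
```

Under these assumptions the fourth pair (score -0.9618) falls between the two thresholds, so it is labelled negative with the accuracy threshold and positive with the F1 threshold. That is in line with the lower F1 threshold favouring recall.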
eval/CrossEncoderClassificationEvaluator_ce-val_results.csv
CHANGED
@@ -1,2 +1,3 @@
 epoch,steps,Accuracy,Accuracy_Threshold,F1,F1_Threshold,Precision,Recall,Average_Precision
 1.0,301,0.8130841121495327,-0.25794858,0.8845265588914549,-0.43757033,0.8362445414847162,0.9387254901960784,0.9335939639352934
+1.0,301,0.8299065420560747,-0.38963243,0.8914285714285715,-1.004754,0.8351177730192719,0.9558823529411765,0.9535681876950206
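To compare the two logged evaluations, the rows of the results CSV can be parsed and subtracted. The sketch below embeds both rows verbatim, computes the per-metric change, and checks that each logged F1 agrees with the harmonic mean of its precision and recall. Both rows carry epoch 1.0 and step 301, so the sketch relies on row order (last row is the newest) rather than on those columns; that ordering is an assumption.

```python
import csv
import io

# Header and both data rows copied verbatim from the CSV diff.
CSV_TEXT = """epoch,steps,Accuracy,Accuracy_Threshold,F1,F1_Threshold,Precision,Recall,Average_Precision
1.0,301,0.8130841121495327,-0.25794858,0.8845265588914549,-0.43757033,0.8362445414847162,0.9387254901960784,0.9335939639352934
1.0,301,0.8299065420560747,-0.38963243,0.8914285714285715,-1.004754,0.8351177730192719,0.9558823529411765,0.9535681876950206
"""

QUALITY_METRICS = ("Accuracy", "F1", "Precision", "Recall", "Average_Precision")

# Every column in this file is numeric, so convert all values to float.
rows = [
    {key: float(value) for key, value in row.items()}
    for row in csv.DictReader(io.StringIO(CSV_TEXT))
]
previous, latest = rows[-2], rows[-1]

# Change in each quality metric between the two logged evaluations.
deltas = {name: latest[name] - previous[name] for name in QUALITY_METRICS}


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


recomputed_f1 = [f1_from(row["Precision"], row["Recall"]) for row in rows]

for name, delta in deltas.items():
    print(f"{name:>18}: {delta:+.4f}")
```

On these two rows, average precision rises by roughly 0.02 and recall by roughly 0.017, while precision drops by about 0.001.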
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:9e0ca103c6d4a5cde8fa579d1f6f9007149e9cede51d84a9a59e524ea135df68
 size 90866404
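The `model.safetensors` entry in this commit is a Git LFS pointer, not the weights themselves: it records a SHA-256 digest and a byte size. A downloaded copy of the weights could be checked against it roughly as sketched below. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the sketch assumes the pointer uses the simple `key value` line layout shown in the diff.

```python
import hashlib

# New pointer contents copied from the diff.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:9e0ca103c6d4a5cde8fa579d1f6f9007149e9cede51d84a9a59e524ea135df68
size 90866404
"""


def parse_lfs_pointer(text):
    """Hypothetical helper: split 'key value' lines into a small dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Hypothetical helper: stream a file and compare its size and digest."""
    hasher = hashlib.new(pointer["algorithm"])
    total = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algorithm"], pointer["size"])
```

Calling `matches_pointer("model.safetensors", pointer)` on a local download would return `True` only when both the size and the digest agree. The size is unchanged across the commit (90866404 bytes), so the digest is the only field in the pointer that distinguishes the new weights.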