ColeH0415 committed on
Commit ed61a6f · verified · 1 Parent(s): 3c79fcd

CE fine-tuned epoch 1/3 best_val=0.5782

Files changed (2)
  1. README.md +39 -47
  2. model.safetensors +1 -1
README.md CHANGED
@@ -28,25 +28,25 @@ model-index:
28
  type: ce-val
29
  metrics:
30
  - type: accuracy
31
- value: 0.6387878787878788
32
  name: Accuracy
33
  - type: accuracy_threshold
34
- value: 0.47375625371932983
35
  name: Accuracy Threshold
36
  - type: f1
37
- value: 0.6906474820143886
38
  name: F1
39
  - type: f1_threshold
40
- value: 0.3463754653930664
41
  name: F1 Threshold
42
  - type: precision
43
- value: 0.5493562231759657
44
  name: Precision
45
  - type: recall
46
- value: 0.9297820823244553
47
  name: Recall
48
  - type: average_precision
49
- value: 0.6675220045050669
50
  name: Average Precision
51
  ---
52
 
@@ -99,25 +99,25 @@ from sentence_transformers import CrossEncoder
99
  model = CrossEncoder("cross_encoder_model_id")
100
  # Get scores for pairs of inputs
101
  pairs = [
102
- ['The last time the planet was even four degrees warmer, Peter Brannen points out in The Ends of the World, his new history of the planet’s major extinction events, the oceans were hundreds of feet higher.', 'Almost all scientists acknowledge that the rate of species loss is greater now than at any time in human history, with extinctions occurring at rates hundreds of times higher than background extinction rates.'],
103
- ['[S]unspot activity on the surface of our star has dropped to a new low.', 'At solar-cycle minimum, the toroidal field is, correspondingly, at minimum strength, sunspots are relatively rare, and the poloidal field is at its maximum strength.'],
104
- ['More money is dedicated within the Department of Homeland Security to climate change than what\'s spent combating "Islamist terrorists radicalizing over the Internet in the United States of America."', 'Homeland security is officially defined by the National Strategy for Homeland Security as "a concerted national effort to prevent terrorist attacks within the United States, reduce America\'s vulnerability to terrorism, and minimize the damage and recover from attacks that do occur".'],
105
- ['Worst-case global heating scenarios may need to be revised upwards in light of a better understanding of the role of clouds, scientists have said.', 'With this information, scientists can produce scenarios of how greenhouse gas emissions may vary in the future.'],
106
- ['Prof Adam Scaife, a climate modelling expert at the UK’s Met Office, said the evidence for a link to shrinking Arctic ice was now good: ‘The consensus points towards that being a real effect.’”', 'Some models of modern climate exhibit Arctic amplification without changes in snow and ice cover.'],
107
  ]
108
  scores = model.predict(pairs)
109
  print(scores)
110
- # [0.7955 0.4447 0.5575 0.3427 0.3421]
111
 
112
  # Or rank different texts based on similarity to a single text
113
  ranks = model.rank(
114
- 'The last time the planet was even four degrees warmer, Peter Brannen points out in The Ends of the World, his new history of the planet’s major extinction events, the oceans were hundreds of feet higher.',
115
  [
116
- 'Almost all scientists acknowledge that the rate of species loss is greater now than at any time in human history, with extinctions occurring at rates hundreds of times higher than background extinction rates.',
117
- 'At solar-cycle minimum, the toroidal field is, correspondingly, at minimum strength, sunspots are relatively rare, and the poloidal field is at its maximum strength.',
118
- 'Homeland security is officially defined by the National Strategy for Homeland Security as "a concerted national effort to prevent terrorist attacks within the United States, reduce America\'s vulnerability to terrorism, and minimize the damage and recover from attacks that do occur".',
119
- 'With this information, scientists can produce scenarios of how greenhouse gas emissions may vary in the future.',
120
- 'Some models of modern climate exhibit Arctic amplification without changes in snow and ice cover.',
121
  ]
122
  )
123
  # [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
@@ -156,15 +156,15 @@ You can finetune this model on your own dataset.
156
  * Dataset: `ce-val`
157
  * Evaluated with [<code>CrossEncoderClassificationEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderClassificationEvaluator)
158
 
159
- | Metric | Value |
160
- |:----------------------|:-----------|
161
- | accuracy | 0.6388 |
162
- | accuracy_threshold | 0.4738 |
163
- | f1 | 0.6906 |
164
- | f1_threshold | 0.3464 |
165
- | precision | 0.5494 |
166
- | recall | 0.9298 |
167
- | **average_precision** | **0.6675** |
168
 
169
  <!--
170
  ## Bias, Risks and Limitations
@@ -190,13 +190,13 @@ You can finetune this model on your own dataset.
190
  | | sentence_0 | sentence_1 | label |
191
  |:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:---------------------------------------------------------------|
192
  | type | string | string | float |
193
- | details | <ul><li>min: 7 tokens</li><li>mean: 26.66 tokens</li><li>max: 80 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 31.89 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.52</li><li>max: 1.0</li></ul> |
194
  * Samples:
195
- | sentence_0 | sentence_1 | label |
196
- |:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
197
- | <code>The last time the planet was even four degrees warmer, Peter Brannen points out in The Ends of the World, his new history of the planet’s major extinction events, the oceans were hundreds of feet higher.</code> | <code>Almost all scientists acknowledge that the rate of species loss is greater now than at any time in human history, with extinctions occurring at rates hundreds of times higher than background extinction rates.</code> | <code>0.0</code> |
198
- | <code>[S]unspot activity on the surface of our star has dropped to a new low.</code> | <code>At solar-cycle minimum, the toroidal field is, correspondingly, at minimum strength, sunspots are relatively rare, and the poloidal field is at its maximum strength.</code> | <code>1.0</code> |
199
- | <code>More money is dedicated within the Department of Homeland Security to climate change than what's spent combating "Islamist terrorists radicalizing over the Internet in the United States of America."</code> | <code>Homeland security is officially defined by the National Strategy for Homeland Security as "a concerted national effort to prevent terrorist attacks within the United States, reduce America's vulnerability to terrorism, and minimize the damage and recover from attacks that do occur".</code> | <code>1.0</code> |
200
  * Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
201
  ```json
202
  {
@@ -318,22 +318,14 @@ You can finetune this model on your own dataset.
318
  ### Training Logs
319
  | Epoch | Step | Training Loss | ce-val_average_precision |
320
  |:------:|:----:|:-------------:|:------------------------:|
321
- | 0.2695 | 500 | 0.7103 | - |
322
- | 0.5391 | 1000 | 0.6983 | - |
323
- | 0.8086 | 1500 | 0.6982 | - |
324
- | -1 | -1 | - | 0.5657 |
325
- | 0.2695 | 500 | 0.6835 | - |
326
- | 0.5391 | 1000 | 0.6840 | - |
327
- | 0.8086 | 1500 | 0.6848 | - |
328
- | -1 | -1 | - | 0.6083 |
329
- | 0.2695 | 500 | 0.6379 | - |
330
- | 0.5391 | 1000 | 0.6424 | - |
331
- | 0.8086 | 1500 | 0.6501 | - |
332
- | -1 | -1 | - | 0.6675 |
333
 
334
 
335
  ### Training Time
336
- - **Training**: 7.1 minutes
337
 
338
  ### Framework Versions
339
  - Python: 3.12.13
 
28
  type: ce-val
29
  metrics:
30
  - type: accuracy
31
+ value: 0.5781818181818181
32
  name: Accuracy
33
  - type: accuracy_threshold
34
+ value: 0.5230777859687805
35
  name: Accuracy Threshold
36
  - type: f1
37
+ value: 0.6700942587832047
38
  name: F1
39
  - type: f1_threshold
40
+ value: 0.4445345997810364
41
  name: F1 Threshold
42
  - type: precision
43
+ value: 0.5185676392572944
44
  name: Precision
45
  - type: recall
46
+ value: 0.9467312348668281
47
  name: Recall
48
  - type: average_precision
49
+ value: 0.5669943455221101
50
  name: Average Precision
51
  ---
52
 
 
99
  model = CrossEncoder("cross_encoder_model_id")
100
  # Get scores for pairs of inputs
101
  pairs = [
102
+ ['If every house in Florida had a solar-heated water tank, that would eliminate consumption by 17 percent.', 'Solar water heating (SWH) is the conversion of sunlight into heat for water heating using a solar thermal collector.'],
103
+ ['Modellers assume carbon dioxide drives climate change', 'They absorb a huge amount of carbon dioxide, combating climate change.'],
104
+ ['Some, however, bristle at the belief that because floods and storms have always occurred, they should not be linked to climate change”', 'Although some studies have reported an increase in frequency and intensity of extremes in rainfall during the past 40–50 years, their attribution to global warming is not established."'],
105
+ ['The tax-payer funded National Oceanic and Atmospheric Administration (NOAA) has become mired in fresh global warming data scandal involving numbers for the Great Lakes region that substantially ramp up averages."', 'Feds close 600 weather stations amid criticism they\'re situated to report warming".'],
106
+ ['The acceleration is making some scientists fear that Antarctica’s ice sheet may have entered the early stages of an unstoppable disintegration.', 'Scientists have found that the flow of these ice streams has accelerated in recent years, and suggested that if they were to melt, global sea levels would rise by 1 to 2\xa0m (3\xa0ft 3\xa0in to 6\xa0ft 7\xa0in), destabilising the entire West Antarctic Ice Sheet and perhaps sections of the East Antarctic Ice Sheet.'],
107
  ]
108
  scores = model.predict(pairs)
109
  print(scores)
110
+ # [0.4903 0.4453 0.5899 0.4856 0.5753]
111
 
112
  # Or rank different texts based on similarity to a single text
113
  ranks = model.rank(
114
+ 'If every house in Florida had a solar-heated water tank, that would eliminate consumption by 17 percent.',
115
  [
116
+ 'Solar water heating (SWH) is the conversion of sunlight into heat for water heating using a solar thermal collector.',
117
+ 'They absorb a huge amount of carbon dioxide, combating climate change.',
118
+ 'Although some studies have reported an increase in frequency and intensity of extremes in rainfall during the past 40–50 years, their attribution to global warming is not established."',
119
+ 'Feds close 600 weather stations amid criticism they\'re situated to report warming".',
120
+ 'Scientists have found that the flow of these ice streams has accelerated in recent years, and suggested that if they were to melt, global sea levels would rise by 1 to 2\xa0m (3\xa0ft 3\xa0in to 6\xa0ft 7\xa0in), destabilising the entire West Antarctic Ice Sheet and perhaps sections of the East Antarctic Ice Sheet.',
121
  ]
122
  )
123
  # [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
 
156
  * Dataset: `ce-val`
157
  * Evaluated with [<code>CrossEncoderClassificationEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderClassificationEvaluator)
158
 
159
+ | Metric | Value |
160
+ |:----------------------|:----------|
161
+ | accuracy | 0.5782 |
162
+ | accuracy_threshold | 0.5231 |
163
+ | f1 | 0.6701 |
164
+ | f1_threshold | 0.4445 |
165
+ | precision | 0.5186 |
166
+ | recall | 0.9467 |
167
+ | **average_precision** | **0.567** |
168
 
169
  <!--
170
  ## Bias, Risks and Limitations
 
190
  | | sentence_0 | sentence_1 | label |
191
  |:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:---------------------------------------------------------------|
192
  | type | string | string | float |
193
+ | details | <ul><li>min: 7 tokens</li><li>mean: 25.97 tokens</li><li>max: 80 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 31.89 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.47</li><li>max: 1.0</li></ul> |
194
  * Samples:
195
+ | sentence_0 | sentence_1 | label |
196
+ |:----------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
197
+ | <code>If every house in Florida had a solar-heated water tank, that would eliminate consumption by 17 percent.</code> | <code>Solar water heating (SWH) is the conversion of sunlight into heat for water heating using a solar thermal collector.</code> | <code>0.0</code> |
198
+ | <code>Modellers assume carbon dioxide drives climate change</code> | <code>They absorb a huge amount of carbon dioxide, combating climate change.</code> | <code>0.0</code> |
199
+ | <code>Some, however, bristle at the belief that because floods and storms have always occurred, they should not be linked to climate change”</code> | <code>Although some studies have reported an increase in frequency and intensity of extremes in rainfall during the past 40–50 years, their attribution to global warming is not established."</code> | <code>1.0</code> |
200
  * Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
201
  ```json
202
  {
 
318
  ### Training Logs
319
  | Epoch | Step | Training Loss | ce-val_average_precision |
320
  |:------:|:----:|:-------------:|:------------------------:|
321
+ | 0.2695 | 500 | 0.6958 | - |
322
+ | 0.5391 | 1000 | 0.6883 | - |
323
+ | 0.8086 | 1500 | 0.6841 | - |
324
+ | -1 | -1 | - | 0.5670 |
 
 
 
 
 
 
 
 
325
 
326
 
327
  ### Training Time
328
+ - **Training**: 6.8 minutes
329
 
330
  ### Framework Versions
331
  - Python: 3.12.13
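
Review note: the README.md diff replaces one set of `ce-val` metrics with another. The sketch below is not part of the commit. It copies the removed (`-`) and added (`+`) values from the diff, recomputes F1 from the reported precision and recall as a consistency check, and prints the change in each metric. The names `previous`, `current`, `f1_from`, and `deltas` are made up for this illustration.

```python
# Metric values copied from the README.md diff:
# "previous" holds the removed (-) lines, "current" the added (+) lines.
previous = {
    "accuracy": 0.6387878787878788,
    "f1": 0.6906474820143886,
    "precision": 0.5493562231759657,
    "recall": 0.9297820823244553,
    "average_precision": 0.6675220045050669,
}
current = {
    "accuracy": 0.5781818181818181,
    "f1": 0.6700942587832047,
    "precision": 0.5185676392572944,
    "recall": 0.9467312348668281,
    "average_precision": 0.5669943455221101,
}


def f1_from(precision, recall):
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    return 2 * precision * recall / (precision + recall)


def deltas(old, new):
    """Per-metric change, computed as new minus old."""
    return {name: new[name] - old[name] for name in old}


for label, card in (("previous", previous), ("current", current)):
    recomputed = f1_from(card["precision"], card["recall"])
    print(f"{label}: reported f1={card['f1']:.4f}, recomputed f1={recomputed:.4f}")

for name, change in deltas(previous, current).items():
    print(f"{name}: {change:+.4f}")
```

A negative delta means the value in the new card is lower than the one it replaces. Note that precision and recall are reported at the F1 threshold, so the recomputed F1 is only expected to match the `f1` row, not the accuracy row.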
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:5a2701e5f38f5d50dbd4c46ce52b54e777353e7acb8e142bd9753b4caa41bf8c
3
  size 737716172
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:17168fd353619c37e372ebd2c91567ea4eb245987472a8976943c375e76b1e71
3
  size 737716172
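
Review note: the `model.safetensors` diff only touches a Git LFS pointer, where the `oid` changes and the `size` stays the same. The sketch below is not part of the commit. It parses the two pointer texts shown in the diff and, assuming the weights file has been downloaded locally, checks a file against a pointer by size and SHA-256. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and SHA-256 in `pointer`."""
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so a large weights file is never fully held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# Pointer contents copied from the diff (old side, then new side).
old_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:5a2701e5f38f5d50dbd4c46ce52b54e777353e7acb8e142bd9753b4caa41bf8c
size 737716172
""")
new_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:17168fd353619c37e372ebd2c91567ea4eb245987472a8976943c375e76b1e71
size 737716172
""")

print("same size:", old_pointer["size"] == new_pointer["size"])
print("different content hash:", old_pointer["digest"] != new_pointer["digest"])
```

An unchanged size with a changed hash is what one would expect when weights are updated without changing the architecture, though the diff itself does not state this. To verify a download, call `matches_pointer("model.safetensors", new_pointer)` with the local path of the file.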