ColeH0415 committed on
Commit 5a4dfa1 · verified
1 Parent(s): 2307357

CE fine-tuned epoch 1/3 best_val=0.5370

Files changed (2)
  1. README.md +39 -50
  2. model.safetensors +1 -1
README.md CHANGED
@@ -28,25 +28,25 @@ model-index:
  type: ce-val
  metrics:
  - type: accuracy
- value: 0.6896969696969697
  name: Accuracy
  - type: accuracy_threshold
- value: 0.5847750902175903
  name: Accuracy Threshold
  - type: f1
- value: 0.7074707470747076
  name: F1
  - type: f1_threshold
- value: 0.3505881428718567
  name: F1 Threshold
  - type: precision
- value: 0.5630372492836676
  name: Precision
  - type: recall
- value: 0.9515738498789347
  name: Recall
  - type: average_precision
- value: 0.7391973708035351
  name: Average Precision
  ---

@@ -99,25 +99,25 @@ from sentence_transformers import CrossEncoder
  model = CrossEncoder("cross_encoder_model_id")
  # Get scores for pairs of inputs
  pairs = [
- ['The last time the planet was even four degrees warmer, Peter Brannen points out in The Ends of the World, his new history of the planet’s major extinction events, the oceans were hundreds of feet higher.', 'Almost all scientists acknowledge that the rate of species loss is greater now than at any time in human history, with extinctions occurring at rates hundreds of times higher than background extinction rates.'],
- ['[S]unspot activity on the surface of our star has dropped to a new low.', 'This surface activity produces starspots, which are regions of strong magnetic fields and lower than normal surface temperatures.'],
- ['More money is dedicated within the Department of Homeland Security to climate change than what\'s spent combating "Islamist terrorists radicalizing over the Internet in the United States of America."', "The center works on the Internet's routing infrastructure (the SPRI program) and Domain Name System (DNSSEC), identity theft and other online criminal activity (ITTC), Internet traffic and networks research (PREDICT datasets and the DETER testbed), Department of Defense and HSARPA exercises (Livewire and Determined Promise), and wireless security in cooperation with Canada."],
- ['Worst-case global heating scenarios may need to be revised upwards in light of a better understanding of the role of clouds, scientists have said.', 'Climate model projections summarized in the report indicated that during the 21st century the global surface temperature is likely to rise a further 0.3 to 1.7\xa0°C (0.5 to 3.1\xa0°F) in a moderate scenario, or as much as 2.6 to 4.8\xa0°C (4.7 to 8.6\xa0°F) in an extreme scenario, depending on the rate of future greenhouse gas emissions and on climate feedback effects.'],
- ['Prof Adam Scaife, a climate modelling expert at the UK’s Met Office, said the evidence for a link to shrinking Arctic ice was now good: ‘The consensus points towards that being a real effect.’”', 'Some models of modern climate exhibit Arctic amplification without changes in snow and ice cover.'],
  ]
  scores = model.predict(pairs)
  print(scores)
- # [0.7715 0.6352 0.7843 0.844 0.4491]

  # Or rank different texts based on similarity to a single text
  ranks = model.rank(
  'The last time the planet was even four degrees warmer, Peter Brannen points out in The Ends of the World, his new history of the planet’s major extinction events, the oceans were hundreds of feet higher.',
  [
- 'Almost all scientists acknowledge that the rate of species loss is greater now than at any time in human history, with extinctions occurring at rates hundreds of times higher than background extinction rates.',
- 'This surface activity produces starspots, which are regions of strong magnetic fields and lower than normal surface temperatures.',
- "The center works on the Internet's routing infrastructure (the SPRI program) and Domain Name System (DNSSEC), identity theft and other online criminal activity (ITTC), Internet traffic and networks research (PREDICT datasets and the DETER testbed), Department of Defense and HSARPA exercises (Livewire and Determined Promise), and wireless security in cooperation with Canada.",
- 'Climate model projections summarized in the report indicated that during the 21st century the global surface temperature is likely to rise a further 0.3 to 1.7\xa0°C (0.5 to 3.1\xa0°F) in a moderate scenario, or as much as 2.6 to 4.8\xa0°C (4.7 to 8.6\xa0°F) in an extreme scenario, depending on the rate of future greenhouse gas emissions and on climate feedback effects.',
- 'Some models of modern climate exhibit Arctic amplification without changes in snow and ice cover.',
  ]
  )
  # [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
@@ -158,13 +158,13 @@ You can finetune this model on your own dataset.

  | Metric | Value |
  |:----------------------|:-----------|
- | accuracy | 0.6897 |
- | accuracy_threshold | 0.5848 |
- | f1 | 0.7075 |
- | f1_threshold | 0.3506 |
- | precision | 0.563 |
- | recall | 0.9516 |
- | **average_precision** | **0.7392** |

  <!--
  ## Bias, Risks and Limitations
@@ -190,13 +190,13 @@ You can finetune this model on your own dataset.
190
  | | sentence_0 | sentence_1 | label |
191
  |:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:---------------------------------------------------------------|
192
  | type | string | string | float |
193
- | details | <ul><li>min: 7 tokens</li><li>mean: 26.66 tokens</li><li>max: 80 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 31.75 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.52</li><li>max: 1.0</li></ul> |
194
  * Samples:
195
- | sentence_0 | sentence_1 | label |
196
- |:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
197
- | <code>The last time the planet was even four degrees warmer, Peter Brannen points out in The Ends of the World, his new history of the planet’s major extinction events, the oceans were hundreds of feet higher.</code> | <code>Almost all scientists acknowledge that the rate of species loss is greater now than at any time in human history, with extinctions occurring at rates hundreds of times higher than background extinction rates.</code> | <code>0.0</code> |
198
- | <code>[S]unspot activity on the surface of our star has dropped to a new low.</code> | <code>This surface activity produces starspots, which are regions of strong magnetic fields and lower than normal surface temperatures.</code> | <code>1.0</code> |
199
- | <code>More money is dedicated within the Department of Homeland Security to climate change than what's spent combating "Islamist terrorists radicalizing over the Internet in the United States of America."</code> | <code>The center works on the Internet's routing infrastructure (the SPRI program) and Domain Name System (DNSSEC), identity theft and other online criminal activity (ITTC), Internet traffic and networks research (PREDICT datasets and the DETER testbed), Department of Defense and HSARPA exercises (Livewire and Determined Promise), and wireless security in cooperation with Canada.</code> | <code>1.0</code> |
200
  * Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
201
  ```json
202
  {
@@ -208,8 +208,8 @@ You can finetune this model on your own dataset.
  ### Training Hyperparameters
  #### Non-Default Hyperparameters

- - `per_device_train_batch_size`: 4
- - `per_device_eval_batch_size`: 4
  - `num_train_epochs`: 1
  - `fp16`: True

@@ -218,8 +218,8 @@ You can finetune this model on your own dataset.

  - `do_predict`: False
  - `prediction_loss_only`: True
- - `per_device_train_batch_size`: 4
- - `per_device_eval_batch_size`: 4
  - `gradient_accumulation_steps`: 1
  - `eval_accumulation_steps`: None
  - `torch_empty_cache_steps`: None
@@ -316,24 +316,13 @@ You can finetune this model on your own dataset.
  </details>

  ### Training Logs
- | Epoch | Step | Training Loss | ce-val_average_precision |
- |:------:|:----:|:-------------:|:------------------------:|
- | 0.2695 | 500 | 0.6958 | - |
- | 0.5391 | 1000 | 0.6883 | - |
- | 0.8086 | 1500 | 0.6841 | - |
- | -1 | -1 | - | 0.5670 |
- | 0.2695 | 500 | 0.6741 | - |
- | 0.5391 | 1000 | 0.6662 | - |
- | 0.8086 | 1500 | 0.6504 | - |
- | -1 | -1 | - | 0.6569 |
- | 0.2695 | 500 | 0.6129 | - |
- | 0.5391 | 1000 | 0.6091 | - |
- | 0.8086 | 1500 | 0.5937 | - |
- | -1 | -1 | - | 0.7392 |


  ### Training Time
- - **Training**: 6.8 minutes

  ### Framework Versions
  - Python: 3.12.13
 
  type: ce-val
  metrics:
  - type: accuracy
+ value: 0.536969696969697
  name: Accuracy
  - type: accuracy_threshold
+ value: 0.4739919900894165
  name: Accuracy Threshold
  - type: f1
+ value: 0.6682966585167075
  name: F1
  - type: f1_threshold
+ value: 0.44792813062667847
  name: F1 Threshold
  - type: precision
+ value: 0.5036855036855037
  name: Precision
  - type: recall
+ value: 0.9927360774818402
  name: Recall
  - type: average_precision
+ value: 0.5140790363376013
  name: Average Precision
  ---

 
  model = CrossEncoder("cross_encoder_model_id")
  # Get scores for pairs of inputs
  pairs = [
+ ['The last time the planet was even four degrees warmer, Peter Brannen points out in The Ends of the World, his new history of the planet’s major extinction events, the oceans were hundreds of feet higher.', 'As a result, the mean annual air temperature at sea level decreases by about 0.4\xa0°C (0.7\xa0°F) per degree of latitude from the equator.'],
+ ["Empirical measurements of the Earth's heat content show the planet is still accumulating heat and global warming is still happening.", 'The global average and combined land and ocean surface temperature, show a warming of 0.85 [0.65 to 1.06] °C, in the period 1880 to 2012, based on multiple independently produced datasets.'],
+ ['Numerous case studies on both regional and global scales have determined that renewable energy, if properly implemented, can provide baseload power.', 'Currently, there are challenges implementing it on a global scale because there is no government with that power.'],
+ ['“Typically, in such an attribution study, scientists will use sets of climate models one set including the factors that drive human global warming and the other including purely “natural” factors and see if an event like the one in question is more likely to occur in the first set of models.', 'Attribution of the temperature change to natural or anthropogenic (i.e., human-induced) factors is an important question: see global warming and attribution of recent climate change.'],
+ ['The sun was warming up then, but the sun hasn’t been warming since 1970.', '"Sun\'s Shifts May Cause Global Warming".'],
  ]
  scores = model.predict(pairs)
  print(scores)
+ # [0.4804 0.4864 0.4651 0.4637 0.4819]

  # Or rank different texts based on similarity to a single text
  ranks = model.rank(
  'The last time the planet was even four degrees warmer, Peter Brannen points out in The Ends of the World, his new history of the planet’s major extinction events, the oceans were hundreds of feet higher.',
  [
+ 'As a result, the mean annual air temperature at sea level decreases by about 0.4\xa0°C (0.7\xa0°F) per degree of latitude from the equator.',
+ 'The global average and combined land and ocean surface temperature, show a warming of 0.85 [0.65 to 1.06] °C, in the period 1880 to 2012, based on multiple independently produced datasets.',
+ 'Currently, there are challenges implementing it on a global scale because there is no government with that power.',
+ 'Attribution of the temperature change to natural or anthropogenic (i.e., human-induced) factors is an important question: see global warming and attribution of recent climate change.',
+ '"Sun\'s Shifts May Cause Global Warming".',
  ]
  )
  # [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
 

  | Metric | Value |
  |:----------------------|:-----------|
+ | accuracy | 0.537 |
+ | accuracy_threshold | 0.474 |
+ | f1 | 0.6683 |
+ | f1_threshold | 0.4479 |
+ | precision | 0.5037 |
+ | recall | 0.9927 |
+ | **average_precision** | **0.5141** |

  <!--
  ## Bias, Risks and Limitations
 
  | | sentence_0 | sentence_1 | label |
  |:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:---------------------------------------------------------------|
  | type | string | string | float |
+ | details | <ul><li>min: 7 tokens</li><li>mean: 26.35 tokens</li><li>max: 66 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 30.52 tokens</li><li>max: 145 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.47</li><li>max: 1.0</li></ul> |
  * Samples:
+ | sentence_0 | sentence_1 | label |
+ |:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
+ | <code>The last time the planet was even four degrees warmer, Peter Brannen points out in The Ends of the World, his new history of the planet’s major extinction events, the oceans were hundreds of feet higher.</code> | <code>As a result, the mean annual air temperature at sea level decreases by about 0.4 °C (0.7 °F) per degree of latitude from the equator.</code> | <code>1.0</code> |
+ | <code>Empirical measurements of the Earth's heat content show the planet is still accumulating heat and global warming is still happening.</code> | <code>The global average and combined land and ocean surface temperature, show a warming of 0.85 [0.65 to 1.06] °C, in the period 1880 to 2012, based on multiple independently produced datasets.</code> | <code>1.0</code> |
+ | <code>Numerous case studies on both regional and global scales have determined that renewable energy, if properly implemented, can provide baseload power.</code> | <code>Currently, there are challenges implementing it on a global scale because there is no government with that power.</code> | <code>0.0</code> |
  * Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
  ```json
  {
 
  ### Training Hyperparameters
  #### Non-Default Hyperparameters

+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
  - `num_train_epochs`: 1
  - `fp16`: True
 
 

  - `do_predict`: False
  - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
  - `gradient_accumulation_steps`: 1
  - `eval_accumulation_steps`: None
  - `torch_empty_cache_steps`: None
 
  </details>

  ### Training Logs
+ | Epoch | Step | ce-val_average_precision |
+ |:-----:|:----:|:------------------------:|
+ | -1 | -1 | 0.5141 |


  ### Training Time
+ - **Training**: 2.2 minutes

  ### Framework Versions
  - Python: 3.12.13
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a1ca6dcff0430e6f41bdfdbd69ec03387cbbb18662752f5c67a21eaf3d52ef85
  size 737716172
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:164bb394a54a6ca6a0e682ccee0bf2f31752f5e8c13b12eb17556e5c825dd634
  size 737716172
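
The updated metrics in README.md can be sanity-checked with a few lines of plain Python. This is a minimal sketch, not part of the commit: it recomputes F1 as the harmonic mean of the reported precision and recall, then applies the two reported thresholds to the five example scores from the usage snippet. The `binarize` helper and the "score at or above threshold means positive" convention are assumptions made for illustration, and the five example scores are not necessarily drawn from the `ce-val` set.

```python
# Metric values copied from the updated README.md in this commit.
precision = 0.5036855036855037
recall = 0.9927360774818402
reported_f1 = 0.6682966585167075

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

# Example scores and thresholds copied from the updated README.md.
scores = [0.4804, 0.4864, 0.4651, 0.4637, 0.4819]
f1_threshold = 0.44792813062667847
accuracy_threshold = 0.4739919900894165


def binarize(values, threshold):
    """Hypothetical helper: label 1 when score >= threshold (assumed convention)."""
    return [int(v >= threshold) for v in values]


labels_f1 = binarize(scores, f1_threshold)
labels_acc = binarize(scores, accuracy_threshold)

print(f"recomputed F1 = {f1:.4f} (reported {reported_f1:.4f})")
print("labels at f1_threshold:", labels_f1)
print("labels at accuracy_threshold:", labels_acc)
```

Under these assumptions, every example score clears `f1_threshold`, so all five pairs are labeled positive at that cutoff. That is consistent with the reported combination of recall near 1.0 and precision near 0.5.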