vdmbrsv committed on
Commit f30f9b2 · verified · 1 Parent(s): fa7aca3

Upload model

best_results.json ADDED
@@ -0,0 +1,13 @@
+ {
+   "sts_spearman": 0.7992766546129851,
+   "sts_pearson": 0.7723631373159269,
+   "retrieval_recall_at_1": 0.668,
+   "retrieval_recall_at_5": 0.936,
+   "retrieval_recall_at_10": 0.976,
+   "nli_accuracy": 0.5,
+   "nli_similarity": 0.7214555740356445,
+   "paraphrase_accuracy": 0.5,
+   "paraphrase_f1": 0.6666666666666666,
+   "paraphrase_similarity": 0.8633242845535278,
+   "composite_score": 0.7971049939731593
+ }
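Reviewer note: a minimal sketch of how the epoch behind best_results.json might be located in training_history.json. It assumes that the history file is a JSON list of per-epoch records shaped like the ones in this diff, and that "best" means the highest eval_results.composite_score; the commit does not state the selection rule, so both are assumptions. The helper names best_epoch and load_history are hypothetical and are not part of the uploaded repository.

```python
import json
from pathlib import Path


def best_epoch(history):
    """Return the record with the highest eval composite_score.

    Assumption: each record has the shape seen in training_history.json,
    i.e. {"epoch": int, "eval_results": {"composite_score": float, ...}}.
    """
    return max(history, key=lambda rec: rec["eval_results"]["composite_score"])


def load_history(path):
    """Load the list of per-epoch records from a JSON file (hypothetical helper)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    # Two records taken from this diff (epochs 1 and 2), trimmed to the
    # fields the sketch actually reads.
    sample = [
        {"epoch": 1, "eval_results": {"composite_score": 0.7549336039402753}},
        {"epoch": 2, "eval_results": {"composite_score": 0.7644223151931796}},
    ]
    print(best_epoch(sample)["epoch"])
```

With the full file available locally, best_epoch(load_history("training_history.json")) would return the record to compare against best_results.json.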
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:3bb65b6068d36dc54866d2c620a2ec0416e28af5f46cabd84edb29d9bcbb2d5d
+ oid sha256:d50a170ba9a88a50f8b72b92946ad32211f1d42f16bee9abb8a7019f3e44f8f8
  size 4082832
training_history.json ADDED
@@ -0,0 +1,2496 @@
+ [
+   {
+     "epoch": 1,
+     "eval_results": {
+       "sts_spearman": 0.8289338745472172,
+       "sts_pearson": 0.8098611426918698,
+       "retrieval_recall_at_1": 0.428,
+       "retrieval_recall_at_5": 0.746,
+       "retrieval_recall_at_10": 0.85,
+       "nli_accuracy": 0.5,
+       "nli_similarity": 0.7193673849105835,
+       "paraphrase_accuracy": 0.5,
+       "paraphrase_f1": 0.6666666666666666,
+       "paraphrase_similarity": 0.8657987713813782,
+       "composite_score": 0.7549336039402753
+     },
+     "losses": {
+       "sts": {
+         "total": 0.09194075802098149,
+         "distill": 0.07957400041429893,
+         "task": 0.2431019875018493,
+         "count": 23
+       },
+       "retrieval": {
+         "total": 0.268772747288359,
+         "distill": 0.0272021861945061,
+         "task": 1.241009996292439,
+         "count": 47
+       },
+       "nli": {
+         "total": 0.3314921881290192,
+         "distill": 0.01526121862549731,
+         "task": 2.3350987434387207,
+         "count": 47
+       },
+       "paraphrase": {
+         "total": 0.060653433203697205,
+         "distill": 0.008428953401744366,
+         "task": 0.8303535401821136,
+         "count": 10
+       }
+     },
+     "distill_weight": 0.3
+   },
+   {
+     "epoch": 2,
+     "eval_results": {
+       "sts_spearman": 0.8275112970530258,
+       "sts_pearson": 0.8086337285455611,
+       "retrieval_recall_at_1": 0.452,
+       "retrieval_recall_at_5": 0.78,
+       "retrieval_recall_at_10": 0.876,
+       "nli_accuracy": 0.5,
+       "nli_similarity": 0.7263807654380798,
+       "paraphrase_accuracy": 0.5,
+       "paraphrase_f1": 0.6666666666666666,
+       "paraphrase_similarity": 0.8608207106590271,
+       "composite_score": 0.7644223151931796
+     },
+     "losses": {
+       "sts": {
+         "total": 0.05613501726285271,
+         "distill": 0.00838314470551584,
+         "task": 0.19135420996209848,
+         "count": 23
+       },
+       "retrieval": {
+         "total": 0.21614952480539362,
+         "distill": 0.006375962016271784,
+         "task": 1.019319458210722,
+         "count": 47
+       },
+       "nli": {
+         "total": 0.30814098867964235,
+         "distill": 0.004959808186964786,
+         "task": 2.188524269043131,
+         "count": 47
+       },
+       "paraphrase": {
+         "total": 0.051446066424250605,
+         "distill": 0.0038678635377436877,
+         "task": 0.7177851617336273,
+         "count": 10
+       }
+     },
+     "distill_weight": 0.2994
+   },
+   {
+     "epoch": 3,
+     "eval_results": {
+       "sts_spearman": 0.827182891132439,
+       "sts_pearson": 0.809022237364101,
+       "retrieval_recall_at_1": 0.47,
+       "retrieval_recall_at_5": 0.8,
+       "retrieval_recall_at_10": 0.9,
+       "nli_accuracy": 0.5,
+       "nli_similarity": 0.7274820804595947,
+       "paraphrase_accuracy": 0.5,
+       "paraphrase_f1": 0.6666666666666666,
+       "paraphrase_similarity": 0.8593040704727173,
+       "composite_score": 0.7702581122328862
+     },
+     "losses": {
+       "sts": {
+         "total": 0.05474994104841481,
+         "distill": 0.0038318179709755855,
+         "task": 0.191118774206742,
+         "count": 23
+       },
+       "retrieval": {
+         "total": 0.1865766501807152,
+         "distill": 0.0033635680979870737,
+         "task": 0.8821620687525323,
+         "count": 47
+       },
+       "nli": {
+         "total": 0.2887607974574921,
+         "distill": 0.002587009018207801,
+         "task": 2.0535353625074344,
+         "count": 47
+       },
+       "paraphrase": {
+         "total": 0.04753798432648182,
+         "distill": 0.0023388142231851815,
+         "task": 0.6679855525493622,
+         "count": 10
+       }
+     },
+     "distill_weight": 0.2988
+   },
+   {
+     "epoch": 4,
+     "eval_results": {
+       "sts_spearman": 0.8273207309309314,
+       "sts_pearson": 0.8086748676612024,
+       "retrieval_recall_at_1": 0.496,
+       "retrieval_recall_at_5": 0.822,
+       "retrieval_recall_at_10": 0.914,
+       "nli_accuracy": 0.5,
+       "nli_similarity": 0.7310156226158142,
+       "paraphrase_accuracy": 0.5,
+       "paraphrase_f1": 0.6666666666666666,
+       "paraphrase_similarity": 0.8595395684242249,
+       "composite_score": 0.7769270321321324
+     },
+     "losses": {
+       "sts": {
+         "total": 0.05435633513590564,
+         "distill": 0.0022464350635266824,
+         "task": 0.19124554292015408,
+         "count": 23
+       },
+       "retrieval": {
+         "total": 0.16951107297171938,
+         "distill": 0.0021174795668017356,
+         "task": 0.8021261704728958,
+         "count": 47
+       },
+       "nli": {
+         "total": 0.280746913019647,
+         "distill": 0.0015788413361309374,
+         "task": 1.9968374790029322,
+         "count": 47
+       },
+       "paraphrase": {
+         "total": 0.044596540927886966,
+         "distill": 0.0015972736524417996,
+         "task": 0.628672468662262,
+         "count": 10
+       }
+     },
+     "distill_weight": 0.29819999999999997
+   },
+   {
+     "epoch": 5,
+     "eval_results": {
+       "sts_spearman": 0.8252767207778248,
+       "sts_pearson": 0.8062386165350521,
+       "retrieval_recall_at_1": 0.51,
+       "retrieval_recall_at_5": 0.828,
+       "retrieval_recall_at_10": 0.922,
+       "nli_accuracy": 0.5,
+       "nli_similarity": 0.7306257486343384,
+       "paraphrase_accuracy": 0.5,
+       "paraphrase_f1": 0.6666666666666666,
+       "paraphrase_similarity": 0.8570234775543213,
+       "composite_score": 0.7777050270555791
+     },
+     "losses": {
+       "sts": {
+         "total": 0.05338770879999451,
+         "distill": 0.0014984115867106163,
+         "task": 0.1884317449901415,
+         "count": 23
+       },
+       "retrieval": {
+         "total": 0.15480095338314137,
+         "distill": 0.0014994340983437414,
+         "task": 0.7325110093076178,
+         "count": 47
+       },
+       "nli": {
+         "total": 0.27232939100011866,
+         "distill": 0.0010938003134140944,
+         "task": 1.936246319020048,
+         "count": 47
+       },
+       "paraphrase": {
+         "total": 0.04112741462886334,
+         "distill": 0.0012359182350337506,
+         "task": 0.5802905261516571,
+         "count": 10
+       }
+     },
+     "distill_weight": 0.2976
+   },
+   {
+     "epoch": 6,
+     "eval_results": {
+       "sts_spearman": 0.8248752970132536,
+       "sts_pearson": 0.805502215995378,
+       "retrieval_recall_at_1": 0.524,
+       "retrieval_recall_at_5": 0.83,
+       "retrieval_recall_at_10": 0.922,
+       "nli_accuracy": 0.5,
+       "nli_similarity": 0.731809675693512,
+       "paraphrase_accuracy": 0.5,
+       "paraphrase_f1": 0.6666666666666666,
+       "paraphrase_similarity": 0.8566122651100159,
+       "composite_score": 0.7781043151732935
+     },
+     "losses": {
+       "sts": {
+         "total": 0.05317179696715396,
+         "distill": 0.0011427759790145185,
+         "task": 0.1878819122262623,
+         "count": 23
+       },
+       "retrieval": {
+         "total": 0.14514635503292084,
+         "distill": 0.001184225406874209,
+         "task": 0.6865559283723223,
+         "count": 47
+       },
+       "nli": {
+         "total": 0.2656283996840741,
+         "distill": 0.0008388215606596242,
+         "task": 1.8874770808727184,
+         "count": 47
+       },
+       "paraphrase": {
+         "total": 0.038727284595370295,
+         "distill": 0.0010345598682761192,
+         "task": 0.546515229344368,
+         "count": 10
+       }
+     },
+     "distill_weight": 0.297
+   },
+   {
+     "epoch": 7,
+     "eval_results": {
+       "sts_spearman": 0.82402492098156,
+       "sts_pearson": 0.8039468293688954,
+       "retrieval_recall_at_1": 0.538,
+       "retrieval_recall_at_5": 0.842,
+       "retrieval_recall_at_10": 0.926,
+       "nli_accuracy": 0.5,
+       "nli_similarity": 0.7345654368400574,
+       "paraphrase_accuracy": 0.5,
+       "paraphrase_f1": 0.6666666666666666,
+       "paraphrase_similarity": 0.8579738140106201,
+       "composite_score": 0.7812791271574466
+     },
+     "losses": {
+       "sts": {
+         "total": 0.052247073501348495,
+         "distill": 0.0009337503781906613,
+         "task": 0.18465858309165292,
+         "count": 23
+       },
+       "retrieval": {
+         "total": 0.13885294297274123,
+         "distill": 0.0010064611387280548,
+         "task": 0.6564081304884971,
+         "count": 47
+       },
+       "nli": {
+         "total": 0.2596762865147692,
+         "distill": 0.0006988341003616756,
+         "task": 1.8438684280882491,
+         "count": 47
+       },
+       "paraphrase": {
+         "total": 0.03767538573592901,
+         "distill": 0.0009226226538885385,
+         "task": 0.5315793305635452,
+         "count": 10
+       }
+     },
+     "distill_weight": 0.2964
+   },
+   {
+     "epoch": 8,
+     "eval_results": {
+       "sts_spearman": 0.823563554115123,
+       "sts_pearson": 0.8030950418925904,
+       "retrieval_recall_at_1": 0.558,
+       "retrieval_recall_at_5": 0.85,
+       "retrieval_recall_at_10": 0.936,
+       "nli_accuracy": 0.5,
+       "nli_similarity": 0.7343195080757141,
+       "paraphrase_accuracy": 0.5,
+       "paraphrase_f1": 0.6666666666666666,
+       "paraphrase_similarity": 0.8582324385643005,
+       "composite_score": 0.7834484437242282
+     },
+     "losses": {
+       "sts": {
+         "total": 0.05065638410008472,
+         "distill": 0.0008207539668428185,
+         "task": 0.1789747496014056,
+         "count": 23
+       },
+       "retrieval": {
+         "total": 0.13189882991161753,
+         "distill": 0.0008978993072115044,
+         "task": 0.6230863763930949,
+         "count": 47
+       },
+       "nli": {
+         "total": 0.2552652054644646,
+         "distill": 0.0006161105479708218,
+         "task": 1.8111542945212507,
+         "count": 47
+       },
+       "paraphrase": {
+         "total": 0.03772428296506405,
+         "distill": 0.0008593377890065313,
+         "task": 0.5320944607257843,
+         "count": 10
+       }
+     },
+     "distill_weight": 0.2958
+   },
+   {
+     "epoch": 9,
+     "eval_results": {
+       "sts_spearman": 0.8224264234371386,
+       "sts_pearson": 0.801554516870732,
+       "retrieval_recall_at_1": 0.57,
+       "retrieval_recall_at_5": 0.86,
+       "retrieval_recall_at_10": 0.94,
+       "nli_accuracy": 0.5,
+       "nli_similarity": 0.733967125415802,
+       "paraphrase_accuracy": 0.5,
+       "paraphrase_f1": 0.6666666666666666,
+       "paraphrase_similarity": 0.8588850498199463,
+       "composite_score": 0.785879878385236
+     },
+     "losses": {
+       "sts": {
+         "total": 0.049955147601988006,
+         "distill": 0.0007517727663861992,
+         "task": 0.17640900611877441,
+         "count": 23
+       },
+       "retrieval": {
+         "total": 0.12419923918044314,
+         "distill": 0.0008363095508452426,
+         "task": 0.5862294866683635,
+         "count": 47
+       },
+       "nli": {
+         "total": 0.2503235790323704,
+         "distill": 0.0005653487615029704,
+         "task": 1.7746644527354138,
+         "count": 47
+       },
+       "paraphrase": {
+         "total": 0.035123253054916856,
+         "distill": 0.0008159744320437312,
+         "task": 0.49492592811584474,
+         "count": 10
+       }
+     },
+     "distill_weight": 0.2952
+   },
+   {
+     "epoch": 10,
+     "eval_results": {
+       "sts_spearman": 0.8213443657508973,
+       "sts_pearson": 0.8000685657254155,
+       "retrieval_recall_at_1": 0.574,
+       "retrieval_recall_at_5": 0.876,
+       "retrieval_recall_at_10": 0.942,
+       "nli_accuracy": 0.5,
+       "nli_similarity": 0.7336843609809875,
+       "paraphrase_accuracy": 0.5,
+       "paraphrase_f1": 0.6666666666666666,
+       "paraphrase_similarity": 0.8580488562583923,
+       "composite_score": 0.7901388495421153
+     },
+     "losses": {
+       "sts": {
+         "total": 0.050120412493529526,
+         "distill": 0.0007037564241529807,
+         "task": 0.1768963829330776,
+         "count": 23
+       },
+       "retrieval": {
+         "total": 0.11885773881952813,
+         "distill": 0.0007918142426283436,
+         "task": 0.5605541461325706,
+         "count": 47
+       },
+       "nli": {
+         "total": 0.24667784539943047,
+         "distill": 0.0005330673104370052,
+         "task": 1.7473829781755488,
+         "count": 47
+       },
+       "paraphrase": {
+         "total": 0.03359868098050356,
+         "distill": 0.0007893668138422072,
+         "task": 0.47301009893417356,
+         "count": 10
+       }
+     },
+     "distill_weight": 0.29460000000000003
+   },
+   {
+     "epoch": 11,
+     "eval_results": {
+       "sts_spearman": 0.8200398862029752,
+       "sts_pearson": 0.7984632150530062,
+       "retrieval_recall_at_1": 0.574,
+       "retrieval_recall_at_5": 0.88,
+       "retrieval_recall_at_10": 0.948,
+       "nli_accuracy": 0.5,
+       "nli_similarity": 0.7349022030830383,
+       "paraphrase_accuracy": 0.5,
+       "paraphrase_f1": 0.6666666666666666,
+       "paraphrase_similarity": 0.8587563037872314,
+       "composite_score": 0.7906866097681543
+     },
+     "losses": {
+       "sts": {
+         "total": 0.049540048222179,
+         "distill": 0.0006752971321870775,
+         "task": 0.17472205732179724,
+         "count": 23
+       },
+       "retrieval": {
+         "total": 0.1152741235304386,
+         "distill": 0.0007609698123873231,
+         "task": 0.543203037469945,
+         "count": 47
+       },
+       "nli": {
+         "total": 0.2454570157730833,
+         "distill": 0.000510894448218986,
+         "task": 1.7373002945108618,
+         "count": 47
+       },
+       "paraphrase": {
+         "total": 0.032351000048220155,
+         "distill": 0.0007691650826018304,
+         "task": 0.4550263941287994,
+         "count": 10
+       }
+     },
+     "distill_weight": 0.29400000000000004
+   },
+   {
+     "epoch": 12,
+     "eval_results": {
+       "sts_spearman": 0.8193103945963425,
+       "sts_pearson": 0.7974055277518916,
+       "retrieval_recall_at_1": 0.594,
+       "retrieval_recall_at_5": 0.884,
+       "retrieval_recall_at_10": 0.948,
+       "nli_accuracy": 0.5,
+       "nli_similarity": 0.7340207695960999,
+       "paraphrase_accuracy": 0.5,
+       "paraphrase_f1": 0.6666666666666666,
+       "paraphrase_similarity": 0.8586063981056213,
+       "composite_score": 0.7915218639648379
+     },
+     "losses": {
+       "sts": {
+         "total": 0.048518418131963066,
+         "distill": 0.0006525755262650225,
+         "task": 0.1709841172332349,
+         "count": 23
+       },
+       "retrieval": {
+         "total": 0.11415320951887901,
+         "distill": 0.0007357201898133659,
+         "task": 0.5374910330518763,
+         "count": 47
+       },
+       "nli": {
+         "total": 0.24104384951134947,
+         "distill": 0.0004927666555654179,
+         "task": 1.704636736119047,
+         "count": 47
+       },
+       "paraphrase": {
+         "total": 0.031325037218630315,
+         "distill": 0.0007606724102515727,
+         "task": 0.44016211330890653,
+         "count": 10
+       }
+     },
+     "distill_weight": 0.2934
+   },
+   {
+     "epoch": 13,
+     "eval_results": {
+       "sts_spearman": 0.8188842839819943,
+       "sts_pearson": 0.7964862034231038,
+       "retrieval_recall_at_1": 0.596,
+       "retrieval_recall_at_5": 0.896,
+       "retrieval_recall_at_10": 0.948,
+       "nli_accuracy": 0.5,
+       "nli_similarity": 0.7331050634384155,
+       "paraphrase_accuracy": 0.5,
+       "paraphrase_f1": 0.6666666666666666,
+       "paraphrase_similarity": 0.8598613142967224,
+       "composite_score": 0.7949088086576639
+     },
+     "losses": {
+       "sts": {
+         "total": 0.047971061390379204,
+         "distill": 0.000634684164137782,
+         "task": 0.1689240109661351,
+         "count": 23
+       },
+       "retrieval": {
+         "total": 0.109825795159695,
+         "distill": 0.000719507896103599,
+         "task": 0.5166625177606623,
+         "count": 47
+       },
+       "nli": {
+         "total": 0.23657965850322804,
+         "distill": 0.00048042560786385326,
+         "task": 1.6716556980254802,
+         "count": 47
+       },
+       "paraphrase": {
+         "total": 0.030759319849312305,
+         "distill": 0.0007484613277483731,
+         "task": 0.43184628784656526,
+         "count": 10
+       }
+     },
+     "distill_weight": 0.2928
+   },
+   {
+     "epoch": 14,
+     "eval_results": {
+       "sts_spearman": 0.8170877062297013,
+       "sts_pearson": 0.794295936858975,
+       "retrieval_recall_at_1": 0.596,
+       "retrieval_recall_at_5": 0.896,
+       "retrieval_recall_at_10": 0.948,
+       "nli_accuracy": 0.5,
+       "nli_similarity": 0.7348426580429077,
+       "paraphrase_accuracy": 0.5,
+       "paraphrase_f1": 0.6666666666666666,
+       "paraphrase_similarity": 0.8593454957008362,
+       "composite_score": 0.7940105197815173
+     },
+     "losses": {
+       "sts": {
+         "total": 0.04733733427913293,
+         "distill": 0.0006224965605803806,
+         "task": 0.16655636870342752,
+         "count": 23
+       },
+       "retrieval": {
+         "total": 0.10522193921373245,
+         "distill": 0.0007066282906886586,
+         "task": 0.49456279772393247,
+         "count": 47
+       },
+       "nli": {
+         "total": 0.23459308768840545,
+         "distill": 0.000471937776037908,
+         "task": 1.6562247961125476,
+         "count": 47
+       },
+       "paraphrase": {
+         "total": 0.02917077410966158,
+         "distill": 0.0007398281595669687,
+         "task": 0.40907877683639526,
+         "count": 10
+       }
+     },
+     "distill_weight": 0.2922
+   },
+   {
+     "epoch": 15,
+     "eval_results": {
+       "sts_spearman": 0.8157736839705673,
+       "sts_pearson": 0.7925540172374534,
+       "retrieval_recall_at_1": 0.606,
+       "retrieval_recall_at_5": 0.894,
+       "retrieval_recall_at_10": 0.952,
+       "nli_accuracy": 0.5,
+       "nli_similarity": 0.7331643104553223,
+       "paraphrase_accuracy": 0.5,
+       "paraphrase_f1": 0.6666666666666666,
+       "paraphrase_similarity": 0.8580483794212341,
+       "composite_score": 0.7927535086519504
+     },
+     "losses": {
+       "sts": {
+         "total": 0.04638928920030594,
+         "distill": 0.0006136388072501059,
+         "task": 0.16308000878147458,
+         "count": 23
+       },
+       "retrieval": {
+         "total": 0.10471683328456068,
+         "distill": 0.000696288561696147,
+         "task": 0.49178333104924954,
+         "count": 47
+       },
+       "nli": {
+         "total": 0.23013398304898688,
+         "distill": 0.00046382398540253174,
+         "task": 1.6233676443708704,
+         "count": 47
+       },
+       "paraphrase": {
+         "total": 0.028761257790029048,
+         "distill": 0.0007339846633840352,
+         "task": 0.4029817461967468,
+         "count": 10
+       }
+     },
+     "distill_weight": 0.2916
+   },
+   {
+     "epoch": 16,
+     "eval_results": {
+       "sts_spearman": 0.8157617853858435,
+       "sts_pearson": 0.7923932840541673,
+       "retrieval_recall_at_1": 0.612,
+       "retrieval_recall_at_5": 0.898,
+       "retrieval_recall_at_10": 0.954,
+       "nli_accuracy": 0.5,
+       "nli_similarity": 0.7322084903717041,
+       "paraphrase_accuracy": 0.5,
+       "paraphrase_f1": 0.6666666666666666,
+       "paraphrase_similarity": 0.8596447706222534,
+       "composite_score": 0.7939475593595884
+     },
+     "losses": {
+       "sts": {
+         "total": 0.04621984566683355,
+         "distill": 0.0006054732543618783,
+         "task": 0.1623542036699212,
+         "count": 23
+       },
+       "retrieval": {
+         "total": 0.10093737583845219,
+         "distill": 0.0006863788065024989,
+         "task": 0.473613737111396,
+         "count": 47
+       },
+       "nli": {
+         "total": 0.22668360205406837,
+         "distill": 0.00045654376067141904,
+         "task": 1.597678039936309,
+         "count": 47
+       },
+       "paraphrase": {
+         "total": 0.02780891638249159,
+         "distill": 0.0007290024077519774,
+         "task": 0.3892352104187012,
+         "count": 10
+       }
+     },
+     "distill_weight": 0.29100000000000004
+   },
+   {
+     "epoch": 17,
+     "eval_results": {
+       "sts_spearman": 0.8138228214142917,
+       "sts_pearson": 0.7899716079426576,
+       "retrieval_recall_at_1": 0.61,
+       "retrieval_recall_at_5": 0.9,
+       "retrieval_recall_at_10": 0.952,
+       "nli_accuracy": 0.5,
+       "nli_similarity": 0.7340738773345947,
+       "paraphrase_accuracy": 0.5,
+       "paraphrase_f1": 0.6666666666666666,
+       "paraphrase_similarity": 0.8595521450042725,
+       "composite_score": 0.7935780773738126
+     },
+     "losses": {
+       "sts": {
+         "total": 0.04601614598346793,
+         "distill": 0.0006001048604957759,
+         "task": 0.1615060425322989,
+         "count": 23
+       },
+       "retrieval": {
+         "total": 0.09758770988976702,
+         "distill": 0.0006810528612596557,
+         "task": 0.4574874652192948,
+         "count": 47
+       },
+       "nli": {
+         "total": 0.22612765399699516,
+         "distill": 0.00045324116202190203,
+         "task": 1.5924184956449143,
+         "count": 47
+       },
+       "paraphrase": {
+         "total": 0.02633261661976576,
+         "distill": 0.0007273759925737977,
+         "task": 0.3681142464280128,
+         "count": 10
+       }
+     },
+     "distill_weight": 0.2904
+   },
+   {
+     "epoch": 18,
+     "eval_results": {
+       "sts_spearman": 0.8137941541347218,
+       "sts_pearson": 0.7900114938692636,
+       "retrieval_recall_at_1": 0.618,
+       "retrieval_recall_at_5": 0.902,
+       "retrieval_recall_at_10": 0.954,
+       "nli_accuracy": 0.5,
+       "nli_similarity": 0.7320045232772827,
+       "paraphrase_accuracy": 0.5,
+       "paraphrase_f1": 0.6666666666666666,
+       "paraphrase_similarity": 0.8595419526100159,
+       "composite_score": 0.7941637437340276
+     },
+     "losses": {
+       "sts": {
+         "total": 0.04508868280960166,
+         "distill": 0.0005957435021865303,
+         "task": 0.15811052076194598,
+         "count": 23
+       },
+       "retrieval": {
+         "total": 0.09446902795040861,
+         "distill": 0.0006735043134540319,
+         "task": 0.4424755630340982,
+         "count": 47
+       },
+       "nli": {
+         "total": 0.2214733213186264,
+         "distill": 0.00044834198977580255,
+         "task": 1.5583173239484747,
+         "count": 47
+       },
+       "paraphrase": {
+         "total": 0.026047103106975555,
+         "distill": 0.0007218799262773245,
+         "task": 0.36381163746118544,
+         "count": 10
+       }
+     },
+     "distill_weight": 0.2898
+   },
+   {
+     "epoch": 19,
+     "eval_results": {
+       "sts_spearman": 0.8131324437136556,
+       "sts_pearson": 0.7888827667888834,
+       "retrieval_recall_at_1": 0.622,
+       "retrieval_recall_at_5": 0.906,
+       "retrieval_recall_at_10": 0.954,
+       "nli_accuracy": 0.5,
+       "nli_similarity": 0.7326275110244751,
+       "paraphrase_accuracy": 0.5,
+       "paraphrase_f1": 0.6666666666666666,
+       "paraphrase_similarity": 0.8593710064888,
+       "composite_score": 0.7950328885234945
+     },
+     "losses": {
+       "sts": {
+         "total": 0.044820788761843804,
+         "distill": 0.0005908587026288328,
+         "task": 0.1570410553527915,
+         "count": 23
+       },
+       "retrieval": {
+         "total": 0.0947175715514954,
+         "distill": 0.0006690982015843087,
+         "task": 0.44327550555797335,
+         "count": 47
+       },
+       "nli": {
+         "total": 0.22196958103078476,
+         "distill": 0.00044533125981886654,
+         "task": 1.5605007993414046,
+         "count": 47
+       },
+       "paraphrase": {
+         "total": 0.02539514433592558,
+         "distill": 0.000717665534466505,
+         "task": 0.35435559451580045,
+         "count": 10
+       }
+     },
+     "distill_weight": 0.2892
+   },
+   {
+     "epoch": 20,
+     "eval_results": {
+       "sts_spearman": 0.8125557315718881,
+       "sts_pearson": 0.7883415732948076,
+       "retrieval_recall_at_1": 0.622,
+       "retrieval_recall_at_5": 0.906,
+       "retrieval_recall_at_10": 0.956,
+       "nli_accuracy": 0.5,
+       "nli_similarity": 0.7300553321838379,
+       "paraphrase_accuracy": 0.5,
+       "paraphrase_f1": 0.6666666666666666,
+       "paraphrase_similarity": 0.8589670062065125,
+       "composite_score": 0.7947445324526108
+     },
+     "losses": {
+       "sts": {
+         "total": 0.04461472562473753,
+         "distill": 0.0005882442093697255,
+         "task": 0.15618835778340048,
+         "count": 23
+       },
+       "retrieval": {
+         "total": 0.09131274277225454,
+         "distill": 0.0006642562456111959,
+         "task": 0.4269564062991041,
+         "count": 47
+       },
+       "nli": {
+         "total": 0.21616570017439254,
+         "distill": 0.0004426293698030504,
+         "task": 1.518400065442349,
+         "count": 47
+       },
+       "paraphrase": {
+         "total": 0.024884730763733386,
+         "distill": 0.0007182472327258438,
+         "task": 0.3468856424093246,
+         "count": 10
+       }
+     },
+     "distill_weight": 0.2886
+   },
+   {
+     "epoch": 21,
+     "eval_results": {
+       "sts_spearman": 0.8124046923217468,
+       "sts_pearson": 0.7879715870379221,
+       "retrieval_recall_at_1": 0.626,
+       "retrieval_recall_at_5": 0.908,
+       "retrieval_recall_at_10": 0.954,
+       "nli_accuracy": 0.5,
+       "nli_similarity": 0.7299230694770813,
+       "paraphrase_accuracy": 0.5,
+       "paraphrase_f1": 0.6666666666666666,
+       "paraphrase_similarity": 0.8608139157295227,
+       "composite_score": 0.7952690128275401
+     },
+     "losses": {
+       "sts": {
+         "total": 0.044788526128167694,
+         "distill": 0.0005869252971656945,
+         "task": 0.15666956746059915,
+         "count": 23
+       },
+       "retrieval": {
+         "total": 0.09129251508002585,
+         "distill": 0.0006613465486728447,
+         "task": 0.42650772028781,
+         "count": 47
+       },
+       "nli": {
+         "total": 0.21487902199968378,
+         "distill": 0.0004392804477631332,
+         "task": 1.5080934889773105,
+         "count": 47
+       },
+       "paraphrase": {
+         "total": 0.02408289248123765,
+         "distill": 0.0007130690268240869,
+         "task": 0.3353585585951805,
+         "count": 10
+       }
+     },
+     "distill_weight": 0.28800000000000003
+   },
+   {
+     "epoch": 22,
+     "eval_results": {
+       "sts_spearman": 0.8108396163396415,
+       "sts_pearson": 0.7859428479306386,
+       "retrieval_recall_at_1": 0.626,
+       "retrieval_recall_at_5": 0.91,
+       "retrieval_recall_at_10": 0.956,
+       "nli_accuracy": 0.5,
+       "nli_similarity": 0.7304312586784363,
+       "paraphrase_accuracy": 0.5,
+       "paraphrase_f1": 0.6666666666666666,
+       "paraphrase_similarity": 0.8598558306694031,
+       "composite_score": 0.7950864748364874
+     },
+     "losses": {
+       "sts": {
+         "total": 0.04427984421667845,
+         "distill": 0.0005824335992498243,
+         "task": 0.15475882071515787,
+         "count": 23
+       },
+       "retrieval": {
+         "total": 0.08749476669633642,
+         "distill": 0.0006578407458406179,
+         "task": 0.40839041484163163,
+         "count": 47
+       },
+       "nli": {
+         "total": 0.21218161253218956,
+         "distill": 0.0004366559572171103,
+         "task": 1.487904317835544,
+         "count": 47
+       },
+       "paraphrase": {
+         "total": 0.02473886413499713,
+         "distill": 0.0007142506423406303,
+         "task": 0.34428276121616364,
+         "count": 10
+       }
+     },
+     "distill_weight": 0.2874
+   },
948
+ {
949
+ "epoch": 23,
950
+ "eval_results": {
951
+ "sts_spearman": 0.8101206106743032,
952
+ "sts_pearson": 0.7852421682051318,
953
+ "retrieval_recall_at_1": 0.626,
954
+ "retrieval_recall_at_5": 0.91,
955
+ "retrieval_recall_at_10": 0.956,
956
+ "nli_accuracy": 0.5,
957
+ "nli_similarity": 0.7298892140388489,
958
+ "paraphrase_accuracy": 0.5,
959
+ "paraphrase_f1": 0.6666666666666666,
960
+ "paraphrase_similarity": 0.8599591255187988,
961
+ "composite_score": 0.7947269720038184
962
+ },
963
+ "losses": {
964
+ "sts": {
965
+ "total": 0.043497207696023193,
966
+ "distill": 0.0005803209623200414,
967
+ "task": 0.15188857848229614,
968
+ "count": 23
969
+ },
970
+ "retrieval": {
971
+ "total": 0.08651064019253914,
972
+ "distill": 0.0006545566213119062,
973
+ "task": 0.40345349781056666,
974
+ "count": 47
975
+ },
976
+ "nli": {
977
+ "total": 0.21238021647676508,
978
+ "distill": 0.00043464487014794126,
979
+ "task": 1.4880508184432983,
980
+ "count": 47
981
+ },
982
+ "paraphrase": {
983
+ "total": 0.022220892272889613,
984
+ "distill": 0.0007110251928679645,
985
+ "task": 0.30870682895183565,
986
+ "count": 10
987
+ }
988
+ },
989
+ "distill_weight": 0.2868
990
+ },
991
+ {
992
+ "epoch": 24,
993
+ "eval_results": {
994
+ "sts_spearman": 0.8090374451655163,
995
+ "sts_pearson": 0.783875236477,
996
+ "retrieval_recall_at_1": 0.64,
997
+ "retrieval_recall_at_5": 0.91,
998
+ "retrieval_recall_at_10": 0.956,
999
+ "nli_accuracy": 0.5,
1000
+ "nli_similarity": 0.7307232618331909,
1001
+ "paraphrase_accuracy": 0.5,
1002
+ "paraphrase_f1": 0.6666666666666666,
1003
+ "paraphrase_similarity": 0.8604580163955688,
1004
+ "composite_score": 0.7941853892494248
1005
+ },
1006
+ "losses": {
1007
+ "sts": {
1008
+ "total": 0.04361056018134822,
1009
+ "distill": 0.0005793894398147645,
1010
+ "task": 0.15216006111839545,
1011
+ "count": 23
1012
+ },
1013
+ "retrieval": {
1014
+ "total": 0.08273381509996475,
1015
+ "distill": 0.0006529860733512868,
1016
+ "task": 0.3854811432513785,
1017
+ "count": 47
1018
+ },
1019
+ "nli": {
1020
+ "total": 0.2092740849611607,
1021
+ "distill": 0.00043218842759589724,
1022
+ "task": 1.4650490512239172,
1023
+ "count": 47
1024
+ },
1025
+ "paraphrase": {
1026
+ "total": 0.02230049455538392,
1027
+ "distill": 0.0007111194601748139,
1028
+ "task": 0.30956813097000124,
1029
+ "count": 10
1030
+ }
1031
+ },
1032
+ "distill_weight": 0.2862
1033
+ },
1034
+ {
1035
+ "epoch": 25,
1036
+ "eval_results": {
1037
+ "sts_spearman": 0.8088290083594672,
1038
+ "sts_pearson": 0.7836111268009822,
1039
+ "retrieval_recall_at_1": 0.642,
1040
+ "retrieval_recall_at_5": 0.912,
1041
+ "retrieval_recall_at_10": 0.958,
1042
+ "nli_accuracy": 0.5,
1043
+ "nli_similarity": 0.7291015386581421,
1044
+ "paraphrase_accuracy": 0.5,
1045
+ "paraphrase_f1": 0.6666666666666666,
1046
+ "paraphrase_similarity": 0.8604010939598083,
1047
+ "composite_score": 0.7946811708464003
1048
+ },
1049
+ "losses": {
1050
+ "sts": {
1051
+ "total": 0.042825429821791855,
1052
+ "distill": 0.0005763423598735877,
1053
+ "task": 0.14928901389889096,
1054
+ "count": 23
1055
+ },
1056
+ "retrieval": {
1057
+ "total": 0.08428762876924048,
1058
+ "distill": 0.0006495943922113548,
1059
+ "task": 0.3924136973441915,
1060
+ "count": 47
1061
+ },
1062
+ "nli": {
1063
+ "total": 0.20725905958642352,
1064
+ "distill": 0.0004320865307508552,
1065
+ "task": 1.4497177093587024,
1066
+ "count": 47
1067
+ },
1068
+ "paraphrase": {
1069
+ "total": 0.021814299654215573,
1070
+ "distill": 0.0007105717668309808,
1071
+ "task": 0.30251066088676454,
1072
+ "count": 10
1073
+ }
1074
+ },
1075
+ "distill_weight": 0.2856
1076
+ },
1077
+ {
1078
+ "epoch": 26,
1079
+ "eval_results": {
1080
+ "sts_spearman": 0.8086444673068466,
1081
+ "sts_pearson": 0.7835424818834511,
1082
+ "retrieval_recall_at_1": 0.65,
1083
+ "retrieval_recall_at_5": 0.914,
1084
+ "retrieval_recall_at_10": 0.96,
1085
+ "nli_accuracy": 0.5,
1086
+ "nli_similarity": 0.7279387712478638,
1087
+ "paraphrase_accuracy": 0.5,
1088
+ "paraphrase_f1": 0.6666666666666666,
1089
+ "paraphrase_similarity": 0.8617146015167236,
1090
+ "composite_score": 0.79518890032009
1091
+ },
1092
+ "losses": {
1093
+ "sts": {
1094
+ "total": 0.04305816587546597,
1095
+ "distill": 0.0005753715294818192,
1096
+ "task": 0.14997966134029886,
1097
+ "count": 23
1098
+ },
1099
+ "retrieval": {
1100
+ "total": 0.08245786469667515,
1101
+ "distill": 0.0006471946550671883,
1102
+ "task": 0.3835590494439957,
1103
+ "count": 47
1104
+ },
1105
+ "nli": {
1106
+ "total": 0.20547382248208879,
1107
+ "distill": 0.00042902108806958225,
1108
+ "task": 1.4360247748963377,
1109
+ "count": 47
1110
+ },
1111
+ "paraphrase": {
1112
+ "total": 0.02213638899847865,
1113
+ "distill": 0.0007105463359039277,
1114
+ "task": 0.3067675843834877,
1115
+ "count": 10
1116
+ }
1117
+ },
1118
+ "distill_weight": 0.28500000000000003
1119
+ },
1120
+ {
1121
+ "epoch": 27,
1122
+ "eval_results": {
1123
+ "sts_spearman": 0.8075945769025906,
1124
+ "sts_pearson": 0.7817250896683322,
1125
+ "retrieval_recall_at_1": 0.648,
1126
+ "retrieval_recall_at_5": 0.916,
1127
+ "retrieval_recall_at_10": 0.964,
1128
+ "nli_accuracy": 0.5,
1129
+ "nli_similarity": 0.7290590405464172,
1130
+ "paraphrase_accuracy": 0.5,
1131
+ "paraphrase_f1": 0.6666666666666666,
1132
+ "paraphrase_similarity": 0.8608205318450928,
1133
+ "composite_score": 0.795263955117962
1134
+ },
1135
+ "losses": {
1136
+ "sts": {
1137
+ "total": 0.041979301558888474,
1138
+ "distill": 0.0005742409064070038,
1139
+ "task": 0.1460871498869813,
1140
+ "count": 23
1141
+ },
1142
+ "retrieval": {
1143
+ "total": 0.07854667179127957,
1144
+ "distill": 0.0006465072064918089,
1145
+ "task": 0.3650214504054252,
1146
+ "count": 47
1147
+ },
1148
+ "nli": {
1149
+ "total": 0.20393373991580718,
1150
+ "distill": 0.00042719431060485227,
1151
+ "task": 1.4240653768498848,
1152
+ "count": 47
1153
+ },
1154
+ "paraphrase": {
1155
+ "total": 0.02039573285728693,
1156
+ "distill": 0.0007079597911797464,
1157
+ "task": 0.28220218420028687,
1158
+ "count": 10
1159
+ }
1160
+ },
1161
+ "distill_weight": 0.2844
1162
+ },
1163
+ {
1164
+ "epoch": 28,
1165
+ "eval_results": {
1166
+ "sts_spearman": 0.8072111574266839,
1167
+ "sts_pearson": 0.7814816033631548,
1168
+ "retrieval_recall_at_1": 0.656,
1169
+ "retrieval_recall_at_5": 0.916,
1170
+ "retrieval_recall_at_10": 0.962,
1171
+ "nli_accuracy": 0.5,
1172
+ "nli_similarity": 0.7277703881263733,
1173
+ "paraphrase_accuracy": 0.5,
1174
+ "paraphrase_f1": 0.6666666666666666,
1175
+ "paraphrase_similarity": 0.8611229062080383,
1176
+ "composite_score": 0.7950722453800086
1177
+ },
1178
+ "losses": {
1179
+ "sts": {
1180
+ "total": 0.04250466094716736,
1181
+ "distill": 0.0005732332761196986,
1182
+ "task": 0.14780080836752188,
1183
+ "count": 23
1184
+ },
1185
+ "retrieval": {
1186
+ "total": 0.07828945341579457,
1187
+ "distill": 0.0006427260170234962,
1188
+ "task": 0.363525298681665,
1189
+ "count": 47
1190
+ },
1191
+ "nli": {
1192
+ "total": 0.20262091527593898,
1193
+ "distill": 0.0004252295209184051,
1194
+ "task": 1.4137128566173798,
1195
+ "count": 47
1196
+ },
1197
+ "paraphrase": {
1198
+ "total": 0.021627166587859393,
1199
+ "distill": 0.000706591084599495,
1200
+ "task": 0.29917111396789553,
1201
+ "count": 10
1202
+ }
1203
+ },
1204
+ "distill_weight": 0.2838
1205
+ },
1206
+ {
1207
+ "epoch": 29,
1208
+ "eval_results": {
1209
+ "sts_spearman": 0.8061150812196531,
1210
+ "sts_pearson": 0.7799263057126831,
1211
+ "retrieval_recall_at_1": 0.654,
1212
+ "retrieval_recall_at_5": 0.918,
1213
+ "retrieval_recall_at_10": 0.964,
1214
+ "nli_accuracy": 0.5,
1215
+ "nli_similarity": 0.7274843454360962,
1216
+ "paraphrase_accuracy": 0.5,
1217
+ "paraphrase_f1": 0.6666666666666666,
1218
+ "paraphrase_similarity": 0.8605352640151978,
1219
+ "composite_score": 0.7951242072764932
1220
+ },
1221
+ "losses": {
1222
+ "sts": {
1223
+ "total": 0.04178553026007569,
1224
+ "distill": 0.0005705209104749172,
1225
+ "task": 0.14517284346663434,
1226
+ "count": 23
1227
+ },
1228
+ "retrieval": {
1229
+ "total": 0.07806510272178244,
1230
+ "distill": 0.0006415697936701806,
1231
+ "task": 0.3621810408348733,
1232
+ "count": 47
1233
+ },
1234
+ "nli": {
1235
+ "total": 0.20029219636257659,
1236
+ "distill": 0.00042426467754263825,
1237
+ "task": 1.3962893257749842,
1238
+ "count": 47
1239
+ },
1240
+ "paraphrase": {
1241
+ "total": 0.020195775851607322,
1242
+ "distill": 0.000706263561733067,
1243
+ "task": 0.2789587274193764,
1244
+ "count": 10
1245
+ }
1246
+ },
1247
+ "distill_weight": 0.2832
1248
+ },
1249
+ {
1250
+ "epoch": 30,
1251
+ "eval_results": {
1252
+ "sts_spearman": 0.805952102459493,
1253
+ "sts_pearson": 0.7796381825168456,
1254
+ "retrieval_recall_at_1": 0.658,
1255
+ "retrieval_recall_at_5": 0.92,
1256
+ "retrieval_recall_at_10": 0.966,
1257
+ "nli_accuracy": 0.5,
1258
+ "nli_similarity": 0.7271153926849365,
1259
+ "paraphrase_accuracy": 0.5,
1260
+ "paraphrase_f1": 0.6666666666666666,
1261
+ "paraphrase_similarity": 0.8611077666282654,
1262
+ "composite_score": 0.7956427178964132
1263
+ },
1264
+ "losses": {
1265
+ "sts": {
1266
+ "total": 0.041891162486180016,
1267
+ "distill": 0.0005706739384154587,
1268
+ "task": 0.14542057948267978,
1269
+ "count": 23
1270
+ },
1271
+ "retrieval": {
1272
+ "total": 0.07668351699063118,
1273
+ "distill": 0.0006396458459463208,
1274
+ "task": 0.35546302129613594,
1275
+ "count": 47
1276
+ },
1277
+ "nli": {
1278
+ "total": 0.19826114177703857,
1279
+ "distill": 0.00042267177598253686,
1280
+ "task": 1.3809708052493157,
1281
+ "count": 47
1282
+ },
1283
+ "paraphrase": {
1284
+ "total": 0.019489498622715474,
1285
+ "distill": 0.0007044955214951188,
1286
+ "task": 0.268893338739872,
1287
+ "count": 10
1288
+ }
1289
+ },
1290
+ "distill_weight": 0.2826
1291
+ },
1292
+ {
1293
+ "epoch": 31,
1294
+ "eval_results": {
1295
+ "sts_spearman": 0.8054395024616142,
1296
+ "sts_pearson": 0.7790870358568797,
1297
+ "retrieval_recall_at_1": 0.662,
1298
+ "retrieval_recall_at_5": 0.92,
1299
+ "retrieval_recall_at_10": 0.968,
1300
+ "nli_accuracy": 0.5,
1301
+ "nli_similarity": 0.7270572781562805,
1302
+ "paraphrase_accuracy": 0.5,
1303
+ "paraphrase_f1": 0.6666666666666666,
1304
+ "paraphrase_similarity": 0.8614012002944946,
1305
+ "composite_score": 0.7953864178974739
1306
+ },
1307
+ "losses": {
1308
+ "sts": {
1309
+ "total": 0.04180093708893527,
1310
+ "distill": 0.000569236811513648,
1311
+ "task": 0.14498750833065613,
1312
+ "count": 23
1313
+ },
1314
+ "retrieval": {
1315
+ "total": 0.07472586061092133,
1316
+ "distill": 0.0006379124688658308,
1317
+ "task": 0.346081572644254,
1318
+ "count": 47
1319
+ },
1320
+ "nli": {
1321
+ "total": 0.19744586469011105,
1322
+ "distill": 0.0004216810487745766,
1323
+ "task": 1.3741430931902947,
1324
+ "count": 47
1325
+ },
1326
+ "paraphrase": {
1327
+ "total": 0.02020758679136634,
1328
+ "distill": 0.0007039231946691871,
1329
+ "task": 0.27867799401283266,
1330
+ "count": 10
1331
+ }
1332
+ },
1333
+ "distill_weight": 0.28200000000000003
1334
+ },
1335
+ {
1336
+ "epoch": 32,
1337
+ "eval_results": {
1338
+ "sts_spearman": 0.8039734694214193,
1339
+ "sts_pearson": 0.7774542134736094,
1340
+ "retrieval_recall_at_1": 0.658,
1341
+ "retrieval_recall_at_5": 0.922,
1342
+ "retrieval_recall_at_10": 0.972,
1343
+ "nli_accuracy": 0.5,
1344
+ "nli_similarity": 0.7258086204528809,
1345
+ "paraphrase_accuracy": 0.5,
1346
+ "paraphrase_f1": 0.6666666666666666,
1347
+ "paraphrase_similarity": 0.8613040447235107,
1348
+ "composite_score": 0.7952534013773764
1349
+ },
1350
+ "losses": {
1351
+ "sts": {
1352
+ "total": 0.04136973813824032,
1353
+ "distill": 0.000568263144131102,
1354
+ "task": 0.14336845570284387,
1355
+ "count": 23
1356
+ },
1357
+ "retrieval": {
1358
+ "total": 0.07349338850125353,
1359
+ "distill": 0.000635797770921775,
1360
+ "task": 0.3400801315586618,
1361
+ "count": 47
1362
+ },
1363
+ "nli": {
1364
+ "total": 0.195538673629152,
1365
+ "distill": 0.00042041142513242333,
1366
+ "task": 1.3597298099639568,
1367
+ "count": 47
1368
+ },
1369
+ "paraphrase": {
1370
+ "total": 0.01802230104804039,
1371
+ "distill": 0.0007017374911811203,
1372
+ "task": 0.24804942756891252,
1373
+ "count": 10
1374
+ }
1375
+ },
1376
+ "distill_weight": 0.2814
1377
+ },
1378
+ {
1379
+ "epoch": 33,
1380
+ "eval_results": {
1381
+ "sts_spearman": 0.8034526879232323,
1382
+ "sts_pearson": 0.7767872504658838,
1383
+ "retrieval_recall_at_1": 0.664,
1384
+ "retrieval_recall_at_5": 0.92,
1385
+ "retrieval_recall_at_10": 0.972,
1386
+ "nli_accuracy": 0.5,
1387
+ "nli_similarity": 0.7267512083053589,
1388
+ "paraphrase_accuracy": 0.5,
1389
+ "paraphrase_f1": 0.6666666666666666,
1390
+ "paraphrase_similarity": 0.8609178066253662,
1391
+ "composite_score": 0.7943930106282828
1392
+ },
1393
+ "losses": {
1394
+ "sts": {
1395
+ "total": 0.04124606368334397,
1396
+ "distill": 0.0005672681145370007,
1397
+ "task": 0.14282110203867374,
1398
+ "count": 23
1399
+ },
1400
+ "retrieval": {
1401
+ "total": 0.07269588295132556,
1402
+ "distill": 0.0006344794367558937,
1403
+ "task": 0.3361036536541391,
1404
+ "count": 47
1405
+ },
1406
+ "nli": {
1407
+ "total": 0.19442298532800473,
1408
+ "distill": 0.0004187757124569505,
1409
+ "task": 1.3508439393753702,
1410
+ "count": 47
1411
+ },
1412
+ "paraphrase": {
1413
+ "total": 0.01792764011770487,
1414
+ "distill": 0.0007010513567365706,
1415
+ "task": 0.246534825861454,
1416
+ "count": 10
1417
+ }
1418
+ },
1419
+ "distill_weight": 0.2808
1420
+ },
1421
+ {
1422
+ "epoch": 34,
1423
+ "eval_results": {
1424
+ "sts_spearman": 0.8029147605057396,
1425
+ "sts_pearson": 0.7758699525760386,
1426
+ "retrieval_recall_at_1": 0.656,
1427
+ "retrieval_recall_at_5": 0.924,
1428
+ "retrieval_recall_at_10": 0.976,
1429
+ "nli_accuracy": 0.5,
1430
+ "nli_similarity": 0.7263163924217224,
1431
+ "paraphrase_accuracy": 0.5,
1432
+ "paraphrase_f1": 0.6666666666666666,
1433
+ "paraphrase_similarity": 0.861183762550354,
1434
+ "composite_score": 0.7953240469195365
1435
+ },
1436
+ "losses": {
1437
+ "sts": {
1438
+ "total": 0.04087469354271889,
1439
+ "distill": 0.0005659387705078268,
1440
+ "task": 0.14141468995291254,
1441
+ "count": 23
1442
+ },
1443
+ "retrieval": {
1444
+ "total": 0.07193009151106185,
1445
+ "distill": 0.0006315128511174562,
1446
+ "task": 0.3322827714554807,
1447
+ "count": 47
1448
+ },
1449
+ "nli": {
1450
+ "total": 0.19458080923303644,
1451
+ "distill": 0.0004175955335550169,
1452
+ "task": 1.3508182814780703,
1453
+ "count": 47
1454
+ },
1455
+ "paraphrase": {
1456
+ "total": 0.018750363681465387,
1457
+ "distill": 0.0007004878658335656,
1458
+ "task": 0.2577672630548477,
1459
+ "count": 10
1460
+ }
1461
+ },
1462
+ "distill_weight": 0.2802
1463
+ },
1464
+ {
1465
+ "epoch": 35,
1466
+ "eval_results": {
1467
+ "sts_spearman": 0.8029120487600959,
1468
+ "sts_pearson": 0.7759091259908452,
1469
+ "retrieval_recall_at_1": 0.652,
1470
+ "retrieval_recall_at_5": 0.924,
1471
+ "retrieval_recall_at_10": 0.976,
1472
+ "nli_accuracy": 0.5,
1473
+ "nli_similarity": 0.7263759970664978,
1474
+ "paraphrase_accuracy": 0.5,
1475
+ "paraphrase_f1": 0.6666666666666666,
1476
+ "paraphrase_similarity": 0.8618313670158386,
1477
+ "composite_score": 0.7953226910467146
1478
+ },
1479
+ "losses": {
1480
+ "sts": {
1481
+ "total": 0.040802220611468605,
1482
+ "distill": 0.0005649596442589941,
1483
+ "task": 0.1410475363549979,
1484
+ "count": 23
1485
+ },
1486
+ "retrieval": {
1487
+ "total": 0.07252203149998442,
1488
+ "distill": 0.0006298948205670619,
1489
+ "task": 0.33474880393515244,
1490
+ "count": 47
1491
+ },
1492
+ "nli": {
1493
+ "total": 0.19197505680804558,
1494
+ "distill": 0.0004152855129932311,
1495
+ "task": 1.3316139946592616,
1496
+ "count": 47
1497
+ },
1498
+ "paraphrase": {
1499
+ "total": 0.01693342700600624,
1500
+ "distill": 0.0006995417177677154,
1501
+ "task": 0.23234085589647294,
1502
+ "count": 10
1503
+ }
1504
+ },
1505
+ "distill_weight": 0.2796
1506
+ },
1507
+ {
1508
+ "epoch": 36,
1509
+ "eval_results": {
1510
+ "sts_spearman": 0.802407902504706,
1511
+ "sts_pearson": 0.7756569556734015,
1512
+ "retrieval_recall_at_1": 0.658,
1513
+ "retrieval_recall_at_5": 0.924,
1514
+ "retrieval_recall_at_10": 0.974,
1515
+ "nli_accuracy": 0.5,
1516
+ "nli_similarity": 0.7243511080741882,
1517
+ "paraphrase_accuracy": 0.5,
1518
+ "paraphrase_f1": 0.6666666666666666,
1519
+ "paraphrase_similarity": 0.8609747290611267,
1520
+ "composite_score": 0.7950706179190197
1521
+ },
1522
+ "losses": {
1523
+ "sts": {
1524
+ "total": 0.040637732200000595,
1525
+ "distill": 0.0005651830908154016,
1526
+ "task": 0.14036077325758728,
1527
+ "count": 23
1528
+ },
1529
+ "retrieval": {
1530
+ "total": 0.07090983135586089,
1531
+ "distill": 0.0006288531424596588,
1532
+ "task": 0.32701979894587335,
1533
+ "count": 47
1534
+ },
1535
+ "nli": {
1536
+ "total": 0.1879324754501911,
1537
+ "distill": 0.0004152300212770066,
1538
+ "task": 1.3024731620829155,
1539
+ "count": 47
1540
+ },
1541
+ "paraphrase": {
1542
+ "total": 0.01749492483213544,
1543
+ "distill": 0.0006997698335908353,
1544
+ "task": 0.23994021117687225,
1545
+ "count": 10
1546
+ }
1547
+ },
1548
+ "distill_weight": 0.279
1549
+ },
1550
+ {
1551
+ "epoch": 37,
1552
+ "eval_results": {
1553
+ "sts_spearman": 0.8014589498925283,
1554
+ "sts_pearson": 0.7745302218047226,
1555
+ "retrieval_recall_at_1": 0.662,
1556
+ "retrieval_recall_at_5": 0.926,
1557
+ "retrieval_recall_at_10": 0.974,
1558
+ "nli_accuracy": 0.5,
1559
+ "nli_similarity": 0.7247843742370605,
1560
+ "paraphrase_accuracy": 0.5,
1561
+ "paraphrase_f1": 0.6666666666666666,
1562
+ "paraphrase_similarity": 0.8621292114257812,
1563
+ "composite_score": 0.7951961416129308
1564
+ },
1565
+ "losses": {
1566
+ "sts": {
1567
+ "total": 0.0404453170688256,
1568
+ "distill": 0.0005647512937329062,
1569
+ "task": 0.13957903113054193,
1570
+ "count": 23
1571
+ },
1572
+ "retrieval": {
1573
+ "total": 0.07041878134329269,
1574
+ "distill": 0.000626888732980699,
1575
+ "task": 0.3244838112212242,
1576
+ "count": 47
1577
+ },
1578
+ "nli": {
1579
+ "total": 0.18824834677767247,
1580
+ "distill": 0.00041336335856071175,
1581
+ "task": 1.3035842015388164,
1582
+ "count": 47
1583
+ },
1584
+ "paraphrase": {
1585
+ "total": 0.017420530039817094,
1586
+ "distill": 0.0006980633072089404,
1587
+ "task": 0.23872213810682297,
1588
+ "count": 10
1589
+ }
1590
+ },
1591
+ "distill_weight": 0.2784
1592
+ },
1593
+ {
1594
+ "epoch": 38,
1595
+ "eval_results": {
1596
+ "sts_spearman": 0.8015481972850798,
1597
+ "sts_pearson": 0.7748134816243155,
1598
+ "retrieval_recall_at_1": 0.662,
1599
+ "retrieval_recall_at_5": 0.926,
1600
+ "retrieval_recall_at_10": 0.974,
1601
+ "nli_accuracy": 0.5,
1602
+ "nli_similarity": 0.7238280177116394,
1603
+ "paraphrase_accuracy": 0.5,
1604
+ "paraphrase_f1": 0.6666666666666666,
1605
+ "paraphrase_similarity": 0.8617278337478638,
1606
+ "composite_score": 0.7952407653092066
1607
+ },
1608
+ "losses": {
1609
+ "sts": {
1610
+ "total": 0.040147524246055145,
1611
+ "distill": 0.000563749264034888,
1612
+ "task": 0.13843435340601465,
1613
+ "count": 23
1614
+ },
1615
+ "retrieval": {
1616
+ "total": 0.06983992354349887,
1617
+ "distill": 0.0006249398723779682,
1618
+ "task": 0.3215467447930194,
1619
+ "count": 47
1620
+ },
1621
+ "nli": {
1622
+ "total": 0.18727405749736947,
1623
+ "distill": 0.0004120995005731411,
1624
+ "task": 1.2957600558057745,
1625
+ "count": 47
1626
+ },
1627
+ "paraphrase": {
1628
+ "total": 0.016873036976903677,
1629
+ "distill": 0.0006976060452871024,
1630
+ "task": 0.23095046430826188,
1631
+ "count": 10
1632
+ }
1633
+ },
1634
+ "distill_weight": 0.2778
1635
+ },
1636
+ {
1637
+ "epoch": 39,
1638
+ "eval_results": {
1639
+ "sts_spearman": 0.8010951080044263,
1640
+ "sts_pearson": 0.7744044551086896,
1641
+ "retrieval_recall_at_1": 0.66,
1642
+ "retrieval_recall_at_5": 0.926,
1643
+ "retrieval_recall_at_10": 0.976,
1644
+ "nli_accuracy": 0.5,
1645
+ "nli_similarity": 0.7240303158760071,
1646
+ "paraphrase_accuracy": 0.5,
1647
+ "paraphrase_f1": 0.6666666666666666,
1648
+ "paraphrase_similarity": 0.8622973561286926,
1649
+ "composite_score": 0.7950142206688798
1650
+ },
1651
+ "losses": {
1652
+ "sts": {
1653
+ "total": 0.040290426786827004,
1654
+ "distill": 0.0005619644627744412,
1655
+ "task": 0.1388165892466255,
1656
+ "count": 23
1657
+ },
1658
+ "retrieval": {
1659
+ "total": 0.06733930118857546,
1660
+ "distill": 0.0006235166393378948,
1661
+ "task": 0.3097512569833309,
1662
+ "count": 47
1663
+ },
1664
+ "nli": {
1665
+ "total": 0.18707014081325937,
1666
+ "distill": 0.00041061582445028297,
1667
+ "task": 1.29327839993416,
1668
+ "count": 47
1669
+ },
1670
+ "paraphrase": {
1671
+ "total": 0.017511206772178413,
1672
+ "distill": 0.0006984172330703586,
1673
+ "task": 0.23959056437015533,
1674
+ "count": 10
1675
+ }
1676
+ },
1677
+ "distill_weight": 0.2772
1678
+ },
1679
+ {
1680
+ "epoch": 40,
1681
+ "eval_results": {
1682
+ "sts_spearman": 0.8008200971384819,
1683
+ "sts_pearson": 0.774130829271817,
1684
+ "retrieval_recall_at_1": 0.666,
1685
+ "retrieval_recall_at_5": 0.932,
1686
+ "retrieval_recall_at_10": 0.974,
1687
+ "nli_accuracy": 0.5,
1688
+ "nli_similarity": 0.7223304510116577,
1689
+ "paraphrase_accuracy": 0.5,
1690
+ "paraphrase_f1": 0.6666666666666666,
1691
+ "paraphrase_similarity": 0.862885594367981,
1692
+ "composite_score": 0.7966767152359077
1693
+ },
1694
+ "losses": {
1695
+ "sts": {
1696
+ "total": 0.039851526122378265,
1697
+ "distill": 0.0005616477422375718,
1698
+ "task": 0.1371861229772153,
1699
+ "count": 23
1700
+ },
1701
+ "retrieval": {
1702
+ "total": 0.0681393450879036,
1703
+ "distill": 0.0006218473711843661,
1704
+ "task": 0.3131846848954546,
1705
+ "count": 47
1706
+ },
1707
+ "nli": {
1708
+ "total": 0.18474397950984062,
1709
+ "distill": 0.00041012915824738114,
1710
+ "task": 1.2761304099509057,
1711
+ "count": 47
1712
+ },
1713
+ "paraphrase": {
1714
+ "total": 0.01582443034276366,
1715
+ "distill": 0.0006965516076888889,
1716
+ "task": 0.2160874292254448,
1717
+ "count": 10
1718
+ }
1719
+ },
1720
+ "distill_weight": 0.2766
1721
+ },
1722
+ {
1723
+ "epoch": 41,
1724
+ "eval_results": {
1725
+ "sts_spearman": 0.8004397143136981,
1726
+ "sts_pearson": 0.7737034113801735,
1727
+ "retrieval_recall_at_1": 0.662,
1728
+ "retrieval_recall_at_5": 0.93,
1729
+ "retrieval_recall_at_10": 0.976,
1730
+ "nli_accuracy": 0.5,
1731
+ "nli_similarity": 0.7216349840164185,
1732
+ "paraphrase_accuracy": 0.5,
1733
+ "paraphrase_f1": 0.6666666666666666,
1734
+ "paraphrase_similarity": 0.8629198670387268,
1735
+ "composite_score": 0.7958865238235158
1736
+ },
1737
+ "losses": {
1738
+ "sts": {
1739
+ "total": 0.040168939077335854,
1740
+ "distill": 0.0005606161886016312,
1741
+ "task": 0.1381706041486367,
1742
+ "count": 23
1743
+ },
1744
+ "retrieval": {
1745
+ "total": 0.06701801083189377,
1746
+ "distill": 0.0006201347891003527,
1747
+ "task": 0.3077663652440335,
1748
+ "count": 47
1749
+ },
1750
+ "nli": {
1751
+ "total": 0.18171446183894543,
1752
+ "distill": 0.00040893261246581345,
1753
+ "task": 1.254154608604756,
1754
+ "count": 47
1755
+ },
1756
+ "paraphrase": {
1757
+ "total": 0.016577236633747817,
1758
+ "distill": 0.0006964895350392908,
1759
+ "task": 0.2263122245669365,
1760
+ "count": 10
1761
+ }
1762
+ },
1763
+ "distill_weight": 0.276
1764
+ },
1765
+ {
1766
+ "epoch": 42,
1767
+ "eval_results": {
1768
+ "sts_spearman": 0.799541139671189,
1769
+ "sts_pearson": 0.7726619557363898,
1770
+ "retrieval_recall_at_1": 0.664,
1771
+ "retrieval_recall_at_5": 0.932,
1772
+ "retrieval_recall_at_10": 0.976,
1773
+ "nli_accuracy": 0.5,
1774
+ "nli_similarity": 0.7218517065048218,
1775
+ "paraphrase_accuracy": 0.5,
1776
+ "paraphrase_f1": 0.6666666666666666,
1777
+ "paraphrase_similarity": 0.862092912197113,
1778
+ "composite_score": 0.7960372365022612
1779
+ },
1780
+ "losses": {
1781
+ "sts": {
1782
+ "total": 0.039756916949282524,
1783
+ "distill": 0.0005605145656179799,
1784
+ "task": 0.13663589921982391,
1785
+ "count": 23
1786
+ },
1787
+ "retrieval": {
1788
+ "total": 0.06787618566700752,
1789
+ "distill": 0.0006185190620871776,
1790
+ "task": 0.3114630852607971,
1791
+ "count": 47
1792
+ },
1793
+ "nli": {
1794
+ "total": 0.1818689741986863,
1795
+ "distill": 0.0004076707292041008,
1796
+ "task": 1.254186414657755,
1797
+ "count": 47
1798
+ },
1799
+ "paraphrase": {
1800
+ "total": 0.015548030100762843,
1801
+ "distill": 0.0006947330140974373,
1802
+ "task": 0.21193347945809365,
1803
+ "count": 10
1804
+ }
1805
+ },
1806
+ "distill_weight": 0.2754
1807
+ },
1808
+ {
1809
+ "epoch": 43,
1810
+ "eval_results": {
1811
+ "sts_spearman": 0.798959430373989,
1812
+ "sts_pearson": 0.7717844926047425,
1813
+ "retrieval_recall_at_1": 0.658,
1814
+ "retrieval_recall_at_5": 0.934,
1815
+ "retrieval_recall_at_10": 0.976,
1816
+ "nli_accuracy": 0.5,
1817
+ "nli_similarity": 0.7230818867683411,
1818
+ "paraphrase_accuracy": 0.5,
1819
+ "paraphrase_f1": 0.6666666666666666,
1820
+ "paraphrase_similarity": 0.8633172512054443,
1821
+ "composite_score": 0.7963463818536611
1822
+ },
1823
+ "losses": {
1824
+ "sts": {
1825
+ "total": 0.03980643334596053,
1826
+ "distill": 0.0005611540931884361,
1827
+ "task": 0.13669410898633624,
1828
+ "count": 23
1829
+ },
1830
+ "retrieval": {
1831
+ "total": 0.0670052819905129,
1832
+ "distill": 0.0006171439581134535,
1833
+ "task": 0.3072057971928982,
1834
+ "count": 47
1835
+ },
1836
+ "nli": {
1837
+ "total": 0.181170218168421,
1838
+ "distill": 0.0004060288384665755,
1839
+ "task": 1.2483358712906534,
1840
+ "count": 47
1841
+ },
1842
+ "paraphrase": {
1843
+ "total": 0.016034624353051186,
1844
+ "distill": 0.0006926223228219897,
1845
+ "task": 0.21848167926073075,
1846
+ "count": 10
1847
+ }
1848
+ },
1849
+ "distill_weight": 0.2748
1850
+ },
1851
+ {
1852
+ "epoch": 44,
1853
+ "eval_results": {
1854
+ "sts_spearman": 0.7992766546129851,
1855
+ "sts_pearson": 0.7723631373159269,
1856
+ "retrieval_recall_at_1": 0.668,
1857
+ "retrieval_recall_at_5": 0.936,
1858
+ "retrieval_recall_at_10": 0.976,
1859
+ "nli_accuracy": 0.5,
1860
+ "nli_similarity": 0.7214555740356445,
1861
+ "paraphrase_accuracy": 0.5,
1862
+ "paraphrase_f1": 0.6666666666666666,
1863
+ "paraphrase_similarity": 0.8633242845535278,
1864
+ "composite_score": 0.7971049939731593
1865
+ },
1866
+ "losses": {
1867
+ "sts": {
1868
+ "total": 0.03956834745147954,
1869
+ "distill": 0.0005598463517937647,
1870
+ "task": 0.13576342230257782,
1871
+ "count": 23
1872
+ },
1873
+ "retrieval": {
1874
+ "total": 0.0675140564587522,
1875
+ "distill": 0.0006155266111934597,
1876
+ "task": 0.30929218010699494,
1877
+ "count": 47
1878
+ },
1879
+ "nli": {
1880
+ "total": 0.18059916984527669,
1881
+ "distill": 0.00040597010755594426,
1882
+ "task": 1.2433717808824905,
1883
+ "count": 47
1884
+ },
1885
+ "paraphrase": {
1886
+ "total": 0.016109541151672603,
1887
+ "distill": 0.000693465251242742,
1888
+ "task": 0.21933580189943314,
1889
+ "count": 10
1890
+ }
1891
+ },
1892
+ "distill_weight": 0.2742
1893
+ },
1894
+ {
1895
+ "epoch": 45,
1896
+ "eval_results": {
1897
+ "sts_spearman": 0.7987902718942615,
1898
+ "sts_pearson": 0.7716241234848017,
1899
+ "retrieval_recall_at_1": 0.67,
1900
+ "retrieval_recall_at_5": 0.934,
1901
+ "retrieval_recall_at_10": 0.976,
1902
+ "nli_accuracy": 0.5,
1903
+ "nli_similarity": 0.7223896384239197,
1904
+ "paraphrase_accuracy": 0.5,
1905
+ "paraphrase_f1": 0.6666666666666666,
1906
+ "paraphrase_similarity": 0.8632831573486328,
1907
+ "composite_score": 0.7962618026137975
1908
+ },
1909
+ "losses": {
1910
+ "sts": {
1911
+ "total": 0.03940606570762137,
1912
+ "distill": 0.0005587598197567074,
1913
+ "task": 0.13509494876084122,
1914
+ "count": 23
1915
+ },
1916
+ "retrieval": {
1917
+ "total": 0.06665098445212587,
1918
+ "distill": 0.0006136801812124062,
1919
+ "task": 0.3050802118600683,
1920
+ "count": 47
1921
+ },
1922
+ "nli": {
1923
+ "total": 0.18052284007376812,
1924
+ "distill": 0.00040472789662592906,
1925
+ "task": 1.2418234031251136,
1926
+ "count": 47
1927
+ },
1928
+ "paraphrase": {
1929
+ "total": 0.015592311229556798,
1930
+ "distill": 0.0006917931721545755,
1931
+ "task": 0.21204619854688644,
1932
+ "count": 10
1933
+ }
1934
+ },
1935
+ "distill_weight": 0.2736
1936
+ },
1937
+ {
1938
+ "epoch": 46,
1939
+ "eval_results": {
1940
+ "sts_spearman": 0.7980845899126008,
1941
+ "sts_pearson": 0.7708174356684444,
1942
+ "retrieval_recall_at_1": 0.662,
1943
+ "retrieval_recall_at_5": 0.934,
1944
+ "retrieval_recall_at_10": 0.974,
1945
+ "nli_accuracy": 0.5,
1946
+ "nli_similarity": 0.7227649688720703,
1947
+ "paraphrase_accuracy": 0.5,
1948
+ "paraphrase_f1": 0.6666666666666666,
1949
+ "paraphrase_similarity": 0.8638308644294739,
1950
+ "composite_score": 0.7959089616229671
1951
+ },
1952
+ "losses": {
1953
+ "sts": {
1954
+ "total": 0.03914242918076723,
1955
+ "distill": 0.0005580810273228133,
1956
+ "task": 0.13407865684965384,
1957
+ "count": 23
1958
+ },
1959
+ "retrieval": {
1960
+ "total": 0.06371399538314089,
1961
+ "distill": 0.0006122274517497801,
1962
+ "task": 0.2913656944924213,
1963
+ "count": 47
1964
+ },
1965
+ "nli": {
1966
+ "total": 0.1788206826499168,
1967
+ "distill": 0.00040350851176821804,
1968
+ "task": 1.2290957481303113,
1969
+ "count": 47
1970
+ },
1971
+ "paraphrase": {
1972
+ "total": 0.016117287054657935,
1973
+ "distill": 0.0006897407933138311,
1974
+ "task": 0.2191057413816452,
1975
+ "count": 10
1976
+ }
1977
+ },
1978
+ "distill_weight": 0.273
1979
+ },
1980
+ {
1981
+ "epoch": 47,
1982
+ "eval_results": {
1983
+ "sts_spearman": 0.7986050030831314,
1984
+ "sts_pearson": 0.7713344721815415,
1985
+ "retrieval_recall_at_1": 0.668,
1986
+ "retrieval_recall_at_5": 0.936,
1987
+ "retrieval_recall_at_10": 0.974,
1988
+ "nli_accuracy": 0.5,
1989
+ "nli_similarity": 0.7211058139801025,
1990
+ "paraphrase_accuracy": 0.5,
1991
+ "paraphrase_f1": 0.6666666666666666,
1992
+ "paraphrase_similarity": 0.8641474843025208,
1993
+ "composite_score": 0.7967691682082324
1994
+ },
1995
+ "losses": {
1996
+ "sts": {
1997
+ "total": 0.03924534291676853,
1998
+ "distill": 0.0005579624380713896,
1999
+ "task": 0.1343229581480441,
2000
+ "count": 23
2001
+ },
2002
+ "retrieval": {
2003
+ "total": 0.06390803307294846,
2004
+ "distill": 0.0006106501616081817,
2005
+ "task": 0.29201801152939494,
2006
+ "count": 47
2007
+ },
2008
+ "nli": {
2009
+ "total": 0.17669346231095334,
2010
+ "distill": 0.0004022118878213966,
2011
+ "task": 1.2134682447352307,
2012
+ "count": 47
2013
+ },
2014
+ "paraphrase": {
2015
+ "total": 0.014580465480685234,
2016
+ "distill": 0.0006913242454174906,
2017
+ "task": 0.1978030323982239,
2018
+ "count": 10
2019
+ }
2020
+ },
2021
+ "distill_weight": 0.2724
2022
+ },
2023
+ {
+ "epoch": 48,
+ "eval_results": {
+ "sts_spearman": 0.7978328086000539,
+ "sts_pearson": 0.7705122749494036,
+ "retrieval_recall_at_1": 0.67,
+ "retrieval_recall_at_5": 0.936,
+ "retrieval_recall_at_10": 0.976,
+ "nli_accuracy": 0.5,
+ "nli_similarity": 0.7213853001594543,
+ "paraphrase_accuracy": 0.5,
+ "paraphrase_f1": 0.6666666666666666,
+ "paraphrase_similarity": 0.8641984462738037,
+ "composite_score": 0.7963830709666937
+ },
+ "losses": {
+ "sts": {
+ "total": 0.038881031796336174,
+ "distill": 0.0005576102867843988,
+ "task": 0.13296303638945456,
+ "count": 23
+ },
+ "retrieval": {
+ "total": 0.06267972861198669,
+ "distill": 0.0006095131591675764,
+ "task": 0.2861579453691523,
+ "count": 47
+ },
+ "nli": {
+ "total": 0.17544851499669095,
+ "distill": 0.0004012782276518881,
+ "task": 1.2039236971672544,
+ "count": 47
+ },
+ "paraphrase": {
+ "total": 0.015775612369179726,
+ "distill": 0.0006886643706820906,
+ "task": 0.2140680193901062,
+ "count": 10
+ }
+ },
+ "distill_weight": 0.2718
+ },
+ {
+ "epoch": 49,
+ "eval_results": {
+ "sts_spearman": 0.7973172210556293,
+ "sts_pearson": 0.7701713894839444,
+ "retrieval_recall_at_1": 0.67,
+ "retrieval_recall_at_5": 0.936,
+ "retrieval_recall_at_10": 0.976,
+ "nli_accuracy": 0.5,
+ "nli_similarity": 0.7203776240348816,
+ "paraphrase_accuracy": 0.5,
+ "paraphrase_f1": 0.6666666666666666,
+ "paraphrase_similarity": 0.8631545305252075,
+ "composite_score": 0.7961252771944813
+ },
+ "losses": {
+ "sts": {
+ "total": 0.03882810230488363,
+ "distill": 0.0005557498307493718,
+ "task": 0.13267488712849823,
+ "count": 23
+ },
+ "retrieval": {
+ "total": 0.0626571145146451,
+ "distill": 0.000607952565955434,
+ "task": 0.28582252847387435,
+ "count": 47
+ },
+ "nli": {
+ "total": 0.1763826126747943,
+ "distill": 0.0004007951009701541,
+ "task": 1.2093435627348879,
+ "count": 47
+ },
+ "paraphrase": {
+ "total": 0.015177728608250618,
+ "distill": 0.0006882219109684229,
+ "task": 0.20569542646408082,
+ "count": 10
+ }
+ },
+ "distill_weight": 0.2712
+ },
+ {
+ "epoch": 50,
+ "eval_results": {
+ "sts_spearman": 0.7968594218073127,
+ "sts_pearson": 0.7695483662129042,
+ "retrieval_recall_at_1": 0.67,
+ "retrieval_recall_at_5": 0.936,
+ "retrieval_recall_at_10": 0.974,
+ "nli_accuracy": 0.5,
+ "nli_similarity": 0.7204389572143555,
+ "paraphrase_accuracy": 0.5,
+ "paraphrase_f1": 0.6666666666666666,
+ "paraphrase_similarity": 0.8636235594749451,
+ "composite_score": 0.795896377570323
+ },
+ "losses": {
+ "sts": {
+ "total": 0.03918402480042499,
+ "distill": 0.0005560298273137406,
+ "task": 0.13378654711920282,
+ "count": 23
+ },
+ "retrieval": {
+ "total": 0.06285070390143294,
+ "distill": 0.000605922420211929,
+ "task": 0.2864762779245985,
+ "count": 47
+ },
+ "nli": {
+ "total": 0.1742387483728693,
+ "distill": 0.00039892817941553733,
+ "task": 1.1936577951654475,
+ "count": 47
+ },
+ "paraphrase": {
+ "total": 0.014562566205859185,
+ "distill": 0.0006869392120279372,
+ "task": 0.197102826833725,
+ "count": 10
+ }
+ },
+ "distill_weight": 0.2706
+ },
+ {
+ "epoch": 51,
+ "eval_results": {
+ "sts_spearman": 0.7961018874767011,
+ "sts_pearson": 0.7685896198859151,
+ "retrieval_recall_at_1": 0.672,
+ "retrieval_recall_at_5": 0.938,
+ "retrieval_recall_at_10": 0.976,
+ "nli_accuracy": 0.5,
+ "nli_similarity": 0.7212554216384888,
+ "paraphrase_accuracy": 0.5,
+ "paraphrase_f1": 0.6666666666666666,
+ "paraphrase_similarity": 0.8639085292816162,
+ "composite_score": 0.7961176104050172
+ },
+ "losses": {
+ "sts": {
+ "total": 0.038832955305343086,
+ "distill": 0.000555693042849231,
+ "task": 0.13247574898211853,
+ "count": 23
+ },
+ "retrieval": {
+ "total": 0.061906919121108156,
+ "distill": 0.0006047165407700107,
+ "task": 0.2819344585246228,
+ "count": 47
+ },
+ "nli": {
+ "total": 0.17475084611710082,
+ "distill": 0.0003985753827827408,
+ "task": 1.1961865387064345,
+ "count": 47
+ },
+ "paraphrase": {
+ "total": 0.014561035577207804,
+ "distill": 0.0006853513652458787,
+ "task": 0.19693138003349303,
+ "count": 10
+ }
+ },
+ "distill_weight": 0.27
+ },
+ {
+ "epoch": 52,
+ "eval_results": {
+ "sts_spearman": 0.7957712004157503,
+ "sts_pearson": 0.7682444812474918,
+ "retrieval_recall_at_1": 0.668,
+ "retrieval_recall_at_5": 0.938,
+ "retrieval_recall_at_10": 0.976,
+ "nli_accuracy": 0.5,
+ "nli_similarity": 0.7211222052574158,
+ "paraphrase_accuracy": 0.5,
+ "paraphrase_f1": 0.6666666666666666,
+ "paraphrase_similarity": 0.8639959096908569,
+ "composite_score": 0.7959522668745418
+ },
+ "losses": {
+ "sts": {
+ "total": 0.038104739849982056,
+ "distill": 0.0005552242105097874,
+ "task": 0.12987668682699618,
+ "count": 23
+ },
+ "retrieval": {
+ "total": 0.06224994496145147,
+ "distill": 0.0006037014598482625,
+ "task": 0.2832708587037756,
+ "count": 47
+ },
+ "nli": {
+ "total": 0.17524569560872746,
+ "distill": 0.00039716928756419333,
+ "task": 1.198594996269713,
+ "count": 47
+ },
+ "paraphrase": {
+ "total": 0.01391328014433384,
+ "distill": 0.0006847139564342796,
+ "task": 0.18791155964136125,
+ "count": 10
+ }
+ },
+ "distill_weight": 0.2694
+ },
+ {
+ "epoch": 53,
+ "eval_results": {
+ "sts_spearman": 0.7948627602870467,
+ "sts_pearson": 0.7672766924578068,
+ "retrieval_recall_at_1": 0.674,
+ "retrieval_recall_at_5": 0.936,
+ "retrieval_recall_at_10": 0.974,
+ "nli_accuracy": 0.5,
+ "nli_similarity": 0.7208473682403564,
+ "paraphrase_accuracy": 0.5,
+ "paraphrase_f1": 0.6666666666666666,
+ "paraphrase_similarity": 0.8635191321372986,
+ "composite_score": 0.79489804681019
+ },
+ "losses": {
+ "sts": {
+ "total": 0.03807506864161595,
+ "distill": 0.0005540588349306389,
+ "task": 0.12967087879129077,
+ "count": 23
+ },
+ "retrieval": {
+ "total": 0.06309370720323096,
+ "distill": 0.0006028104070673122,
+ "task": 0.2868876425509757,
+ "count": 47
+ },
+ "nli": {
+ "total": 0.17431603213574023,
+ "distill": 0.00039660765918249146,
+ "task": 1.1912570329422647,
+ "count": 47
+ },
+ "paraphrase": {
+ "total": 0.013855330273509025,
+ "distill": 0.0006855725019704551,
+ "task": 0.18696729764342307,
+ "count": 10
+ }
+ },
+ "distill_weight": 0.2688
+ },
+ {
+ "epoch": 54,
+ "eval_results": {
+ "sts_spearman": 0.7948670645604796,
+ "sts_pearson": 0.7673114563945611,
+ "retrieval_recall_at_1": 0.672,
+ "retrieval_recall_at_5": 0.936,
+ "retrieval_recall_at_10": 0.976,
+ "nli_accuracy": 0.5,
+ "nli_similarity": 0.7204586267471313,
+ "paraphrase_accuracy": 0.5,
+ "paraphrase_f1": 0.6666666666666666,
+ "paraphrase_similarity": 0.8637556433677673,
+ "composite_score": 0.7949001989469064
+ },
+ "losses": {
+ "sts": {
+ "total": 0.03859076422193776,
+ "distill": 0.000553890953913493,
+ "task": 0.1313275915125142,
+ "count": 23
+ },
+ "retrieval": {
+ "total": 0.06243176108345072,
+ "distill": 0.000601062821125255,
+ "task": 0.2836410492024523,
+ "count": 47
+ },
+ "nli": {
+ "total": 0.17191652573169547,
+ "distill": 0.0003955346752612039,
+ "task": 1.173889383356622,
+ "count": 47
+ },
+ "paraphrase": {
+ "total": 0.014081115927547216,
+ "distill": 0.0006832460989244282,
+ "task": 0.1899135023355484,
+ "count": 10
+ }
+ },
+ "distill_weight": 0.2682
+ },
+ {
+ "epoch": 55,
+ "eval_results": {
+ "sts_spearman": 0.7940184198467072,
+ "sts_pearson": 0.7664062690593959,
+ "retrieval_recall_at_1": 0.67,
+ "retrieval_recall_at_5": 0.936,
+ "retrieval_recall_at_10": 0.974,
+ "nli_accuracy": 0.5,
+ "nli_similarity": 0.7194271087646484,
+ "paraphrase_accuracy": 0.5,
+ "paraphrase_f1": 0.6666666666666666,
+ "paraphrase_similarity": 0.8639047145843506,
+ "composite_score": 0.7944758765900203
+ },
+ "losses": {
+ "sts": {
+ "total": 0.03801367480469787,
+ "distill": 0.0005527152021860946,
+ "task": 0.12925235538379007,
+ "count": 23
+ },
+ "retrieval": {
+ "total": 0.06092823788206628,
+ "distill": 0.0005999853217756336,
+ "task": 0.276568721900595,
+ "count": 47
+ },
+ "nli": {
+ "total": 0.1702281376148792,
+ "distill": 0.0003946685254514376,
+ "task": 1.161404493007254,
+ "count": 47
+ },
+ "paraphrase": {
+ "total": 0.012735517043620349,
+ "distill": 0.0006815990433096886,
+ "task": 0.17139707654714584,
+ "count": 10
+ }
+ },
+ "distill_weight": 0.2676
+ },
+ {
+ "epoch": 56,
+ "eval_results": {
+ "sts_spearman": 0.7940615711220378,
+ "sts_pearson": 0.7665661594709636,
+ "retrieval_recall_at_1": 0.672,
+ "retrieval_recall_at_5": 0.94,
+ "retrieval_recall_at_10": 0.976,
+ "nli_accuracy": 0.5,
+ "nli_similarity": 0.7197932004928589,
+ "paraphrase_accuracy": 0.5,
+ "paraphrase_f1": 0.6666666666666666,
+ "paraphrase_similarity": 0.8638368248939514,
+ "composite_score": 0.7956974522276856
+ },
+ "losses": {
+ "sts": {
+ "total": 0.03830795994271403,
+ "distill": 0.0005517558893188834,
+ "task": 0.13015226015578146,
+ "count": 23
+ },
+ "retrieval": {
+ "total": 0.060107594633356055,
+ "distill": 0.0005995195837037519,
+ "task": 0.27261265953804587,
+ "count": 47
+ },
+ "nli": {
+ "total": 0.1709233709472291,
+ "distill": 0.0003937529740567775,
+ "task": 1.165199505521896,
+ "count": 47
+ },
+ "paraphrase": {
+ "total": 0.014191682077944278,
+ "distill": 0.0006824985903222114,
+ "task": 0.19112490341067315,
+ "count": 10
+ }
+ },
+ "distill_weight": 0.267
+ },
+ {
+ "epoch": 57,
+ "eval_results": {
+ "sts_spearman": 0.7937036029034699,
+ "sts_pearson": 0.7662776909565029,
+ "retrieval_recall_at_1": 0.672,
+ "retrieval_recall_at_5": 0.942,
+ "retrieval_recall_at_10": 0.978,
+ "nli_accuracy": 0.5,
+ "nli_similarity": 0.7196366190910339,
+ "paraphrase_accuracy": 0.5,
+ "paraphrase_f1": 0.6666666666666666,
+ "paraphrase_similarity": 0.864041268825531,
+ "composite_score": 0.7961184681184016
+ },
+ "losses": {
+ "sts": {
+ "total": 0.038414100913897804,
+ "distill": 0.0005509398837128411,
+ "task": 0.1304093819597493,
+ "count": 23
+ },
+ "retrieval": {
+ "total": 0.057978192938769116,
+ "distill": 0.0005975660850650611,
+ "task": 0.26271810398456896,
+ "count": 47
+ },
+ "nli": {
+ "total": 0.17013170490873622,
+ "distill": 0.00039284772232194687,
+ "task": 1.1588538897798417,
+ "count": 47
+ },
+ "paraphrase": {
+ "total": 0.014643667824566364,
+ "distill": 0.0006814012362156063,
+ "task": 0.19713934063911437,
+ "count": 10
+ }
+ },
+ "distill_weight": 0.26639999999999997
+ },
+ {
+ "epoch": 58,
+ "eval_results": {
+ "sts_spearman": 0.7928266663552261,
+ "sts_pearson": 0.765244563485406,
+ "retrieval_recall_at_1": 0.678,
+ "retrieval_recall_at_5": 0.94,
+ "retrieval_recall_at_10": 0.976,
+ "nli_accuracy": 0.5,
+ "nli_similarity": 0.7203112840652466,
+ "paraphrase_accuracy": 0.5,
+ "paraphrase_f1": 0.6666666666666666,
+ "paraphrase_similarity": 0.8644272685050964,
+ "composite_score": 0.7950799998442797
+ },
+ "losses": {
+ "sts": {
+ "total": 0.03789730612998423,
+ "distill": 0.0005501627937242713,
+ "task": 0.1285449188688527,
+ "count": 23
+ },
+ "retrieval": {
+ "total": 0.06108237723720834,
+ "distill": 0.0005970121710561216,
+ "task": 0.27659898869534755,
+ "count": 47
+ },
+ "nli": {
+ "total": 0.1709375616083754,
+ "distill": 0.0003915692932230044,
+ "task": 1.1633987756485635,
+ "count": 47
+ },
+ "paraphrase": {
+ "total": 0.013374552130699158,
+ "distill": 0.0006797625974286348,
+ "task": 0.17970404177904128,
+ "count": 10
+ }
+ },
+ "distill_weight": 0.26580000000000004
+ }
+ ]
training_state.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:da40e0dea3f19b04228b37c4be914930e4240d58d39997397aaf31e334bb837e
+ size 2583