Lakoc committed
Commit e10995c · 1 Parent(s): 0392540

Upload tokenizer

Files changed (3)
  1. special_tokens_map.json +7 -0
  2. tokenizer.json +2171 -0
  3. tokenizer_config.json +10 -0
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "bos_token": "<s>",
+   "eos_token": "</s>",
+   "mask_token": "<mask>",
+   "pad_token": "<pad>",
+   "unk_token": "<unk>"
+ }
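
The map above names each special token by role, while the numeric IDs live in the `added_tokens` entries of the `tokenizer.json` diff in this same commit. A minimal sketch of how the two could be cross-checked is shown here; the helper `role_to_id` and the constant names are hypothetical and not part of the commit, and the ID table is copied by hand from the `added_tokens` entries rather than read from disk.

```python
import json

# Contents of special_tokens_map.json, copied from this commit's diff.
SPECIAL_TOKENS_MAP = json.loads("""
{
  "bos_token": "<s>",
  "eos_token": "</s>",
  "mask_token": "<mask>",
  "pad_token": "<pad>",
  "unk_token": "<unk>"
}
""")

# IDs copied by hand from the "added_tokens" entries of tokenizer.json
# in this commit (an assumption of this sketch: they are not loaded from the file).
ADDED_TOKEN_IDS = {"<s>": 0, "</s>": 1, "<unk>": 2, "<pad>": 3, "<mask>": 4}


def role_to_id(special_tokens_map, added_token_ids):
    """Hypothetical helper: map each role (e.g. "bos_token") to its token ID.

    Raises KeyError if a role names a token that has no added_tokens entry.
    """
    return {
        role: added_token_ids[token]
        for role, token in special_tokens_map.items()
    }


ROLE_IDS = role_to_id(SPECIAL_TOKENS_MAP, ADDED_TOKEN_IDS)
```

Under these assumptions, `ROLE_IDS["unk_token"]` should agree with the `unk_id` field of the Unigram model in `tokenizer.json`, which is a quick consistency check worth running after regenerating either file.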
tokenizer.json ADDED
@@ -0,0 +1,2171 @@
+ {
+   "version": "1.0",
+   "truncation": null,
+   "padding": null,
+   "added_tokens": [
+     {
+       "id": 0,
+       "content": "<s>",
+       "single_word": false,
+       "lstrip": false,
+       "rstrip": false,
+       "normalized": false,
+       "special": true
+     },
+     {
+       "id": 1,
+       "content": "</s>",
+       "single_word": false,
+       "lstrip": false,
+       "rstrip": false,
+       "normalized": false,
+       "special": true
+     },
+     {
+       "id": 2,
+       "content": "<unk>",
+       "single_word": false,
+       "lstrip": false,
+       "rstrip": false,
+       "normalized": false,
+       "special": true
+     },
+     {
+       "id": 3,
+       "content": "<pad>",
+       "single_word": false,
+       "lstrip": false,
+       "rstrip": false,
+       "normalized": false,
+       "special": true
+     },
+     {
+       "id": 4,
+       "content": "<mask>",
+       "single_word": false,
+       "lstrip": false,
+       "rstrip": false,
+       "normalized": false,
+       "special": true
+     }
+   ],
+   "normalizer": {
+     "type": "Sequence",
+     "normalizers": [
+       {
+         "type": "Replace",
+         "pattern": {
+           "String": "``"
+         },
+         "content": "\""
+       },
+       {
+         "type": "Replace",
+         "pattern": {
+           "String": "''"
+         },
+         "content": "\""
+       },
+       {
+         "type": "Lowercase"
+       }
+     ]
+   },
+   "pre_tokenizer": {
+     "type": "Metaspace",
+     "replacement": "▁",
+     "add_prefix_space": true
+   },
+   "post_processor": {
+     "type": "TemplateProcessing",
+     "single": [
+       {
+         "SpecialToken": {
+           "id": "<s>",
+           "type_id": 0
+         }
+       },
+       {
+         "Sequence": {
+           "id": "A",
+           "type_id": 0
+         }
+       },
+       {
+         "SpecialToken": {
+           "id": "</s>",
+           "type_id": 0
+         }
+       }
+     ],
+     "pair": [
+       {
+         "SpecialToken": {
+           "id": "<s>",
+           "type_id": 0
+         }
+       },
+       {
+         "Sequence": {
+           "id": "A",
+           "type_id": 0
+         }
+       },
+       {
+         "SpecialToken": {
+           "id": "</s>",
+           "type_id": 0
+         }
+       },
+       {
+         "SpecialToken": {
+           "id": "<s>",
+           "type_id": 1
+         }
+       },
+       {
+         "Sequence": {
+           "id": "B",
+           "type_id": 1
+         }
+       },
+       {
+         "SpecialToken": {
+           "id": "</s>",
+           "type_id": 1
+         }
+       }
+     ],
+     "special_tokens": {
+       "</s>": {
+         "id": "</s>",
+         "ids": [
+           1
+         ],
+         "tokens": [
+           "</s>"
+         ]
+       },
+       "<s>": {
+         "id": "<s>",
+         "ids": [
+           0
+         ],
+         "tokens": [
+           "<s>"
+         ]
+       }
+     }
+   },
+   "decoder": {
+     "type": "Metaspace",
+     "replacement": "▁",
+     "add_prefix_space": true
+   },
+   "model": {
+     "type": "Unigram",
+     "unk_id": 2,
+     "vocab": [
169
+ [
170
+ "<s>",
171
+ 0.0
172
+ ],
173
+ [
174
+ "</s>",
175
+ 0.0
176
+ ],
177
+ [
178
+ "<unk>",
179
+ 0.0
180
+ ],
181
+ [
182
+ "<pad>",
183
+ 0.0
184
+ ],
185
+ [
186
+ "<mask>",
187
+ 0.0
188
+ ],
189
+ [
190
+ "▁",
191
+ -2.365659253373355
192
+ ],
193
+ [
194
+ "e",
195
+ -2.772627425707178
196
+ ],
197
+ [
198
+ "s",
199
+ -2.809259472670938
200
+ ],
201
+ [
202
+ "t",
203
+ -2.937906677759429
204
+ ],
205
+ [
206
+ "a",
207
+ -3.0234880395661925
208
+ ],
209
+ [
210
+ "i",
211
+ -3.130101696517442
212
+ ],
213
+ [
214
+ "r",
215
+ -3.316708085002901
216
+ ],
217
+ [
218
+ "o",
219
+ -3.489081473090966
220
+ ],
221
+ [
222
+ "n",
223
+ -3.616623346672405
224
+ ],
225
+ [
226
+ "d",
227
+ -3.6620767182818454
228
+ ],
229
+ [
230
+ "▁the",
231
+ -3.836275190789124
232
+ ],
233
+ [
234
+ "\n",
235
+ -3.8534604095366625
236
+ ],
237
+ [
238
+ "l",
239
+ -3.8829237206556506
240
+ ],
241
+ [
242
+ "c",
243
+ -4.143267028826482
244
+ ],
245
+ [
246
+ "m",
247
+ -4.228727291757149
248
+ ],
249
+ [
250
+ "u",
251
+ -4.335524745161635
252
+ ],
253
+ [
254
+ "p",
255
+ -4.35957510122415
256
+ ],
257
+ [
258
+ "▁to",
259
+ -4.386513688410057
260
+ ],
261
+ [
262
+ "ing",
263
+ -4.483100337757907
264
+ ],
265
+ [
266
+ "▁and",
267
+ -4.520970283112552
268
+ ],
269
+ [
270
+ "f",
271
+ -4.525275602775817
272
+ ],
273
+ [
274
+ "▁of",
275
+ -4.576052711917756
276
+ ],
277
+ [
278
+ "y",
279
+ -4.58382387353196
280
+ ],
281
+ [
282
+ "▁in",
283
+ -4.740397653766333
284
+ ],
285
+ [
286
+ "g",
287
+ -4.791754116540238
288
+ ],
289
+ [
290
+ "h",
291
+ -4.836474590087416
292
+ ],
293
+ [
294
+ "b",
295
+ -4.909411232248008
296
+ ],
297
+ [
298
+ "▁that",
299
+ -5.1999121155936905
300
+ ],
301
+ [
302
+ "k",
303
+ -5.263461955826326
304
+ ],
305
+ [
306
+ "w",
307
+ -5.2812369844028755
308
+ ],
309
+ [
310
+ "▁is",
311
+ -5.299636845493691
312
+ ],
313
+ [
314
+ "▁be",
315
+ -5.41970291054162
316
+ ],
317
+ [
318
+ "in",
319
+ -5.435910346470935
320
+ ],
321
+ [
322
+ "▁for",
323
+ -5.512356650709734
324
+ ],
325
+ [
326
+ "v",
327
+ -5.604824905017548
328
+ ],
329
+ [
330
+ "▁we",
331
+ -5.610244633591748
332
+ ],
333
+ [
334
+ "ly",
335
+ -5.62606210143764
336
+ ],
337
+ [
338
+ "▁you",
339
+ -5.743884888035879
340
+ ],
341
+ [
342
+ "▁on",
343
+ -5.812146705563082
344
+ ],
345
+ [
346
+ "▁he",
347
+ -5.884272913006608
348
+ ],
349
+ [
350
+ "▁are",
351
+ -5.911223094826363
352
+ ],
353
+ [
354
+ "▁as",
355
+ -5.929833784333782
356
+ ],
357
+ [
358
+ "▁was",
359
+ -5.944656786916925
360
+ ],
361
+ [
362
+ "▁with",
363
+ -5.946628118980286
364
+ ],
365
+ [
366
+ "ve",
367
+ -6.029713712838147
368
+ ],
369
+ [
370
+ "▁have",
371
+ -6.056062667440749
372
+ ],
373
+ [
374
+ "▁but",
375
+ -6.078495912607522
376
+ ],
377
+ [
378
+ "▁do",
379
+ -6.120033501363496
380
+ ],
381
+ [
382
+ "▁this",
383
+ -6.135130124131255
384
+ ],
385
+ [
386
+ "ur",
387
+ -6.159085253292046
388
+ ],
389
+ [
390
+ "▁co",
391
+ -6.163495625813544
392
+ ],
393
+ [
394
+ "▁not",
395
+ -6.168869881961035
396
+ ],
397
+ [
398
+ "ation",
399
+ -6.174891736114821
400
+ ],
401
+ [
402
+ "▁they",
403
+ -6.230896897497017
404
+ ],
405
+ [
406
+ "us",
407
+ -6.2638805011825
408
+ ],
409
+ [
410
+ "▁one",
411
+ -6.280948860315334
412
+ ],
413
+ [
414
+ "'s",
415
+ -6.287063313479727
416
+ ],
417
+ [
418
+ "▁or",
419
+ -6.2957272848395895
420
+ ],
421
+ [
422
+ "▁ma",
423
+ -6.332840220336108
424
+ ],
425
+ [
426
+ "▁me",
427
+ -6.34268211129003
428
+ ],
429
+ [
430
+ "▁can",
431
+ -6.347928769198868
432
+ ],
433
+ [
434
+ "▁an",
435
+ -6.386687084496506
436
+ ],
437
+ [
438
+ "▁con",
439
+ -6.411591977758153
440
+ ],
441
+ [
442
+ "ll",
443
+ -6.421978033658183
444
+ ],
445
+ [
446
+ "▁pa",
447
+ -6.485767473048513
448
+ ],
449
+ [
450
+ "ck",
451
+ -6.498920060222407
452
+ ],
453
+ [
454
+ "▁from",
455
+ -6.510430181366697
456
+ ],
457
+ [
458
+ "▁said",
459
+ -6.549799059746503
460
+ ],
461
+ [
462
+ "▁all",
463
+ -6.5597607202216235
464
+ ],
465
+ [
466
+ "▁ch",
467
+ -6.569055886948085
468
+ ],
469
+ [
470
+ "▁there",
471
+ -6.58301772625385
472
+ ],
473
+ [
474
+ "▁what",
475
+ -6.586112233691303
476
+ ],
477
+ [
478
+ "▁ca",
479
+ -6.617227466784218
480
+ ],
481
+ [
482
+ "ide",
483
+ -6.630277697578061
484
+ ],
485
+ [
486
+ "'",
487
+ -6.636383868446469
488
+ ],
489
+ [
490
+ "▁lo",
491
+ -6.65839965549829
492
+ ],
493
+ [
494
+ "▁ex",
495
+ -6.680873400433894
496
+ ],
497
+ [
498
+ "▁go",
499
+ -6.684875077627916
500
+ ],
501
+ [
502
+ "▁ba",
503
+ -6.691013699795867
504
+ ],
505
+ [
506
+ "age",
507
+ -6.6920799937143265
508
+ ],
509
+ [
510
+ "▁by",
511
+ -6.697920339970192
512
+ ],
513
+ [
514
+ "▁about",
515
+ -6.705057222090158
516
+ ],
517
+ [
518
+ "▁more",
519
+ -6.712459504370528
520
+ ],
521
+ [
522
+ "▁no",
523
+ -6.724089235660834
524
+ ],
525
+ [
526
+ "▁who",
527
+ -6.728620204290847
528
+ ],
529
+ [
530
+ "▁my",
531
+ -6.738463579994525
532
+ ],
533
+ [
534
+ "▁out",
535
+ -6.743812086790616
536
+ ],
537
+ [
538
+ "men",
539
+ -6.749756175437202
540
+ ],
541
+ [
542
+ "▁will",
543
+ -6.750296473528454
544
+ ],
545
+ [
546
+ "▁people",
547
+ -6.75277964784218
548
+ ],
549
+ [
550
+ "▁their",
551
+ -6.762497277425378
552
+ ],
553
+ [
554
+ "▁pro",
555
+ -6.7784614068128235
556
+ ],
557
+ [
558
+ "rea",
559
+ -6.784288329157707
560
+ ],
561
+ [
562
+ "j",
563
+ -6.788201173652709
564
+ ],
565
+ [
566
+ "one",
567
+ -6.7947783162153215
568
+ ],
569
+ [
570
+ "ive",
571
+ -6.813780868209763
572
+ ],
573
+ [
574
+ "▁up",
575
+ -6.8206412381107615
576
+ ],
577
+ [
578
+ "▁th",
579
+ -6.828317656094256
580
+ ],
581
+ [
582
+ "per",
583
+ -6.830780397681892
584
+ ],
585
+ [
586
+ "▁when",
587
+ -6.845789034417846
588
+ ],
589
+ [
590
+ "▁like",
591
+ -6.846099446872442
592
+ ],
593
+ [
594
+ "▁has",
595
+ -6.858481313045974
596
+ ],
597
+ [
598
+ "▁two",
599
+ -6.875403572167368
600
+ ],
601
+ [
602
+ "▁her",
603
+ -6.877864882622259
604
+ ],
605
+ [
606
+ "ure",
607
+ -6.8908854099696395
608
+ ],
609
+ [
610
+ "x",
611
+ -6.8981401634485895
612
+ ],
613
+ [
614
+ "▁some",
615
+ -6.90180386306147
616
+ ],
617
+ [
618
+ "▁his",
619
+ -6.903905558474843
620
+ ],
621
+ [
622
+ "▁time",
623
+ -6.907975494420095
624
+ ],
625
+ [
626
+ "les",
627
+ -6.94060170562814
628
+ ],
629
+ [
630
+ "▁she",
631
+ -6.950713176332087
632
+ ],
633
+ [
634
+ "▁sh",
635
+ -6.952521338654103
636
+ ],
637
+ [
638
+ "red",
639
+ -6.958620516243256
640
+ ],
641
+ [
642
+ "▁see",
643
+ -6.982276000845523
644
+ ],
645
+ [
646
+ "▁would",
647
+ -6.983175910632591
648
+ ],
649
+ [
650
+ "▁get",
651
+ -6.987675091372429
652
+ ],
653
+ [
654
+ "▁ha",
655
+ -6.98870163335252
656
+ ],
657
+ [
658
+ "▁our",
659
+ -6.994954275263096
660
+ ],
661
+ [
662
+ "▁pre",
663
+ -7.001261302546554
664
+ ],
665
+ [
666
+ "▁had",
667
+ -7.008694902540039
668
+ ],
669
+ [
670
+ "▁were",
671
+ -7.014572559915727
672
+ ],
673
+ [
674
+ "▁just",
675
+ -7.015066325981131
676
+ ],
677
+ [
678
+ "▁thousand",
679
+ -7.027882847809215
680
+ ],
681
+ [
682
+ "end",
683
+ -7.032569076938101
684
+ ],
685
+ [
686
+ "▁cl",
687
+ -7.033247703700162
688
+ ],
689
+ [
690
+ "z",
691
+ -7.036404857337105
692
+ ],
693
+ [
694
+ "able",
695
+ -7.045084509819366
696
+ ],
697
+ [
698
+ "ight",
699
+ -7.049731749195809
700
+ ],
701
+ [
702
+ "▁it's",
703
+ -7.051805370713204
704
+ ],
705
+ [
706
+ "▁how",
707
+ -7.05872639193322
708
+ ],
709
+ [
710
+ "▁hundred",
711
+ -7.0596741793533795
712
+ ],
713
+ [
714
+ "▁comp",
715
+ -7.083542356209996
716
+ ],
717
+ [
718
+ "▁dis",
719
+ -7.0939142129968165
720
+ ],
721
+ [
722
+ "▁your",
723
+ -7.117440901089495
724
+ ],
725
+ [
726
+ "▁than",
727
+ -7.1288683714327075
728
+ ],
729
+ [
730
+ "▁which",
731
+ -7.129356471133217
732
+ ],
733
+ [
734
+ "▁work",
735
+ -7.129359951395436
736
+ ],
737
+ [
738
+ "▁other",
739
+ -7.13597630002619
740
+ ],
741
+ [
742
+ "▁say",
743
+ -7.1776110833738205
744
+ ],
745
+ [
746
+ "▁vi",
747
+ -7.181149894078882
748
+ ],
749
+ [
750
+ "ver",
751
+ -7.191963860463087
752
+ ],
753
+ [
754
+ "▁cr",
755
+ -7.193242581900794
756
+ ],
757
+ [
758
+ "▁know",
759
+ -7.196656171543754
760
+ ],
761
+ [
762
+ "▁new",
763
+ -7.200727503764988
764
+ ],
765
+ [
766
+ "ther",
767
+ -7.204924714404298
768
+ ],
769
+ [
770
+ "▁been",
771
+ -7.205833241748351
772
+ ],
773
+ [
774
+ "ach",
775
+ -7.208229328292431
776
+ ],
777
+ [
778
+ "ance",
779
+ -7.208401352102117
780
+ ],
781
+ [
782
+ "com",
783
+ -7.2587815267099
784
+ ],
785
+ [
786
+ "ical",
787
+ -7.278190190681181
788
+ ],
789
+ [
790
+ "▁sta",
791
+ -7.296280193027185
792
+ ],
793
+ [
794
+ "▁make",
795
+ -7.29708015563161
796
+ ],
797
+ [
798
+ "man",
799
+ -7.297357070474005
800
+ ],
801
+ [
802
+ "▁pu",
803
+ -7.301110944798587
804
+ ],
805
+ [
806
+ "▁car",
807
+ -7.307416744513656
808
+ ],
809
+ [
810
+ "▁think",
811
+ -7.315076410486423
812
+ ],
813
+ [
814
+ "gra",
815
+ -7.327973901718588
816
+ ],
817
+ [
818
+ "▁even",
819
+ -7.32966745377299
820
+ ],
821
+ [
822
+ "▁now",
823
+ -7.334324859615652
824
+ ],
825
+ [
826
+ "▁want",
827
+ -7.338644082597584
828
+ ],
829
+ [
830
+ "▁bu",
831
+ -7.340994934917264
832
+ ],
833
+ [
834
+ "▁over",
835
+ -7.357011041337033
836
+ ],
837
+ [
838
+ "▁way",
839
+ -7.358907442350619
840
+ ],
841
+ [
842
+ "▁into",
843
+ -7.361535232495502
844
+ ],
845
+ [
846
+ "ction",
847
+ -7.370624313735979
848
+ ],
849
+ [
850
+ "▁res",
851
+ -7.370897459814936
852
+ ],
853
+ [
854
+ "tter",
855
+ -7.372367561211412
856
+ ],
857
+ [
858
+ "▁la",
859
+ -7.373978735498756
860
+ ],
861
+ [
862
+ "ful",
863
+ -7.374168968662783
864
+ ],
865
+ [
866
+ "▁because",
867
+ -7.374226327340226
868
+ ],
869
+ [
870
+ "▁nine",
871
+ -7.377940224587508
872
+ ],
873
+ [
874
+ "ell",
875
+ -7.381683580843701
876
+ ],
877
+ [
878
+ "he",
879
+ -7.385884132318935
880
+ ],
881
+ [
882
+ "▁li",
883
+ -7.386083738518799
884
+ ],
885
+ [
886
+ "▁could",
887
+ -7.387987936442451
888
+ ],
889
+ [
890
+ "ence",
891
+ -7.401823391179242
892
+ ],
893
+ [
894
+ "▁very",
895
+ -7.408068976933201
896
+ ],
897
+ [
898
+ "▁ar",
899
+ -7.416235924919299
900
+ ],
901
+ [
902
+ "▁us",
903
+ -7.421022762491463
904
+ ],
905
+ [
906
+ "▁them",
907
+ -7.439007960826135
908
+ ],
909
+ [
910
+ "ze",
911
+ -7.442237617833349
912
+ ],
913
+ [
914
+ "ally",
915
+ -7.445673208127969
916
+ ],
917
+ [
918
+ "und",
919
+ -7.449077104870021
920
+ ],
921
+ [
922
+ "▁look",
923
+ -7.465920105389401
924
+ ],
925
+ [
926
+ "ving",
927
+ -7.476701160870164
928
+ ],
929
+ [
930
+ "▁use",
931
+ -7.48955001759129
932
+ ],
933
+ [
934
+ "▁need",
935
+ -7.507395353897278
936
+ ],
937
+ [
938
+ "▁most",
939
+ -7.508463507233227
940
+ ],
941
+ [
942
+ "ang",
943
+ -7.512544632371613
944
+ ],
945
+ [
946
+ "▁every",
947
+ -7.5175958100441775
948
+ ],
949
+ [
950
+ "qui",
951
+ -7.522443160037307
952
+ ],
953
+ [
954
+ "▁any",
955
+ -7.523800340825597
956
+ ],
957
+ [
958
+ "▁bi",
959
+ -7.526209973364402
960
+ ],
961
+ [
962
+ "▁cu",
963
+ -7.532985493152729
964
+ ],
965
+ [
966
+ "ill",
967
+ -7.542831181363784
968
+ ],
969
+ [
970
+ "▁only",
971
+ -7.542967005319641
972
+ ],
973
+ [
974
+ "▁its",
975
+ -7.548158954564306
976
+ ],
977
+ [
978
+ "▁take",
979
+ -7.549454035649404
980
+ ],
981
+ [
982
+ "▁day",
983
+ -7.552893231103521
984
+ ],
985
+ [
986
+ "▁part",
987
+ -7.555055024187226
988
+ ],
989
+ [
990
+ "▁back",
991
+ -7.556383210659444
992
+ ],
993
+ [
994
+ "▁three",
995
+ -7.557901106933526
996
+ ],
997
+ [
998
+ "▁going",
999
+ -7.5613480049101724
1000
+ ],
1001
+ [
1002
+ "ever",
1003
+ -7.562036821423801
1004
+ ],
1005
+ [
1006
+ "▁years",
1007
+ -7.562332353467333
1008
+ ],
1009
+ [
1010
+ "▁also",
1011
+ -7.563420504567638
1012
+ ],
1013
+ [
1014
+ "▁these",
1015
+ -7.563445901010152
1016
+ ],
1017
+ [
1018
+ "▁world",
1019
+ -7.565852728256367
1020
+ ],
1021
+ [
1022
+ "▁jo",
1023
+ -7.566271341802093
1024
+ ],
1025
+ [
1026
+ "for",
1027
+ -7.573909200353226
1028
+ ],
1029
+ [
1030
+ "ated",
1031
+ -7.57459569855356
1032
+ ],
1033
+ [
1034
+ "▁where",
1035
+ -7.582311490247024
1036
+ ],
1037
+ [
1038
+ "▁app",
1039
+ -7.5834076570982205
1040
+ ],
1041
+ [
1042
+ "ble",
1043
+ -7.62788523968794
1044
+ ],
1045
+ [
1046
+ "▁five",
1047
+ -7.631767928016332
1048
+ ],
1049
+ [
1050
+ "▁many",
1051
+ -7.635697451452419
1052
+ ],
1053
+ [
1054
+ "▁rec",
1055
+ -7.637686462277024
1056
+ ],
1057
+ [
1058
+ "▁first",
1059
+ -7.6444343801308445
1060
+ ],
1061
+ [
1062
+ "▁much",
1063
+ -7.644436171280633
1064
+ ],
1065
+ [
1066
+ "▁good",
1067
+ -7.657353544125234
1068
+ ],
1069
+ [
1070
+ "▁don't",
1071
+ -7.676856549107045
1072
+ ],
1073
+ [
1074
+ "▁ga",
1075
+ -7.692179576415018
1076
+ ],
1077
+ [
1078
+ "▁six",
1079
+ -7.692693565978267
1080
+ ],
1081
+ [
1082
+ "q",
1083
+ -7.69649380773518
1084
+ ],
1085
+ [
1086
+ "rac",
1087
+ -7.700769840354177
1088
+ ],
1089
+ [
1090
+ "▁him",
1091
+ -7.708060244213311
1092
+ ],
1093
+ [
1094
+ "▁may",
1095
+ -7.710511654127949
1096
+ ],
1097
+ [
1098
+ "▁pri",
1099
+ -7.722091815046573
1100
+ ],
1101
+ [
1102
+ "▁come",
1103
+ -7.723292409553091
1104
+ ],
1105
+ [
1106
+ "▁those",
1107
+ -7.724939069549478
1108
+ ],
1109
+ [
1110
+ "▁play",
1111
+ -7.728249278606013
1112
+ ],
1113
+ [
1114
+ "ster",
1115
+ -7.729194989675049
1116
+ ],
1117
+ [
1118
+ "▁life",
1119
+ -7.739615405532492
1120
+ ],
1121
+ [
1122
+ "led",
1123
+ -7.741277900326596
1124
+ ],
1125
+ [
1126
+ "▁mu",
1127
+ -7.743922966825705
1128
+ ],
1129
+ [
1130
+ "ries",
1131
+ -7.744512973191551
1132
+ ],
1133
+ [
1134
+ "▁four",
1135
+ -7.75524306448834
1136
+ ],
1137
+ [
1138
+ "mer",
1139
+ -7.7593768514509325
1140
+ ],
1141
+ [
1142
+ "lic",
1143
+ -7.75981012454886
1144
+ ],
1145
+ [
1146
+ "▁after",
1147
+ -7.771154353039117
1148
+ ],
1149
+ [
1150
+ "ress",
1151
+ -7.772231241897565
1152
+ ],
1153
+ [
1154
+ "▁eight",
1155
+ -7.775065267931256
1156
+ ],
1157
+ [
1158
+ "▁really",
1159
+ -7.782333664230928
1160
+ ],
1161
+ [
1162
+ "▁year",
1163
+ -7.78882010054649
1164
+ ],
1165
+ [
1166
+ "rate",
1167
+ -7.790966895572771
1168
+ ],
1169
+ [
1170
+ "▁well",
1171
+ -7.795038016798317
1172
+ ],
1173
+ [
1174
+ "▁rel",
1175
+ -7.800763301209928
1176
+ ],
1177
+ [
1178
+ "ugh",
1179
+ -7.805080671564257
1180
+ ],
1181
+ [
1182
+ "▁long",
1183
+ -7.814635923692185
1184
+ ],
1185
+ [
1186
+ "▁through",
1187
+ -7.829704400466484
1188
+ ],
1189
+ [
1190
+ "▁seven",
1191
+ -7.836407871710437
1192
+ ],
1193
+ [
1194
+ "▁down",
1195
+ -7.836418492278405
1196
+ ],
1197
+ [
1198
+ "▁right",
1199
+ -7.858358391858484
1200
+ ],
1201
+ [
1202
+ "▁gu",
1203
+ -7.860403374216327
1204
+ ],
1205
+ [
1206
+ "▁should",
1207
+ -7.878607017374009
1208
+ ],
1209
+ [
1210
+ "▁show",
1211
+ -7.891284436630825
1212
+ ],
1213
+ [
1214
+ "cent",
1215
+ -7.898714301504921
1216
+ ],
1217
+ [
1218
+ "▁imp",
1219
+ -7.900788492216909
1220
+ ],
1221
+ [
1222
+ "low",
1223
+ -7.905991712097469
1224
+ ],
1225
+ [
1226
+ "port",
1227
+ -7.914066728567359
1228
+ ],
1229
+ [
1230
+ "line",
1231
+ -7.920416602299412
1232
+ ],
1233
+ [
1234
+ "▁twenty",
1235
+ -7.933102379942531
1236
+ ],
1237
+ [
1238
+ "▁inter",
1239
+ -7.933146540231567
1240
+ ],
1241
+ [
1242
+ "▁point",
1243
+ -7.947713293554536
1244
+ ],
1245
+ [
1246
+ "▁though",
1247
+ -7.950963442024109
1248
+ ],
1249
+ [
1250
+ "▁help",
1251
+ -7.953395760731336
1252
+ ],
1253
+ [
1254
+ "unk",
1255
+ -7.964053831789915
1256
+ ],
1257
+ [
1258
+ "land",
1259
+ -7.969005005504925
1260
+ ],
1261
+ [
1262
+ "late",
1263
+ -7.974607980796549
1264
+ ],
1265
+ [
1266
+ "▁high",
1267
+ -7.979294411944048
1268
+ ],
1269
+ [
1270
+ "hol",
1271
+ -7.984142148598163
1272
+ ],
1273
+ [
1274
+ "▁something",
1275
+ -7.9879300163128
1276
+ ],
1277
+ [
1278
+ "▁start",
1279
+ -7.991521449712684
1280
+ ],
1281
+ [
1282
+ "▁great",
1283
+ -7.995695921767441
1284
+ ],
1285
+ [
1286
+ "▁did",
1287
+ -7.995769657368484
1288
+ ],
1289
+ [
1290
+ "▁own",
1291
+ -7.998997362615395
1292
+ ],
1293
+ [
1294
+ "▁still",
1295
+ -8.004415671326601
1296
+ ],
1297
+ [
1298
+ "▁give",
1299
+ -8.007682726112007
1300
+ ],
1301
+ [
1302
+ "▁change",
1303
+ -8.043547517434018
1304
+ ],
1305
+ [
1306
+ "▁live",
1307
+ -8.045269835331231
1308
+ ],
1309
+ [
1310
+ "▁mean",
1311
+ -8.051337748483258
1312
+ ],
1313
+ [
1314
+ "▁ten",
1315
+ -8.056860573632996
1316
+ ],
1317
+ [
1318
+ "ions",
1319
+ -8.056903676226458
1320
+ ],
1321
+ [
1322
+ "▁feel",
1323
+ -8.058066665774508
1324
+ ],
1325
+ [
1326
+ "dent",
1327
+ -8.06676117088663
1328
+ ],
1329
+ [
1330
+ "▁plan",
1331
+ -8.07103653206943
1332
+ ],
1333
+ [
1334
+ "▁around",
1335
+ -8.074900212064017
1336
+ ],
1337
+ [
1338
+ "▁again",
1339
+ -8.0888985558174
1340
+ ],
1341
+ [
1342
+ "ked",
1343
+ -8.090024710382108
1344
+ ],
1345
+ [
1346
+ "▁i'm",
1347
+ -8.092715477262882
1348
+ ],
1349
+ [
1350
+ "▁win",
1351
+ -8.100661165025922
1352
+ ],
1353
+ [
1354
+ "▁before",
1355
+ -8.104489788124138
1356
+ ],
1357
+ [
1358
+ "▁place",
1359
+ -8.104619689156994
1360
+ ],
1361
+ [
1362
+ "▁find",
1363
+ -8.114301047518083
1364
+ ],
1365
+ [
1366
+ "▁rep",
1367
+ -8.12231514633415
1368
+ ],
1369
+ [
1370
+ "▁old",
1371
+ -8.123425447706962
1372
+ ],
1373
+ [
1374
+ "que",
1375
+ -8.124806078291185
1376
+ ],
1377
+ [
1378
+ "▁home",
1379
+ -8.138270977097749
1380
+ ],
1381
+ [
1382
+ "▁same",
1383
+ -8.146767026460116
1384
+ ],
1385
+ [
1386
+ "▁made",
1387
+ -8.146986846010172
1388
+ ],
1389
+ [
1390
+ "ities",
1391
+ -8.150760621978183
1392
+ ],
1393
+ [
1394
+ "▁gene",
1395
+ -8.153271792321533
1396
+ ],
1397
+ [
1398
+ "▁little",
1399
+ -8.15718869620029
1400
+ ],
1401
+ [
1402
+ "▁never",
1403
+ -8.158766011413457
1404
+ ],
1405
+ [
1406
+ "▁add",
1407
+ -8.16085152557754
1408
+ ],
1409
+ [
1410
+ "▁dec",
1411
+ -8.162053869836694
1412
+ ],
1413
+ [
1414
+ "▁such",
1415
+ -8.166487774779094
1416
+ ],
1417
+ [
1418
+ "▁real",
1419
+ -8.170648079845975
1420
+ ],
1421
+ [
1422
+ "<",
1423
+ -8.174546931075485
1424
+ ],
1425
+ [
1426
+ ">",
1427
+ -8.174546931075485
1428
+ ],
1429
+ [
1430
+ "▁different",
1431
+ -8.177401003173872
1432
+ ],
1433
+ [
1434
+ "▁america",
1435
+ -8.195006004980684
1436
+ ],
1437
+ [
1438
+ "▁percent",
1439
+ -8.203669636244909
1440
+ ],
1441
+ [
1442
+ "▁happen",
1443
+ -8.217603938472326
1444
+ ],
1445
+ [
1446
+ "▁person",
1447
+ -8.220095360187257
1448
+ ],
1449
+ [
1450
+ "▁try",
1451
+ -8.221193651201803
1452
+ ],
1453
+ [
1454
+ "▁problem",
1455
+ -8.227408597224214
1456
+ ],
1457
+ [
1458
+ "▁war",
1459
+ -8.230954901256155
1460
+ ],
1461
+ [
1462
+ "▁hand",
1463
+ -8.25461086237215
1464
+ ],
1465
+ [
1466
+ "▁few",
1467
+ -8.255428618052218
1468
+ ],
1469
+ [
1470
+ "▁under",
1471
+ -8.259510957523522
1472
+ ],
1473
+ [
1474
+ "▁might",
1475
+ -8.259548828010725
1476
+ ],
1477
+ [
1478
+ "▁why",
1479
+ -8.266548799381447
1480
+ ],
1481
+ [
1482
+ "▁far",
1483
+ -8.27300403385749
1484
+ ],
1485
+ [
1486
+ "▁another",
1487
+ -8.27518811644013
1488
+ ],
1489
+ [
1490
+ "▁while",
1491
+ -8.27733269757505
1492
+ ],
1493
+ [
1494
+ "▁children",
1495
+ -8.278018743726054
1496
+ ],
1497
+ [
1498
+ "▁turn",
1499
+ -8.295468236043435
1500
+ ],
1501
+ [
1502
+ "▁hard",
1503
+ -8.319579651267668
1504
+ ],
1505
+ [
1506
+ "▁school",
1507
+ -8.32487400790388
1508
+ ],
1509
+ [
1510
+ "▁system",
1511
+ -8.334528863425353
1512
+ ],
1513
+ [
1514
+ "▁fact",
1515
+ -8.340618576727207
1516
+ ],
1517
+ [
1518
+ "ship",
1519
+ -8.356785263078967
1520
+ ],
1521
+ [
1522
+ "▁fun",
1523
+ -8.357450339001218
1524
+ ],
1525
+ [
1526
+ "▁found",
1527
+ -8.357664922206965
1528
+ ],
1529
+ [
1530
+ "▁talk",
1531
+ -8.360321901371405
1532
+ ],
1533
+ [
1534
+ "▁always",
1535
+ -8.362459653054767
1536
+ ],
1537
+ [
1538
+ "▁water",
1539
+ -8.366181844042668
1540
+ ],
1541
+ [
1542
+ "▁kind",
1543
+ -8.370441708712258
1544
+ ],
1545
+ [
1546
+ "▁power",
1547
+ -8.407352983466403
1548
+ ],
1549
+ [
1550
+ "serv",
1551
+ -8.41725719465188
1552
+ ],
1553
+ [
1554
+ "▁human",
1555
+ -8.422197729087955
1556
+ ],
1557
+ [
1558
+ "▁thirty",
1559
+ -8.424889240542301
1560
+ ],
1561
+ [
1562
+ "▁move",
1563
+ -8.425313917158078
1564
+ ],
1565
+ [
1566
+ "▁develop",
1567
+ -8.432217995201656
1568
+ ],
1569
+ [
1570
+ "▁country",
1571
+ -8.437154694153362
1572
+ ],
1573
+ [
1574
+ "bility",
1575
+ -8.442062543843866
1576
+ ],
1577
+ [
1578
+ "▁trans",
1579
+ -8.445491134571299
1580
+ ],
1581
+ [
1582
+ "▁keep",
1583
+ -8.447121538590643
1584
+ ],
1585
+ [
1586
+ "▁between",
1587
+ -8.450074712109995
1588
+ ],
1589
+ [
1590
+ "▁question",
1591
+ -8.451327047455067
1592
+ ],
1593
+ [
1594
+ "▁blo",
1595
+ -8.457137199160451
1596
+ ],
1597
+ [
1598
+ "▁small",
1599
+ -8.464488253220344
1600
+ ],
1601
+ [
1602
+ "▁both",
1603
+ -8.465391170838785
1604
+ ],
1605
+ [
1606
+ "▁money",
1607
+ -8.471480248618423
1608
+ ],
1609
+ [
1610
+ "▁important",
1611
+ -8.474535449814985
1612
+ ],
1613
+ [
1614
+ "▁women",
1615
+ -8.488463151090526
1616
+ ],
1617
+ [
1618
+ "▁next",
1619
+ -8.499226729264011
1620
+ ],
1621
+ [
1622
+ "▁fifty",
1623
+ -8.508940876979532
1624
+ ],
1625
+ [
1626
+ "ality",
1627
+ -8.518162563343084
1628
+ ],
1629
+ [
1630
+ "▁we're",
1631
+ -8.52363471403348
1632
+ ],
1633
+ [
1634
+ "▁friend",
1635
+ -8.529359417835353
1636
+ ],
1637
+ [
1638
+ "▁family",
1639
+ -8.535293339824523
1640
+ ],
1641
+ [
1642
+ "▁without",
1643
+ -8.537235506300188
1644
+ ],
1645
+ [
1646
+ "▁away",
1647
+ -8.53847100828701
1648
+ ],
1649
+ [
1650
+ "▁build",
1651
+ -8.538871941416147
1652
+ ],
1653
+ [
1654
+ "▁lead",
1655
+ -8.541724089954
1656
+ ],
1657
+ [
1658
+ "▁today",
1659
+ -8.55651278427627
1660
+ ],
1661
+ [
1662
+ "▁number",
1663
+ -8.558202484196897
1664
+ ],
1665
+ [
1666
+ "▁large",
1667
+ -8.564258756492888
1668
+ ],
1669
+ [
1670
+ "▁health",
1671
+ -8.565300531106974
1672
+ ],
1673
+ [
1674
+ "▁learn",
1675
+ -8.567104799745978
1676
+ ],
1677
+ [
1678
+ "▁believe",
1679
+ -8.577380612888355
1680
+ ],
1681
+ [
1682
+ "▁face",
1683
+ -8.578121546300306
1684
+ ],
1685
+ [
1686
+ "ption",
1687
+ -8.585144346347152
1688
+ ],
1689
+ [
1690
+ "▁free",
1691
+ -8.592213001257285
1692
+ ],
1693
+ [
1694
+ "▁book",
1695
+ -8.599140662214907
1696
+ ],
1697
+ [
1698
+ "▁house",
1699
+ -8.602072174491207
1700
+ ],
1701
+ [
1702
+ "▁business",
1703
+ -8.603458120072421
1704
+ ],
1705
+ [
1706
+ "▁open",
1707
+ -8.624533589738139
1708
+ ],
1709
+ [
1710
+ "▁you're",
1711
+ -8.648211923200762
1712
+ ],
1713
+ [
1714
+ "▁didn't",
1715
+ -8.650732869456244
1716
+ ],
1717
+ [
1718
+ "▁research",
1719
+ -8.654318581492863
1720
+ ],
1721
+ [
1722
+ "▁government",
1723
+ -8.659900246962529
1724
+ ],
1725
+ [
1726
+ "▁enough",
1727
+ -8.66126420220329
1728
+ ],
1729
+ [
1730
+ "▁market",
1731
+ -8.667470844760478
1732
+ ],
1733
+ [
1734
+ "▁experience",
1735
+ -8.668982145927794
1736
+ ],
1737
+ [
1738
+ "▁course",
1739
+ -8.669777377978138
1740
+ ],
1741
+ [
1742
+ "▁second",
1743
+ -8.70072992421416
1744
+ ],
1745
+ [
1746
+ "▁create",
1747
+ -8.701429428455526
1748
+ ],
1749
+ [
1750
+ "▁together",
1751
+ -8.705533541005925
1752
+ ],
1753
+ [
1754
+ "▁product",
1755
+ -8.707952333543433
1756
+ ],
1757
+ [
1758
+ "▁month",
1759
+ -8.712667102719214
1760
+ ],
1761
+ [
1762
+ "▁understand",
1763
+ -8.714626952165384
1764
+ ],
1765
+ [
1766
+ "▁group",
1767
+ -8.71962233503082
1768
+ ],
1769
+ [
1770
+ "▁hope",
1771
+ -8.727612049816129
1772
+ ],
1773
+ [
1774
+ "▁word",
1775
+ -8.738163070498102
1776
+ ],
1777
+ [
1778
+ "▁actually",
1779
+ -8.739409587339791
1780
+ ],
1781
+ [
1782
+ "▁million",
1783
+ -8.741440102156789
1784
+ ],
1785
+ [
1786
+ "▁public",
1787
+ -8.742966764345946
1788
+ ],
1789
+ [
1790
+ "▁food",
1791
+ -8.752893623113769
1792
+ ],
1793
+ [
1794
+ "▁effect",
1795
+ -8.757232196017496
1796
+ ],
1797
+ [
1798
+ "▁design",
1799
+ -8.761882269915368
1800
+ ],
1801
+ [
1802
+ "▁level",
1803
+ -8.804900237478849
1804
+ ],
1805
+ [
1806
+ "▁reason",
1807
+ -8.81582996548847
1808
+ ],
1809
+ [
1810
+ "▁result",
1811
+ -8.816553476957239
1812
+ ],
1813
+ [
1814
+ "▁everything",
1815
+ -8.818899965733245
1816
+ ],
1817
+ [
1818
+ "▁direct",
1819
+ -8.836863579748083
1820
+ ],
1821
+ [
1822
+ "▁they're",
1823
+ -8.83926152871539
1824
+ ],
1825
+ [
1826
+ "▁story",
1827
+ -8.848157809410482
1828
+ ],
1829
+ [
1830
+ "▁watch",
1831
+ -8.856317693526314
1832
+ ],
1833
+ [
1834
+ "▁process",
1835
+ -8.864285937562885
1836
+ ],
1837
+ [
1838
+ "▁certain",
1839
+ -8.864810258454877
1840
+ ],
1841
+ [
1842
+ "▁moment",
1843
+ -8.874608010450416
1844
+ ],
1845
+ [
1846
+ "▁student",
1847
+ -8.891495076518085
1848
+ ],
1849
+ [
1850
+ "▁future",
1851
+ -8.903920388479653
1852
+ ],
1853
+ [
1854
+ "▁space",
1855
+ -8.907814016098664
1856
+ ],
1857
+ [
1858
+ "▁whether",
1859
+ -8.913050460769435
1860
+ ],
1861
+ [
1862
+ "▁anything",
1863
+ -8.91536679338011
1864
+ ],
1865
+ [
1866
+ "▁control",
1867
+ -8.919573217710811
1868
+ ],
1869
+ [
1870
+ "▁almost",
1871
+ -8.946550058174427
1872
+ ],
1873
+ [
1874
+ "▁support",
1875
+ -8.951967867236133
1876
+ ],
1877
+ [
1878
+ "▁walk",
1879
+ -8.955584246502465
1880
+ ],
1881
+ [
1882
+ "▁doesn't",
1883
+ -8.963873365007103
1884
+ ],
1885
+ [
1886
+ "▁information",
1887
+ -8.968889128612775
1888
+ ],
1889
+ [
1890
+ "▁social",
1891
+ -8.971342303562146
1892
+ ],
1893
+ [
1894
+ "▁follow",
1895
+ -8.974468157141692
1896
+ ],
1897
+ [
1898
+ "▁until",
1899
+ -8.990321616601868
1900
+ ],
1901
+ [
1902
+ "▁example",
1903
+ -9.001875521562733
1904
+ ],
1905
+ [
1906
+ "▁difficult",
1907
+ -9.016530669785704
1908
+ ],
1909
+ [
1910
+ "▁already",
1911
+ -9.017723103965803
1912
+ ],
1913
+ [
1914
+ "▁expect",
1915
+ -9.021784726096689
1916
+ ],
1917
+ [
1918
+ "▁energy",
1919
+ -9.024561047592892
1920
+ ],
1921
+ [
1922
+ "▁success",
1923
+ -9.028600208851312
1924
+ ],
1925
+ [
1926
+ "▁minute",
1927
+ -9.03079542531688
1928
+ ],
1929
+ [
1930
+ "▁europe",
1931
+ -9.047719522871844
1932
+ ],
1933
+ [
1934
+ "▁probably",
1935
+ -9.04821121326538
1936
+ ],
1937
+ [
1938
+ "▁project",
1939
+ -9.050811914901368
1940
+ ],
1941
+ [
1942
+ "▁sometimes",
1943
+ -9.0532715213384
1944
+ ],
1945
+ [
1946
+ "▁photo",
1947
+ -9.059860032471333
1948
+ ],
1949
+ [
1950
+ "▁patient",
1951
+ -9.0753960063218
1952
+ ],
1953
+ [
1954
+ "▁across",
1955
+ -9.081675876568868
1956
+ ],
1957
+ [
1958
+ "▁particular",
1959
+ -9.088228568291068
1960
+ ],
1961
+ [
1962
+ "▁possible",
1963
+ -9.095938491890522
1964
+ ],
1965
+ [
1966
+ "vision",
1967
+ -9.105540231265389
1968
+ ],
1969
+ [
1970
+ "▁technology",
1971
+ -9.151043704411457
1972
+ ],
1973
+ [
1974
+ "▁environment",
1975
+ -9.159697884475367
1976
+ ],
1977
+ [
1978
+ "▁political",
1979
+ -9.167264556603442
1980
+ ],
1981
+ [
1982
+ "▁themselves",
1983
+ -9.176977040696697
1984
+ ],
1985
+ [
1986
+ "position",
1987
+ -9.204917593191968
1988
+ ],
1989
+ [
1990
+ "▁strong",
1991
+ -9.205733742263194
1992
+ ],
1993
+ [
1994
+ "▁remember",
1995
+ -9.206030914810103
1996
+ ],
1997
+ [
1998
+ "▁character",
1999
+ -9.209911780520676
2000
+ ],
2001
+ [
2002
+ "▁picture",
2003
+ -9.223497104806162
2004
+ ],
2005
+ [
2006
+ "▁wonder",
2007
+ -9.231224767871154
2008
+ ],
2009
+ [
2010
+ "▁community",
2011
+ -9.241375579372445
2012
+ ],
2013
+ [
2014
+ "▁perhaps",
2015
+ -9.253591058587723
2016
+ ],
2017
+ [
2018
+ "▁economic",
2019
+ -9.25473708228169
2020
+ ],
2021
+ [
2022
+ "▁global",
2023
+ -9.257818271485332
2024
+ ],
2025
+ [
2026
+ "▁challenge",
2027
+ -9.258951607073564
2028
+ ],
2029
+ [
2030
+ "▁individual",
2031
+ -9.297649240927932
2032
+ ],
2033
+ [
2034
+ "▁suggest",
2035
+ -9.299664904893856
2036
+ ],
2037
+ [
2038
+ "▁natural",
2039
+ -9.306034554769449
2040
+ ],
2041
+ [
2042
+ "▁special",
2043
+ -9.344672135415562
2044
+ ],
2045
+ [
2046
+ "▁difference",
2047
+ -9.372803643965131
2048
+ ],
2049
+ [
2050
+ "▁especially",
2051
+ -9.410608286507571
2052
+ ],
2053
+ [
2054
+ "▁tradition",
2055
+ -9.461990845165564
2056
+ ],
2057
+ [
2058
+ "▁although",
2059
+ -9.471896386211816
2060
+ ],
2061
+ [
2062
+ "▁economy",
2063
+ -9.48714940532035
2064
+ ],
2065
+ [
2066
+ "▁potential",
2067
+ -9.555847106305508
2068
+ ],
2069
+ [
2070
+ "▁opportunity",
2071
+ -9.567421441451735
2072
+ ],
2073
+ [
2074
+ "▁university",
2075
+ -9.67815386352216
2076
+ ],
2077
+ [
2078
+ "▁significant",
2079
+ -9.941828751919749
2080
+ ],
2081
+ [
2082
+ "0",
2083
+ -13.07732294159094
2084
+ ],
2085
+ [
2086
+ "1",
2087
+ -13.452847421598856
2088
+ ],
2089
+ [
2090
+ "2",
2091
+ -13.656091927467784
2092
+ ],
2093
+ [
2094
+ "9",
2095
+ -14.171178770504888
2096
+ ],
2097
+ [
2098
+ "[",
2099
+ -14.34711496370378
2100
+ ],
2101
+ [
2102
+ "]",
2103
+ -14.378726769576598
2104
+ ],
2105
+ [
2106
+ "3",
2107
+ -14.454520550807786
2108
+ ],
2109
+ [
2110
+ "5",
2111
+ -14.675697303185675
2112
+ ],
2113
+ [
2114
+ "8",
2115
+ -14.707636103260423
2116
+ ],
2117
+ [
2118
+ "$",
2119
+ -15.036026851076924
2120
+ ],
2121
+ [
2122
+ "4",
2123
+ -15.036026851076926
2124
+ ],
2125
+ [
2126
+ "7",
2127
+ -15.187832394806914
2128
+ ],
2129
+ [
2130
+ "6",
2131
+ -15.187832394806914
2132
+ ],
2133
+ [
2134
+ "&",
2135
+ -15.635142021469548
2136
+ ],
2137
+ [
2138
+ "+",
2139
+ -17.41148545414652
2140
+ ],
2141
+ [
2142
+ "=",
2143
+ -17.612080692241786
2144
+ ],
2145
+ [
2146
+ "#",
2147
+ -17.863246193407704
2148
+ ],
2149
+ [
2150
+ "%",
2151
+ -18.342214447428272
2152
+ ],
2153
+ [
2154
+ "@",
2155
+ -18.958881114094936
2156
+ ],
2157
+ [
2158
+ "^",
2159
+ -19.79221444742827
2160
+ ],
2161
+ [
2162
+ "*",
2163
+ -20.79221444742827
2164
+ ],
2165
+ [
2166
+ "\\",
2167
+ -20.79221444742827
2168
+ ]
2169
+ ]
2170
+ }
2171
+ }
tokenizer_config.json ADDED
@@ -0,0 +1,10 @@
+ {
+ "bos_token": "<s>",
+ "clean_up_tokenization_spaces": true,
+ "eos_token": "</s>",
+ "mask_token": "<mask>",
+ "model_max_length": 1000000000000000019884624838656,
+ "pad_token": "<pad>",
+ "tokenizer_class": "PreTrainedTokenizerFast",
+ "unk_token": "<unk>"
+ }