AbstractPhil committed
Commit 5caf138 · verified · 1 Parent(s): 386700f

Create constellation_a_test_trainer_cifar10_output.txt

constellation_a_test_trainer_cifar10_output.txt ADDED
@@ -0,0 +1,223 @@
+ ============================================================
+ GeoLIP Core - Conv + ConstellationCore
+ Encoder: 6-layer conv → 256-d sphere
+ Constellation: 64 anchors, 8×64 patchwork
+ Activation: squared_relu
+ Loss: CE + InfoNCE + attract + CV(0.22) + spread
+ Batch: 256, LR: 0.0003, Epochs: 100
+ Push: every 100 batches, lr=0.1
+ Device: cuda
+ ============================================================
+ Train: 50,000 Val: 10,000
+ Parameters: 1,706,058 (encoder: 1,213,504, core: 492,554)
+
+ ============================================================
+ TRAINING - 100 epochs
+ ============================================================
+ E 1/100: 100%|██████████| 195/195 [00:10<00:00, 18.35b/s, acc=23%, cos=0.340, loss=4.4095, nce=0.63, ordered=1, push=1]
+ E 1: train=23.2% val=42.8% loss=4.3573 nce=0.63 cos=0.271 cv=0.2052(✓) anch=36/64 push=1 (11s) ★
+ E 2/100: 100%|██████████| 195/195 [00:10<00:00, 18.53b/s, acc=45%, cos=0.393, loss=2.0887, nce=0.97, ordered=1, push=3]
+ E 2: train=45.5% val=49.6% loss=2.0830 nce=0.97 cos=0.388 cv=0.2057(✓) anch=41/64 push=3 (11s) ★
+ E 3/100: 100%|██████████| 195/195 [00:10<00:00, 18.62b/s, acc=55%, cos=0.486, loss=1.6869, nce=0.99, ordered=1, push=5]
+ E 3: train=55.2% val=57.5% loss=1.6840 nce=0.99 cos=0.449 cv=0.2135(✓) anch=34/64 push=5 (10s) ★
+ E 4/100: 100%|██████████| 195/195 [00:10<00:00, 18.58b/s, acc=62%, cos=0.520, loss=1.4310, nce=1.00, ordered=1, push=7]
+ E 4: train=62.0% val=65.3% loss=1.4285 nce=1.00 cos=0.503 cv=0.1861(✓) anch=45/64 push=7 (10s) ★
+ E 5/100: 100%|██████████| 195/195 [00:10<00:00, 18.70b/s, acc=66%, cos=0.543, loss=1.2767, nce=1.00, ordered=1, push=9]
+ E 5: train=66.4% val=68.6% loss=1.2743 nce=1.00 cos=0.533 cv=0.1827(✓) anch=49/64 push=9 (10s) ★
+ E 6/100: 100%|██████████| 195/195 [00:10<00:00, 18.48b/s, acc=70%, cos=0.539, loss=1.1631, nce=1.00, ordered=1, push=11]
+ E 6: train=70.1% val=69.0% loss=1.1617 nce=1.00 cos=0.550 cv=0.2212(✓) anch=56/64 push=11 (11s) ★
+ E 7/100: 100%|██████████| 195/195 [00:10<00:00, 18.58b/s, acc=73%, cos=0.562, loss=1.0645, nce=1.00, ordered=1, push=13]
+ E 7: train=73.4% val=72.4% loss=1.0644 nce=1.00 cos=0.563 cv=0.1889(✓) anch=57/64 push=13 (10s) ★
+ E 8/100: 100%|██████████| 195/195 [00:10<00:00, 18.39b/s, acc=75%, cos=0.582, loss=0.9999, nce=1.00, ordered=1, push=15]
+ E 8: train=75.2% val=74.1% loss=0.9993 nce=1.00 cos=0.573 cv=0.1984(✓) anch=61/64 push=15 (11s) ★
+ E 9/100: 100%|██████████| 195/195 [00:10<00:00, 18.68b/s, acc=78%, cos=0.602, loss=0.9310, nce=1.00, ordered=1, push=17]
+ E 9: train=77.6% val=76.4% loss=0.9296 nce=1.00 cos=0.582 cv=0.1825(✓) anch=61/64 push=17 (10s) ★
+ E 10/100: 100%|██████████| 195/195 [00:10<00:00, 18.05b/s, acc=79%, cos=0.590, loss=0.8909, nce=1.00, ordered=1, push=19]
+ E 10: train=78.8% val=75.8% loss=0.8898 nce=1.00 cos=0.589 cv=0.1858(✓) anch=61/64 push=19 (11s)
+ E 11/100: 100%|██████████| 195/195 [00:10<00:00, 18.77b/s, acc=80%, cos=0.586, loss=0.8450, nce=1.00, ordered=1, push=21]
+ E 11: train=80.1% val=79.3% loss=0.8431 nce=1.00 cos=0.595 cv=0.1865(✓) anch=63/64 push=21 (10s) ★
+ E 12/100: 100%|██████████| 195/195 [00:10<00:00, 18.75b/s, acc=81%, cos=0.605, loss=0.8123, nce=1.00, ordered=1, push=23]
+ E 12: train=81.1% val=76.7% loss=0.8127 nce=1.00 cos=0.603 cv=0.2004(✓) anch=64/64 push=23 (10s)
+ E 13/100: 100%|██████████| 195/195 [00:10<00:00, 18.48b/s, acc=82%, cos=0.613, loss=0.7768, nce=1.00, ordered=1, push=25]
+ E 13: train=82.5% val=78.8% loss=0.7767 nce=1.00 cos=0.604 cv=0.1858(✓) anch=63/64 push=25 (11s)
+ E 14/100: 100%|██████████| 195/195 [00:10<00:00, 18.61b/s, acc=83%, cos=0.613, loss=0.7525, nce=1.00, ordered=1, push=27]
+ E 14: train=83.0% val=81.9% loss=0.7522 nce=1.00 cos=0.609 cv=0.1733(✗) anch=64/64 push=27 (10s) ★
+ E 15/100: 100%|██████████| 195/195 [00:10<00:00, 18.61b/s, acc=84%, cos=0.621, loss=0.7223, nce=1.00, ordered=1, push=29]
+ E 15: train=84.0% val=80.3% loss=0.7226 nce=1.00 cos=0.612 cv=0.1799(✗) anch=64/64 push=29 (10s)
+ E 16/100: 100%|██████████| 195/195 [00:10<00:00, 18.66b/s, acc=85%, cos=0.613, loss=0.6988, nce=1.00, ordered=1, push=31]
+ E 16: train=84.9% val=81.8% loss=0.6975 nce=1.00 cos=0.612 cv=0.1844(✓) anch=64/64 push=31 (10s)
+ E 17/100: 100%|██████████| 195/195 [00:10<00:00, 18.53b/s, acc=85%, cos=0.617, loss=0.6724, nce=1.00, ordered=1, push=33]
+ E 17: train=85.5% val=83.1% loss=0.6723 nce=1.00 cos=0.615 cv=0.1803(✓) anch=64/64 push=33 (11s) ★
+ E 18/100: 100%|██████████| 195/195 [00:10<00:00, 18.71b/s, acc=86%, cos=0.609, loss=0.6584, nce=1.00, ordered=1, push=35]
+ E 18: train=85.9% val=80.7% loss=0.6588 nce=1.00 cos=0.618 cv=0.1910(✓) anch=64/64 push=35 (10s)
+ E 19/100: 100%|██████████| 195/195 [00:10<00:00, 18.51b/s, acc=87%, cos=0.617, loss=0.6362, nce=1.00, ordered=1, push=37]
+ E 19: train=86.6% val=84.1% loss=0.6367 nce=1.00 cos=0.619 cv=0.1784(✗) anch=64/64 push=37 (11s) ★
+ E 20/100: 100%|██████████| 195/195 [00:10<00:00, 18.24b/s, acc=87%, cos=0.621, loss=0.6196, nce=1.00, ordered=1, push=38]
+ E 20: train=87.2% val=83.6% loss=0.6198 nce=1.00 cos=0.620 cv=0.1750(✗) anch=64/64 push=39 (11s)
+ E 21/100: 100%|██████████| 195/195 [00:10<00:00, 18.64b/s, acc=88%, cos=0.637, loss=0.6056, nce=1.00, ordered=1, push=40]
+ E 21: train=87.8% val=85.2% loss=0.6048 nce=1.00 cos=0.621 cv=0.1749(✗) anch=64/64 push=40 (10s) ★
+ E 22/100: 100%|██████████| 195/195 [00:10<00:00, 18.65b/s, acc=88%, cos=0.617, loss=0.5867, nce=1.00, ordered=1, push=42]
+ E 22: train=88.2% val=84.3% loss=0.5862 nce=1.00 cos=0.622 cv=0.1712(✗) anch=64/64 push=42 (10s)
+ E 23/100: 100%|██████████| 195/195 [00:10<00:00, 18.63b/s, acc=89%, cos=0.629, loss=0.5721, nce=1.00, ordered=1, push=44]
+ E 23: train=88.8% val=84.2% loss=0.5725 nce=1.00 cos=0.622 cv=0.1765(✗) anch=64/64 push=44 (10s)
+ E 24/100: 100%|██████████| 195/195 [00:10<00:00, 18.36b/s, acc=89%, cos=0.633, loss=0.5566, nce=1.00, ordered=1, push=46]
+ E 24: train=89.2% val=83.2% loss=0.5570 nce=1.00 cos=0.623 cv=0.1863(✓) anch=64/64 push=46 (11s)
+ E 25/100: 100%|██████████| 195/195 [00:10<00:00, 18.62b/s, acc=89%, cos=0.648, loss=0.5531, nce=1.00, ordered=1, push=48]
+ E 25: train=89.2% val=86.3% loss=0.5515 nce=1.00 cos=0.624 cv=0.1615(✗) anch=64/64 push=48 (10s) ★
+ E 26/100: 100%|██████████| 195/195 [00:10<00:00, 18.60b/s, acc=90%, cos=0.633, loss=0.5364, nce=1.00, ordered=1, push=50]
+ E 26: train=90.0% val=85.6% loss=0.5358 nce=1.00 cos=0.623 cv=0.1764(✗) anch=64/64 push=50 (10s)
+ E 27/100: 100%|██████████| 195/195 [00:10<00:00, 18.56b/s, acc=90%, cos=0.617, loss=0.5244, nce=1.00, ordered=1, push=52]
+ E 27: train=90.2% val=86.3% loss=0.5247 nce=1.00 cos=0.623 cv=0.1668(✗) anch=64/64 push=52 (11s)
+ E 28/100: 100%|██████████| 195/195 [00:10<00:00, 18.56b/s, acc=91%, cos=0.648, loss=0.5100, nce=1.00, ordered=1, push=54]
+ E 28: train=90.8% val=85.6% loss=0.5098 nce=1.00 cos=0.626 cv=0.1699(✗) anch=64/64 push=54 (11s)
+ E 29/100: 100%|██████████| 195/195 [00:10<00:00, 18.28b/s, acc=91%, cos=0.625, loss=0.5029, nce=1.00, ordered=1, push=56]
+ E 29: train=91.0% val=86.1% loss=0.5025 nce=1.00 cos=0.626 cv=0.1664(✗) anch=64/64 push=56 (11s)
+ E 30/100: 100%|██████████| 195/195 [00:10<00:00, 18.76b/s, acc=92%, cos=0.621, loss=0.4870, nce=1.00, ordered=1, push=58]
+ E 30: train=91.6% val=85.5% loss=0.4879 nce=1.00 cos=0.625 cv=0.1603(✗) anch=64/64 push=58 (10s)
+ E 31/100: 100%|██████████| 195/195 [00:10<00:00, 18.65b/s, acc=92%, cos=0.629, loss=0.4800, nce=1.00, ordered=1, push=60]
+ E 31: train=91.8% val=86.7% loss=0.4803 nce=1.00 cos=0.625 cv=0.1659(✗) anch=64/64 push=60 (10s) ★
+ E 32/100: 100%|██████████| 195/195 [00:10<00:00, 18.32b/s, acc=92%, cos=0.617, loss=0.4728, nce=1.00, ordered=1, push=62]
+ E 32: train=92.0% val=86.0% loss=0.4727 nce=1.00 cos=0.626 cv=0.1686(✗) anch=64/64 push=62 (11s)
+ E 33/100: 100%|██████████| 195/195 [00:10<00:00, 18.77b/s, acc=92%, cos=0.637, loss=0.4637, nce=1.00, ordered=1, push=64]
+ E 33: train=92.2% val=85.5% loss=0.4636 nce=1.00 cos=0.626 cv=0.1861(✓) anch=64/64 push=64 (10s)
+ E 34/100: 100%|██████████| 195/195 [00:10<00:00, 18.28b/s, acc=92%, cos=0.645, loss=0.4606, nce=1.00, ordered=1, push=66]
+ E 34: train=92.3% val=87.1% loss=0.4609 nce=1.00 cos=0.628 cv=0.1733(✗) anch=64/64 push=66 (11s) ★
+ E 35/100: 100%|██████████| 195/195 [00:10<00:00, 18.67b/s, acc=93%, cos=0.625, loss=0.4478, nce=1.00, ordered=1, push=68]
+ E 35: train=92.8% val=86.9% loss=0.4485 nce=1.00 cos=0.627 cv=0.1702(✗) anch=64/64 push=68 (10s)
+ E 36/100: 100%|██████████| 195/195 [00:10<00:00, 18.50b/s, acc=93%, cos=0.629, loss=0.4374, nce=1.00, ordered=1, push=70]
+ E 36: train=93.2% val=87.3% loss=0.4370 nce=1.00 cos=0.629 cv=0.1721(✗) anch=64/64 push=70 (11s) ★
+ E 37/100: 100%|██████████| 195/195 [00:10<00:00, 18.65b/s, acc=93%, cos=0.621, loss=0.4285, nce=1.00, ordered=1, push=72]
+ E 37: train=93.3% val=86.5% loss=0.4290 nce=1.00 cos=0.629 cv=0.1605(✗) anch=63/64 push=72 (10s)
+ E 38/100: 100%|██████████| 195/195 [00:10<00:00, 18.67b/s, acc=94%, cos=0.613, loss=0.4240, nce=1.00, ordered=1, push=74]
+ E 38: train=93.6% val=87.4% loss=0.4232 nce=1.00 cos=0.627 cv=0.1755(✗) anch=63/64 push=74 (10s) ★
+ E 39/100: 100%|██████████| 195/195 [00:10<00:00, 18.70b/s, acc=94%, cos=0.621, loss=0.4182, nce=1.00, ordered=1, push=76]
+ E 39: train=93.7% val=86.8% loss=0.4185 nce=1.00 cos=0.628 cv=0.1719(✗) anch=63/64 push=76 (10s)
+ E 40/100: 100%|██████████| 195/195 [00:10<00:00, 18.65b/s, acc=94%, cos=0.633, loss=0.4094, nce=1.00, ordered=1, push=77]
+ E 40: train=94.1% val=87.6% loss=0.4097 nce=1.00 cos=0.628 cv=0.1657(✗) anch=64/64 push=78 (10s) ★
+ E 41/100: 100%|██████████| 195/195 [00:10<00:00, 18.19b/s, acc=94%, cos=0.625, loss=0.4040, nce=1.00, ordered=1, push=79]
+ E 41: train=94.3% val=87.6% loss=0.4040 nce=1.00 cos=0.628 cv=0.1570(✗) anch=64/64 push=79 (11s) ★
+ E 42/100: 100%|██████████| 195/195 [00:10<00:00, 18.65b/s, acc=94%, cos=0.648, loss=0.3969, nce=1.00, ordered=1, push=81]
+ E 42: train=94.4% val=86.2% loss=0.3968 nce=1.00 cos=0.628 cv=0.1591(✗) anch=62/64 push=81 (10s)
+ E 43/100: 100%|██████████| 195/195 [00:10<00:00, 18.57b/s, acc=95%, cos=0.621, loss=0.3887, nce=1.00, ordered=1, push=83]
+ E 43: train=94.8% val=88.0% loss=0.3889 nce=1.00 cos=0.629 cv=0.1589(✗) anch=64/64 push=83 (11s) ★
+ E 44/100: 100%|██████████| 195/195 [00:10<00:00, 18.61b/s, acc=95%, cos=0.625, loss=0.3825, nce=1.00, ordered=1, push=85]
+ E 44: train=94.9% val=87.9% loss=0.3823 nce=1.00 cos=0.628 cv=0.1651(✗) anch=62/64 push=85 (10s)
+ E 45/100: 100%|██████████| 195/195 [00:10<00:00, 18.68b/s, acc=95%, cos=0.621, loss=0.3788, nce=1.00, ordered=1, push=87]
+ E 45: train=94.9% val=87.4% loss=0.3794 nce=1.00 cos=0.628 cv=0.1500(✗) anch=62/64 push=87 (10s)
+ E 46/100: 100%|██████████| 195/195 [00:10<00:00, 18.76b/s, acc=95%, cos=0.633, loss=0.3709, nce=1.00, ordered=1, push=89]
+ E 46: train=95.2% val=86.9% loss=0.3707 nce=1.00 cos=0.629 cv=0.1664(✗) anch=63/64 push=89 (10s)
+ E 47/100: 100%|██████████| 195/195 [00:10<00:00, 18.59b/s, acc=95%, cos=0.621, loss=0.3676, nce=1.00, ordered=1, push=91]
+ E 47: train=95.4% val=87.3% loss=0.3680 nce=1.00 cos=0.629 cv=0.1675(✗) anch=62/64 push=91 (10s)
+ E 48/100: 100%|██████████| 195/195 [00:10<00:00, 18.63b/s, acc=96%, cos=0.633, loss=0.3614, nce=1.00, ordered=1, push=93]
+ E 48: train=95.7% val=88.1% loss=0.3612 nce=1.00 cos=0.630 cv=0.1787(✗) anch=61/64 push=93 (10s) ★
+ E 49/100: 100%|██████████| 195/195 [00:10<00:00, 18.57b/s, acc=96%, cos=0.629, loss=0.3580, nce=1.00, ordered=1, push=95]
+ E 49: train=95.7% val=87.3% loss=0.3587 nce=1.00 cos=0.629 cv=0.1675(✗) anch=61/64 push=95 (11s)
+ E 50/100: 100%|██████████| 195/195 [00:10<00:00, 18.45b/s, acc=96%, cos=0.629, loss=0.3546, nce=1.00, ordered=1, push=97]
+ E 50: train=95.8% val=87.9% loss=0.3543 nce=1.00 cos=0.630 cv=0.1666(✗) anch=62/64 push=97 (11s)
+ E 51/100: 100%|██████████| 195/195 [00:10<00:00, 18.57b/s, acc=96%, cos=0.641, loss=0.3463, nce=1.00, ordered=1, push=99]
+ E 51: train=96.2% val=87.6% loss=0.3463 nce=1.00 cos=0.629 cv=0.1779(✗) anch=62/64 push=99 (10s)
+ E 52/100: 100%|██████████| 195/195 [00:10<00:00, 18.54b/s, acc=96%, cos=0.637, loss=0.3408, nce=1.00, ordered=1, push=101]
+ E 52: train=96.3% val=88.5% loss=0.3415 nce=1.00 cos=0.629 cv=0.1613(✗) anch=62/64 push=101 (11s) ★
+ E 53/100: 100%|██████████| 195/195 [00:10<00:00, 18.78b/s, acc=97%, cos=0.625, loss=0.3344, nce=1.00, ordered=1, push=103]
+ E 53: train=96.5% val=88.4% loss=0.3354 nce=1.00 cos=0.628 cv=0.1543(✗) anch=61/64 push=103 (10s)
+ E 54/100: 100%|██████████| 195/195 [00:10<00:00, 18.61b/s, acc=97%, cos=0.621, loss=0.3326, nce=1.00, ordered=1, push=105]
+ E 54: train=96.6% val=86.7% loss=0.3323 nce=1.00 cos=0.629 cv=0.1658(✗) anch=61/64 push=105 (10s)
+ E 55/100: 100%|██████████| 195/195 [00:10<00:00, 18.69b/s, acc=97%, cos=0.633, loss=0.3310, nce=1.00, ordered=1, push=107]
+ E 55: train=96.7% val=88.5% loss=0.3311 nce=1.00 cos=0.629 cv=0.1676(✗) anch=62/64 push=107 (10s)
+ E 56/100: 100%|██████████| 195/195 [00:10<00:00, 17.96b/s, acc=97%, cos=0.629, loss=0.3264, nce=1.00, ordered=1, push=109]
+ E 56: train=96.8% val=88.5% loss=0.3263 nce=1.00 cos=0.628 cv=0.1533(✗) anch=60/64 push=109 (11s)
+ E 57/100: 100%|██████████| 195/195 [00:10<00:00, 18.60b/s, acc=97%, cos=0.629, loss=0.3218, nce=1.00, ordered=1, push=111]
+ E 57: train=97.0% val=88.5% loss=0.3221 nce=1.00 cos=0.628 cv=0.1593(✗) anch=64/64 push=111 (10s)
+ E 58/100: 100%|██████████| 195/195 [00:10<00:00, 18.57b/s, acc=97%, cos=0.637, loss=0.3172, nce=1.00, ordered=1, push=113]
+ E 58: train=97.2% val=89.0% loss=0.3174 nce=1.00 cos=0.628 cv=0.1582(✗) anch=64/64 push=113 (11s) ★
+ E 59/100: 100%|██████████| 195/195 [00:11<00:00, 16.91b/s, acc=97%, cos=0.633, loss=0.3149, nce=1.00, ordered=1, push=115]
+ E 59: train=97.1% val=88.4% loss=0.3151 nce=1.00 cos=0.628 cv=0.1496(✗) anch=62/64 push=115 (12s)
+ E 60/100: 100%|██████████| 195/195 [00:11<00:00, 17.72b/s, acc=98%, cos=0.629, loss=0.3079, nce=1.00, ordered=1, push=116]
+ E 60: train=97.6% val=88.5% loss=0.3079 nce=1.00 cos=0.627 cv=0.1644(✗) anch=60/64 push=117 (11s)
+ E 61/100: 100%|██████████| 195/195 [00:10<00:00, 17.73b/s, acc=97%, cos=0.609, loss=0.3066, nce=1.00, ordered=1, push=118]
+ E 61: train=97.5% val=89.4% loss=0.3066 nce=1.00 cos=0.628 cv=0.1601(✗) anch=62/64 push=118 (11s) ★
+ E 62/100: 100%|██████████| 195/195 [00:10<00:00, 18.64b/s, acc=98%, cos=0.637, loss=0.3010, nce=1.00, ordered=1, push=120]
+ E 62: train=97.7% val=89.2% loss=0.3014 nce=1.00 cos=0.628 cv=0.1527(✗) anch=60/64 push=120 (10s)
+ E 63/100: 100%|██████████| 195/195 [00:10<00:00, 18.49b/s, acc=98%, cos=0.625, loss=0.3013, nce=1.00, ordered=1, push=122]
+ E 63: train=97.8% val=88.9% loss=0.3012 nce=1.00 cos=0.628 cv=0.1679(✗) anch=62/64 push=122 (11s)
+ E 64/100: 100%|██████████| 195/195 [00:10<00:00, 18.46b/s, acc=98%, cos=0.625, loss=0.3013, nce=1.00, ordered=1, push=124]
+ E 64: train=97.7% val=88.9% loss=0.3014 nce=1.00 cos=0.627 cv=0.1576(✗) anch=62/64 push=124 (11s)
+ E 65/100: 100%|██████████| 195/195 [00:10<00:00, 18.58b/s, acc=98%, cos=0.633, loss=0.2960, nce=1.00, ordered=1, push=126]
+ E 65: train=97.9% val=88.9% loss=0.2964 nce=1.00 cos=0.627 cv=0.1604(✗) anch=61/64 push=126 (10s)
+ E 66/100: 100%|██████████| 195/195 [00:10<00:00, 18.74b/s, acc=98%, cos=0.633, loss=0.2937, nce=1.00, ordered=1, push=128]
+ E 66: train=98.0% val=88.9% loss=0.2941 nce=1.00 cos=0.625 cv=0.1633(✗) anch=61/64 push=128 (10s)
+ E 67/100: 100%|██████████| 195/195 [00:10<00:00, 18.64b/s, acc=98%, cos=0.625, loss=0.2903, nce=1.00, ordered=1, push=130]
+ E 67: train=98.1% val=89.2% loss=0.2900 nce=1.00 cos=0.627 cv=0.1642(✗) anch=62/64 push=130 (10s)
+ E 68/100: 100%|██████████| 195/195 [00:10<00:00, 18.57b/s, acc=98%, cos=0.621, loss=0.2857, nce=1.00, ordered=1, push=132]
+ E 68: train=98.2% val=89.0% loss=0.2857 nce=1.00 cos=0.626 cv=0.1784(✗) anch=64/64 push=132 (11s)
+ E 69/100: 100%|██████████| 195/195 [00:11<00:00, 17.71b/s, acc=98%, cos=0.625, loss=0.2835, nce=1.00, ordered=1, push=134]
+ E 69: train=98.3% val=89.6% loss=0.2836 nce=1.00 cos=0.626 cv=0.1552(✗) anch=62/64 push=134 (11s) ★
+ E 70/100: 100%|██████████| 195/195 [00:10<00:00, 18.66b/s, acc=98%, cos=0.641, loss=0.2845, nce=1.00, ordered=1, push=136]
+ E 70: train=98.3% val=88.9% loss=0.2846 nce=1.00 cos=0.626 cv=0.1506(✗) anch=64/64 push=136 (10s)
+ E 71/100: 100%|██████████| 195/195 [00:10<00:00, 18.67b/s, acc=98%, cos=0.613, loss=0.2826, nce=1.00, ordered=1, push=138]
+ E 71: train=98.4% val=88.9% loss=0.2824 nce=1.00 cos=0.625 cv=0.1662(✗) anch=63/64 push=138 (10s)
+ E 72/100: 100%|██████████| 195/195 [00:10<00:00, 18.76b/s, acc=99%, cos=0.629, loss=0.2785, nce=1.00, ordered=1, push=140]
+ E 72: train=98.6% val=89.2% loss=0.2783 nce=1.00 cos=0.625 cv=0.1694(✗) anch=60/64 push=140 (10s)
+ E 73/100: 100%|██████████| 195/195 [00:10<00:00, 18.52b/s, acc=98%, cos=0.621, loss=0.2790, nce=1.00, ordered=1, push=142]
+ E 73: train=98.5% val=89.2% loss=0.2790 nce=1.00 cos=0.624 cv=0.1657(✗) anch=61/64 push=142 (11s)
+ E 74/100: 100%|██████████| 195/195 [00:10<00:00, 18.68b/s, acc=99%, cos=0.613, loss=0.2763, nce=1.00, ordered=1, push=144]
+ E 74: train=98.6% val=89.1% loss=0.2763 nce=1.00 cos=0.624 cv=0.1499(✗) anch=62/64 push=144 (10s)
+ E 75/100: 100%|██████████| 195/195 [00:10<00:00, 18.74b/s, acc=99%, cos=0.617, loss=0.2734, nce=1.00, ordered=1, push=146]
+ E 75: train=98.7% val=89.2% loss=0.2735 nce=1.00 cos=0.623 cv=0.1513(✗) anch=59/64 push=146 (10s)
+ E 76/100: 100%|██████████| 195/195 [00:10<00:00, 18.74b/s, acc=99%, cos=0.621, loss=0.2726, nce=1.00, ordered=1, push=148]
+ E 76: train=98.7% val=89.4% loss=0.2727 nce=1.00 cos=0.624 cv=0.1747(✗) anch=61/64 push=148 (10s)
+ E 77/100: 100%|██████████| 195/195 [00:10<00:00, 18.66b/s, acc=99%, cos=0.629, loss=0.2716, nce=1.00, ordered=1, push=150]
+ E 77: train=98.8% val=89.5% loss=0.2715 nce=1.00 cos=0.623 cv=0.1729(✗) anch=61/64 push=150 (10s)
+ E 78/100: 100%|██████████| 195/195 [00:10<00:00, 18.60b/s, acc=99%, cos=0.621, loss=0.2705, nce=1.00, ordered=1, push=152]
+ E 78: train=98.9% val=89.5% loss=0.2706 nce=1.00 cos=0.622 cv=0.1662(✗) anch=63/64 push=152 (10s)
+ E 79/100: 100%|██████████| 195/195 [00:10<00:00, 18.52b/s, acc=99%, cos=0.617, loss=0.2695, nce=1.00, ordered=1, push=154]
+ E 79: train=98.8% val=89.6% loss=0.2697 nce=1.00 cos=0.620 cv=0.1543(✗) anch=63/64 push=154 (11s) ★
+ E 80/100: 100%|██████████| 195/195 [00:10<00:00, 18.51b/s, acc=99%, cos=0.633, loss=0.2671, nce=1.00, ordered=1, push=155]
+ E 80: train=99.0% val=89.6% loss=0.2670 nce=1.00 cos=0.621 cv=0.1497(✗) anch=64/64 push=156 (11s)
+ E 81/100: 100%|██████████| 195/195 [00:10<00:00, 18.49b/s, acc=99%, cos=0.621, loss=0.2659, nce=1.00, ordered=1, push=157]
+ E 81: train=99.0% val=89.7% loss=0.2660 nce=1.00 cos=0.620 cv=0.1483(✗) anch=64/64 push=157 (11s) ★
+ E 82/100: 100%|██████████| 195/195 [00:10<00:00, 18.60b/s, acc=99%, cos=0.617, loss=0.2648, nce=1.00, ordered=1, push=159]
+ E 82: train=99.1% val=89.5% loss=0.2651 nce=1.00 cos=0.619 cv=0.1481(✗) anch=63/64 push=159 (10s)
+ E 83/100: 100%|██████████| 195/195 [00:10<00:00, 18.57b/s, acc=99%, cos=0.617, loss=0.2662, nce=1.00, ordered=1, push=161]
+ E 83: train=99.0% val=89.7% loss=0.2663 nce=1.00 cos=0.619 cv=0.1515(✗) anch=63/64 push=161 (11s)
+ E 84/100: 100%|██████████| 195/195 [00:10<00:00, 18.70b/s, acc=99%, cos=0.625, loss=0.2635, nce=1.00, ordered=1, push=163]
+ E 84: train=99.2% val=89.8% loss=0.2636 nce=1.00 cos=0.618 cv=0.1581(✗) anch=64/64 push=163 (10s) ★
+ E 85/100: 100%|██████████| 195/195 [00:10<00:00, 18.60b/s, acc=99%, cos=0.621, loss=0.2636, nce=1.00, ordered=1, push=165]
+ E 85: train=99.2% val=89.5% loss=0.2635 nce=1.00 cos=0.617 cv=0.1610(✗) anch=63/64 push=165 (10s)
+ E 86/100: 100%|██████████| 195/195 [00:10<00:00, 18.70b/s, acc=99%, cos=0.613, loss=0.2635, nce=1.00, ordered=1, push=167]
+ E 86: train=99.2% val=89.9% loss=0.2635 nce=1.00 cos=0.616 cv=0.1661(✗) anch=64/64 push=167 (10s) ★
+ E 87/100: 100%|██████████| 195/195 [00:10<00:00, 18.27b/s, acc=99%, cos=0.613, loss=0.2626, nce=1.00, ordered=1, push=169]
+ E 87: train=99.2% val=89.7% loss=0.2628 nce=1.00 cos=0.615 cv=0.1491(✗) anch=64/64 push=169 (11s)
+ E 88/100: 100%|██████████| 195/195 [00:10<00:00, 18.61b/s, acc=99%, cos=0.617, loss=0.2625, nce=1.00, ordered=1, push=171]
+ E 88: train=99.2% val=89.8% loss=0.2626 nce=1.00 cos=0.614 cv=0.1519(✗) anch=64/64 push=171 (10s)
+ E 89/100: 100%|██████████| 195/195 [00:11<00:00, 17.43b/s, acc=99%, cos=0.605, loss=0.2627, nce=1.00, ordered=1, push=173]
+ E 89: train=99.2% val=89.8% loss=0.2629 nce=1.00 cos=0.614 cv=0.1712(✗) anch=63/64 push=173 (11s)
+ E 90/100: 100%|██████████| 195/195 [00:10<00:00, 18.31b/s, acc=99%, cos=0.617, loss=0.2620, nce=1.00, ordered=1, push=175]
+ E 90: train=99.3% val=89.6% loss=0.2620 nce=1.00 cos=0.613 cv=0.1522(✗) anch=63/64 push=175 (11s)
+ E 91/100: 100%|██████████| 195/195 [00:10<00:00, 18.65b/s, acc=99%, cos=0.621, loss=0.2614, nce=1.00, ordered=1, push=177]
+ E 91: train=99.3% val=89.9% loss=0.2613 nce=1.00 cos=0.613 cv=0.1674(✗) anch=64/64 push=177 (10s) ★
+ E 92/100: 100%|██████████| 195/195 [00:10<00:00, 18.55b/s, acc=99%, cos=0.609, loss=0.2616, nce=1.00, ordered=1, push=179]
+ E 92: train=99.3% val=90.0% loss=0.2616 nce=1.00 cos=0.611 cv=0.1468(✗) anch=64/64 push=179 (11s) ★
+ E 93/100: 100%|██████████| 195/195 [00:10<00:00, 18.75b/s, acc=99%, cos=0.609, loss=0.2612, nce=1.00, ordered=1, push=181]
+ E 93: train=99.4% val=89.9% loss=0.2611 nce=1.00 cos=0.611 cv=0.1644(✗) anch=64/64 push=181 (10s)
+ E 94/100: 100%|██████████| 195/195 [00:10<00:00, 18.48b/s, acc=99%, cos=0.613, loss=0.2626, nce=1.00, ordered=1, push=183]
+ E 94: train=99.3% val=89.8% loss=0.2624 nce=1.00 cos=0.610 cv=0.1587(✗) anch=64/64 push=183 (11s)
+ E 95/100: 100%|██████████| 195/195 [00:10<00:00, 18.75b/s, acc=99%, cos=0.617, loss=0.2614, nce=1.00, ordered=1, push=185]
+ E 95: train=99.4% val=89.7% loss=0.2612 nce=1.00 cos=0.609 cv=0.1335(✗) anch=64/64 push=185 (10s)
+ E 96/100: 100%|██████████| 195/195 [00:10<00:00, 18.59b/s, acc=99%, cos=0.594, loss=0.2617, nce=1.00, ordered=1, push=187]
+ E 96: train=99.4% val=89.8% loss=0.2616 nce=1.00 cos=0.609 cv=0.1648(✗) anch=64/64 push=187 (10s)
+ E 97/100: 100%|██████████| 195/195 [00:10<00:00, 18.54b/s, acc=99%, cos=0.605, loss=0.2628, nce=1.00, ordered=1, push=189]
+ E 97: train=99.4% val=89.9% loss=0.2627 nce=1.00 cos=0.608 cv=0.1569(✗) anch=64/64 push=189 (11s)
+ E 98/100: 100%|██████████| 195/195 [00:10<00:00, 18.79b/s, acc=99%, cos=0.605, loss=0.2622, nce=1.00, ordered=1, push=191]
+ E 98: train=99.4% val=89.8% loss=0.2620 nce=1.00 cos=0.608 cv=0.1534(✗) anch=64/64 push=191 (10s)
+ E 99/100: 100%|██████████| 195/195 [00:10<00:00, 18.61b/s, acc=99%, cos=0.605, loss=0.2624, nce=1.00, ordered=1, push=193]
+ E 99: train=99.4% val=89.8% loss=0.2626 nce=1.00 cos=0.607 cv=0.1433(✗) anch=64/64 push=193 (10s)
+ E100/100: 100%|██████████| 195/195 [00:10<00:00, 18.53b/s, acc=99%, cos=0.602, loss=0.2639, nce=1.00, ordered=1, push=194]
+ E100: train=99.3% val=89.9% loss=0.2641 nce=1.00 cos=0.607 cv=0.1606(✗) anch=64/64 push=195 (11s)
+
+ Best val accuracy: 90.0%
+ Parameters: 1,706,058
+
+ ============================================================
+ DONE
+ ============================================================
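
The per-epoch summary lines in this log follow a regular `key=value` layout, so they can be turned into records for plotting or sanity checks. The following is a minimal Python sketch, assuming only the layout visible in this file. The names `SUMMARY`, `parse_epochs`, `best_epoch`, and `batches_per_epoch` are hypothetical helpers for illustration and are not part of the trainer. The batch-count helper assumes the trailing partial batch is dropped, which is an inference from the `195/195` progress bars rather than something the log states.

```python
import re

# Layout of a per-epoch summary line, inferred from this log only (an assumption).
SUMMARY = re.compile(
    r"E\s*(?P<epoch>\d+):\s+train=(?P<train>[\d.]+)%\s+val=(?P<val>[\d.]+)%\s+"
    r"loss=(?P<loss>[\d.]+)\s+nce=(?P<nce>[\d.]+)\s+cos=(?P<cos>[\d.]+)\s+"
    r"cv=(?P<cv>[\d.]+)\([^)]*\)\s+anch=(?P<anch>\d+)/\d+\s+push=(?P<push>\d+)"
)

INT_FIELDS = {"epoch", "anch", "push"}


def parse_epochs(log_text):
    """Return one dict per summary line; progress-bar lines do not match and are skipped."""
    rows = []
    for line in log_text.splitlines():
        match = SUMMARY.search(line)
        if match is None:
            continue
        rows.append({
            key: int(value) if key in INT_FIELDS else float(value)
            for key, value in match.groupdict().items()
        })
    return rows


def best_epoch(rows):
    """Earliest row with the highest validation accuracy (max keeps the first tie)."""
    return max(rows, key=lambda row: row["val"])


def batches_per_epoch(n_train, batch_size):
    """Count full batches only; assumes the trailing partial batch is dropped."""
    return n_train // batch_size


# Three summary lines and one progress-bar line copied from the log above.
sample = """\
+ E 91: train=99.3% val=89.9% loss=0.2613 nce=1.00 cos=0.613 cv=0.1674(✗) anch=64/64 push=177 (10s) ★
+ E 92/100: 100%|██████████| 195/195 [00:10<00:00, 18.55b/s, acc=99%, cos=0.609, loss=0.2616, nce=1.00, ordered=1, push=179]
+ E 92: train=99.3% val=90.0% loss=0.2616 nce=1.00 cos=0.611 cv=0.1468(✗) anch=64/64 push=179 (11s) ★
+ E 93: train=99.4% val=89.9% loss=0.2611 nce=1.00 cos=0.611 cv=0.1644(✗) anch=64/64 push=181 (10s)
"""

rows = parse_epochs(sample)
best = best_epoch(rows)
print(best["epoch"], best["val"])

steps = batches_per_epoch(50_000, 256)
print(steps, steps * 100 // 100)  # batches per epoch, pushes over 100 epochs at one per 100 batches
```

On the sample, the best row is epoch 92 at 90.0%, which agrees with the "Best val accuracy" line. Under the drop-last assumption, 50,000 images at batch size 256 give 195 batches per epoch, and one push per 100 batches over 100 epochs gives 195 pushes, which is consistent with the `195/195` progress bars and the final `push=195`.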