1
00:00:00,960 --> 00:00:03,370
So welcome back to chapter eight point one.
2
00:00:03,390 --> 00:00:06,190
We introduce you to Keras.
3
00:00:08,000 --> 00:00:09,660
So what exactly is Keras?
4
00:00:09,660 --> 00:00:15,060
I've mentioned it countless times on this course but I haven't actually told you precisely what it is.
5
00:00:15,200 --> 00:00:21,170
So Keras is a high-level neural network API for Python, and it makes constructing neural networks
6
00:00:21,260 --> 00:00:25,370
of all types, not just CNNs but all types of neural networks.
7
00:00:25,490 --> 00:00:31,760
It makes it extremely easy and extremely modular to add layers, swap stuff in and out, change different things,
8
00:00:32,480 --> 00:00:36,770
configure your loss functions, configure your different types of activation functions.
9
00:00:36,770 --> 00:00:38,720
It's quite nice.
10
00:00:38,870 --> 00:00:45,410
It has the ability to use TensorFlow, CNTK, which is used for natural language processing, and Theano, which
11
00:00:45,440 --> 00:00:47,180
I used quite a bit back in the day.
12
00:00:47,360 --> 00:00:51,690
However, I've now moved to TensorFlow and I haven't looked back, not because Theano is bad.
13
00:00:51,930 --> 00:00:59,000
It has basically just stopped being updated; the project has pretty much been wound down, and
14
00:00:59,000 --> 00:01:02,710
everyone is pretty much adopting TensorFlow now.
15
00:01:03,680 --> 00:01:09,860
Keras was developed by François Chollet, and it has been a tremendous success in making deep learning much
16
00:01:09,860 --> 00:01:11,390
more accessible to the masses.
17
00:01:12,680 --> 00:01:16,820
So what is TensorFlow? As I said, Keras uses TensorFlow as its backend.
18
00:01:16,830 --> 00:01:18,400
But what exactly is a backend?
19
00:01:18,530 --> 00:01:19,370
OK.
20
00:01:19,620 --> 00:01:27,000
So TensorFlow is an open-source library that was created by the Google Brain team
21
00:01:27,030 --> 00:01:33,530
in 2015; it was probably being used internally inside Google for many years prior to 2015.
22
00:01:34,020 --> 00:01:39,750
And basically it's an extremely powerful, extremely efficient and fast deep learning framework that is used
23
00:01:39,750 --> 00:01:46,080
for high-performance numerical computation across a variety of platforms, such as CPUs, GPUs and TPUs.
24
00:01:46,580 --> 00:01:52,320
Basically, these engineers, these guys at Google Brain, developed a super-fast library
25
00:01:52,650 --> 00:01:57,160
similar to NumPy, but built it around deep learning.
26
00:01:57,180 --> 00:02:05,190
So you have all these deep learning functions that are a part of the TensorFlow framework. TensorFlow actually
27
00:02:05,190 --> 00:02:11,580
has a Python API, and it pretty much is accessible and easy to use, but Keras is much easier
28
00:02:11,580 --> 00:02:12,200
to use.
29
00:02:13,500 --> 00:02:20,390
So why use Keras instead of pure TensorFlow? As I said, Keras is extremely easy to use, as it follows basically
30
00:02:20,390 --> 00:02:24,640
a Pythonic style of coding, and it is extremely modular.
31
00:02:24,860 --> 00:02:30,350
You don't even have to be a proficient programmer to use Keras. This modularity
32
00:02:30,350 --> 00:02:36,010
means that we can actually just start doing different things in our neural nets or CNNs: we can
33
00:02:36,130 --> 00:02:42,670
easily change loss functions, optimizers, initialization schemes, activation functions, try different regularization
34
00:02:43,030 --> 00:02:48,580
schemes, introduce more layers, or reduce the number of filters.
35
00:02:48,580 --> 00:02:53,080
All of those things are super easy inside of Keras.
36
00:02:53,080 --> 00:02:56,710
So it allows us to build these powerful neural nets quickly and efficiently.
37
00:02:57,130 --> 00:03:01,350
And obviously it works in Python, and Python is one of my favorites.
38
00:03:01,360 --> 00:03:06,370
Actually, it is my favorite programming language ever, because it makes so many complicated things
39
00:03:06,370 --> 00:03:13,010
super easy. TensorFlow is definitely not as user-friendly as Keras.
40
00:03:13,090 --> 00:03:18,510
I've used it a little bit, to be fair, but I was using Theano at that point, and Theano actually was quite
41
00:03:18,510 --> 00:03:18,980
hard.
42
00:03:19,170 --> 00:03:21,620
So TensorFlow seemed easy to me at that point.
43
00:03:21,690 --> 00:03:24,870
However, nothing beats Keras for ease of use.
44
00:03:24,990 --> 00:03:29,880
So unless you're doing complicated models or academic research, or you're basically working at some sort of
45
00:03:29,880 --> 00:03:35,420
high-performance startup, you don't necessarily need to use pure TensorFlow for anything.
46
00:03:35,430 --> 00:03:38,630
So now you're ready to see some actual Keras code.
47
00:03:38,640 --> 00:03:40,300
Well, it's actually pretty simple.
48
00:03:40,620 --> 00:03:46,980
We're using the Keras library, and this is how we actually construct and build models inside of Python.
49
00:03:47,580 --> 00:03:54,210
So firstly we import the Sequential model from Keras. Basically, the Sequential model is the main type
50
00:03:54,210 --> 00:04:00,000
of model you'll be building in Keras; basically all the CNNs and neural nets I've shown you before
51
00:04:00,060 --> 00:04:01,730
are sequential models.
52
00:04:01,980 --> 00:04:05,340
If you're doing something exotic, then it will probably not be sequential.
53
00:04:05,340 --> 00:04:09,480
That is rare, and probably beyond the scope of this course.
54
00:04:09,480 --> 00:04:12,110
Right now it's basically academic research.
55
00:04:12,480 --> 00:04:13,130
So let's move on.
56
00:04:13,140 --> 00:04:14,750
So we defined our model here.
57
00:04:14,760 --> 00:04:23,060
We initialize it by running this line, Sequential with these two brackets, and now let's add some convolutional
58
00:04:23,060 --> 00:04:24,500
layers to it.
59
00:04:24,500 --> 00:04:28,150
So before we do that, we have to import these layers from keras.layers.
60
00:04:28,370 --> 00:04:34,380
So from keras.layers we import Dense, Dropout, Flatten, Conv2D and MaxPooling2D.
61
00:04:34,400 --> 00:04:37,710
Now, I don't use Dropout here, but it is used in the model
62
00:04:37,730 --> 00:04:39,220
we're actually going to code.
63
00:04:39,290 --> 00:04:40,110
So I left the import in,
64
00:04:40,170 --> 00:04:42,930
so you guys know how it's imported here.
65
00:04:42,940 --> 00:04:43,610
OK.
66
00:04:44,090 --> 00:04:48,370
So, following the model definition, we are adding our first layer here.
67
00:04:48,410 --> 00:04:50,550
So firstly, it's a Conv2D.
68
00:04:50,690 --> 00:04:51,560
That's what we call it.
69
00:04:51,590 --> 00:04:56,030
And now we have open brackets here and we have some parameters to fill out.
70
00:04:56,030 --> 00:04:57,830
So let's go to these parameters.
71
00:04:57,840 --> 00:04:59,300
This first one is the number 32.
72
00:04:59,360 --> 00:05:05,890
Then the kernel size, which we can see here, then the activation type, and we use input_shape equals input_shape.
73
00:05:05,930 --> 00:05:08,360
That's a bit confusing; I'll explain it to you shortly.
74
00:05:08,600 --> 00:05:14,660
So let's go through each one. The first one here, the 32, is specifying the number of
75
00:05:14,660 --> 00:05:16,030
kernels or filters.
76
00:05:16,340 --> 00:05:24,710
So in our first layer we're using 32 filters of kernel size three by three, with a ReLU activation, and
77
00:05:25,500 --> 00:05:31,730
an input shape equal to input_shape. That's because previously, above this code, we created
78
00:05:32,370 --> 00:05:38,400
an input_shape variable, I should say, that was 28 by 28 by 1.
79
00:05:38,720 --> 00:05:40,240
So that's the input shape we use.
80
00:05:40,550 --> 00:05:45,890
I just left it out here for convenience, because we tend to leave it out and have it defined
81
00:05:45,980 --> 00:05:49,440
outside of the scope of this declaration here.
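For reference, a minimal sketch of what that off-screen declaration might look like (the variable name matches the slide; the exact surrounding preprocessing is assumed):

```python
# MNIST images are 28x28 pixels with a single greyscale channel.
input_shape = (28, 28, 1)  # (height, width, channels)
```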
82
00:05:49,910 --> 00:05:51,230
Now to add another layer.
83
00:05:51,590 --> 00:05:52,990
It's as simple as model.add.
84
00:05:53,080 --> 00:05:55,280
And Keras is so easy to use.
85
00:05:55,280 --> 00:06:00,290
Just keep using model.add, and the layers stack easily on top of each other, starting with the first at
86
00:06:00,290 --> 00:06:01,650
the bottom.
87
00:06:01,670 --> 00:06:05,450
So now we add another convolutional layer, 64 this time.
88
00:06:05,490 --> 00:06:08,610
Same kernel size, and note we don't need to specify kernel_size
89
00:06:08,620 --> 00:06:12,660
every time we do it; Keras knows that the second parameter is the kernel size.
90
00:06:12,690 --> 00:06:13,770
So it's three by three.
91
00:06:13,790 --> 00:06:15,760
And again the activation is ReLU.
92
00:06:16,280 --> 00:06:19,570
And now you've noticed we don't need to specify an input shape here.
93
00:06:19,910 --> 00:06:20,690
And do you know why?
94
00:06:20,780 --> 00:06:23,840
That's because this layer is directly connected to this layer.
95
00:06:23,930 --> 00:06:27,380
So it takes the output of this layer, and it knows its shape.
96
00:06:27,380 --> 00:06:31,290
So the output of this layer is effectively the input into this layer.
97
00:06:31,670 --> 00:06:34,150
So we no longer have to declare inputs anymore.
98
00:06:35,740 --> 00:06:37,710
So now we can add max pooling.
99
00:06:38,000 --> 00:06:40,820
As I said before, we are going to use max pooling of two by two.
100
00:06:41,140 --> 00:06:42,170
So that's simple here.
101
00:06:42,170 --> 00:06:49,340
We have model.add, then MaxPooling2D, and specify the pool size: open brackets, two by two, close brackets.
102
00:06:49,340 --> 00:06:53,720
Close brackets for this here, close brackets for this here, and then we do a Flatten.
103
00:06:53,830 --> 00:06:56,220
I haven't actually discussed Flatten yet.
104
00:06:56,220 --> 00:06:57,300
It's basically a function
105
00:06:57,300 --> 00:07:01,350
that we use to feed the dense, or fully connected, layer.
106
00:07:01,710 --> 00:07:05,540
You'll see it visually in the next diagram on the following slide.
107
00:07:05,740 --> 00:07:11,940
Basically we then add a dense layer here with 128 units, all activated by ReLU.
108
00:07:12,300 --> 00:07:16,760
And this is connected to another dense layer here, which outputs the number of classes.
109
00:07:16,830 --> 00:07:23,510
So the classes in this dataset here are 10, because we're using the MNIST dataset, 0 to 9, and we use
110
00:07:23,510 --> 00:07:30,360
a softmax activation to get, basically, the probabilities. I hope you think this is quite
111
00:07:30,360 --> 00:07:32,510
simple, because to me this is quite basic.
112
00:07:32,610 --> 00:07:37,190
It may take a while to get familiar with how these things work, but you will get used to it in this
113
00:07:37,200 --> 00:07:38,790
course, I guarantee it.
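Putting the whole walkthrough together, here is a minimal sketch of the model just described, assuming the standard Keras 2 Sequential API (num_classes is 10 for MNIST):

```python
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D

num_classes = 10            # MNIST digits 0-9
input_shape = (28, 28, 1)   # 28x28 greyscale images

model = Sequential()
# First conv layer: 32 filters, 3x3 kernels, ReLU; only this layer needs input_shape.
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=input_shape))
# Second conv layer: 64 filters; Keras infers its input from the previous layer.
model.add(Conv2D(64, (3, 3), activation='relu'))
# 2x2 max pooling halves the spatial dimensions.
model.add(MaxPooling2D(pool_size=(2, 2)))
# Flatten the 12x12x64 feature maps into a single 9,216-element row.
model.add(Flatten())
# Fully connected layer with 128 ReLU units.
model.add(Dense(128, activation='relu'))
# Output layer: one node per class, softmax for probabilities.
model.add(Dense(num_classes, activation='softmax'))
```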
114
00:07:38,790 --> 00:07:40,800
So let's take a look at what we've built here.
115
00:07:41,230 --> 00:07:41,540
OK.
116
00:07:41,580 --> 00:07:44,880
So this is actually what we have built so far.
117
00:07:44,880 --> 00:07:50,090
So we have an input image here, 28 by 28 by 1; 1 because it's greyscale.
118
00:07:50,190 --> 00:07:54,440
If it was a color image, an RGB image, it would be a depth of three here.
119
00:07:54,780 --> 00:08:02,250
So as you saw before, we have 32 filters here, connected to another conv layer with 64 filters here.
120
00:08:02,490 --> 00:08:08,850
And you may have noticed the size of the image shrank here: it became 26 by 26, and then
121
00:08:08,850 --> 00:08:10,490
24 by 24.
122
00:08:10,590 --> 00:08:12,770
And that's because we didn't use any zero padding.
123
00:08:12,840 --> 00:08:15,980
And I'll show you guys later on in the code how to do zero padding.
124
00:08:15,990 --> 00:08:16,910
It's quite simple.
125
00:08:17,100 --> 00:08:22,450
But for now, just remember: when you don't use zero padding, your input image or convolutional feature
126
00:08:22,450 --> 00:08:25,530
map size reduces from the input image size.
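As an aside (this is not on the slide), the padding argument of Conv2D in Keras controls this behaviour, so a minimal sketch of the difference looks like this:

```python
from keras.layers import Conv2D

# 'valid' (the default): no zero padding, so a 3x3 kernel
# shrinks a 28x28 input down to a 26x26 feature map.
conv_valid = Conv2D(32, (3, 3), activation='relu', padding='valid')

# 'same': zero-pads the borders so the output stays 28x28.
conv_same = Conv2D(32, (3, 3), activation='relu', padding='same')
```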
127
00:08:25,530 --> 00:08:28,950
So we have these two conv layers currently stacked here.
128
00:08:28,950 --> 00:08:33,000
Then we have our max pooling, which basically shrinks this by half.
129
00:08:33,180 --> 00:08:34,360
I have a dropout here.
130
00:08:34,500 --> 00:08:39,330
We didn't actually use dropout in the code before, but in the actual code when we start our project,
131
00:08:39,330 --> 00:08:44,720
I'll show you quickly how to actually implement dropout in one line, super easy.
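As a preview, a minimal sketch of that one line (the 0.25 drop rate is an illustrative value, not one taken from this lecture):

```python
from keras.layers import Dropout

# Randomly drops 25% of the activations during training to reduce overfitting.
model.add(Dropout(0.25))
```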
132
00:08:44,820 --> 00:08:48,260
What I wanted to show you here was that we have the Flatten layer here.
133
00:08:48,490 --> 00:08:52,890
To add a Flatten layer, if you go back to here, it's just Flatten with brackets.
134
00:08:52,950 --> 00:08:54,420
And what does it actually do?
135
00:08:54,450 --> 00:09:00,960
Flatten basically takes this three-dimensional matrix, 64 by 12 by 12, and basically turns it
136
00:09:00,960 --> 00:09:07,430
into a row of nine thousand two hundred and sixteen columns, since 64 x 12 x 12 = 9,216.
137
00:09:07,470 --> 00:09:11,770
So what it means here is that we just flattened this matrix.
138
00:09:11,820 --> 00:09:18,540
So instead of having 12 by 12 by 64, imagine you just built one entire long row, where it's the first
139
00:09:18,540 --> 00:09:24,410
12 here, then the second 12, and so on consecutively: basically a long row.
140
00:09:24,870 --> 00:09:28,810
And that becomes this output box here.
141
00:09:29,100 --> 00:09:35,180
And now this output box here is fed into the fully connected layer here with 128 nodes.
142
00:09:35,430 --> 00:09:36,440
As stated before,
143
00:09:36,690 --> 00:09:37,690
we defined it here.
144
00:09:37,710 --> 00:09:40,760
Each node was connected to ReLU activation units.
145
00:09:41,220 --> 00:09:48,810
And then finally we connect this to our final dense layer with a softmax activation function, which outputs
146
00:09:48,810 --> 00:09:54,210
to 10 nodes; 10 nodes because our MNIST dataset has 10 classes.
147
00:09:54,300 --> 00:09:56,290
So that's how we get our probabilities here.
148
00:09:56,700 --> 00:09:59,130
So it's not illustrated here but later on you will see it.
149
00:09:59,130 --> 00:10:03,140
So now that we've built our model we are ready to compile this model.
150
00:10:03,390 --> 00:10:08,640
So compiling simply creates an object that stores the model we've just created, and we can specify our
151
00:10:08,640 --> 00:10:13,350
loss algorithm, our optimizer, and define the performance metric that we want to look at.
152
00:10:13,440 --> 00:10:18,090
And additionally, we can specify parameters for the optimizer, such as learning rates and momentum.
153
00:10:18,090 --> 00:10:22,630
So this is a simple model.compile code here.
154
00:10:22,980 --> 00:10:29,610
We have categorical cross-entropy defined as our loss type, we have the optimizer SGD, being stochastic
155
00:10:29,610 --> 00:10:33,990
gradient descent, and we have the metric to look at being defined as accuracy.
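A minimal sketch of that compile call (the learning rate and momentum values here are illustrative, not from the slide):

```python
from keras.optimizers import SGD

model.compile(loss='categorical_crossentropy',
              optimizer=SGD(lr=0.01, momentum=0.9),  # illustrative hyperparameters
              metrics=['accuracy'])
```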
156
00:10:36,370 --> 00:10:39,000
So how do we fit our model now?
157
00:10:39,010 --> 00:10:44,770
So, basically following scikit-learn, which is the most established and popular machine learning
158
00:10:44,770 --> 00:10:48,540
library in Python prior to TensorFlow and Keras,
159
00:10:48,610 --> 00:10:52,350
we basically do a model.fit, where we feed it the training data
160
00:10:52,600 --> 00:10:56,130
and the training labels: x_train and y_train, that's what they are.
161
00:10:56,140 --> 00:11:02,360
You'll see them in the code soon. We specify a number of epochs and a batch size.
162
00:11:02,560 --> 00:11:05,720
The batch size doesn't impact learning that significantly.
163
00:11:05,860 --> 00:11:11,470
However, you basically should use the largest batch size possible that your memory allows.
164
00:11:11,500 --> 00:11:17,420
So you can experiment and try it; you'll know, because if you try too large a batch size, your kernel
165
00:11:17,440 --> 00:11:18,250
will crash.
166
00:11:18,250 --> 00:11:22,640
So generally, to avoid having my kernel crash,
167
00:11:22,750 --> 00:11:29,030
I always use basically a batch size of 32 for pretty much smaller images, or 16 for larger images.
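A minimal sketch of the fit call being described (x_train and y_train are the training images and labels mentioned above; the epoch count is an illustrative value):

```python
history = model.fit(x_train, y_train,
                    batch_size=32,  # 32 for smaller images, as suggested above
                    epochs=10)      # illustrative number of passes over the data
```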
168
00:11:31,450 --> 00:11:35,250
And once we do that, we can now evaluate and generate predictions afterward.
169
00:11:35,290 --> 00:11:41,830
So by running model.evaluate and feeding it the x_test data and y_test labels, with a batch size, we can
170
00:11:41,830 --> 00:11:47,140
get a metrics object, and then we can use that to actually look at
171
00:11:47,140 --> 00:11:53,580
different graphs and other interesting performance information from our model. And if we ever wanted to
172
00:11:53,590 --> 00:11:59,410
predict an individual data point, like we have an image and we want to get the actual class it belongs to,
173
00:12:00,040 --> 00:12:04,070
we can use model.predict. model.predict allows us to feed one image at a time.
174
00:12:04,130 --> 00:12:07,070
We can feed the entire dataset here as well.
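A minimal sketch of both calls (the batch size and the single-image slice are illustrative):

```python
# Evaluate on the test set; returns the loss plus the metrics chosen at compile time.
metrics = model.evaluate(x_test, y_test, batch_size=32)

# Predict class probabilities for a single image (sliced to keep it 4-D)
# or for the entire test set at once.
single_probs = model.predict(x_test[0:1])
all_probs = model.predict(x_test)
```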
175
00:12:07,840 --> 00:12:09,660
So that's it. Let's get started.
176
00:12:09,690 --> 00:12:13,030
So let's build our own handwritten digit classifier.
|