Hi, and welcome to chapter 7.1, where we introduce the concept of convolutional neural networks, basically CNNs; I'll be referring to them as CNNs throughout this course. So why are they needed?

We spent a while discussing neural nets previously, and you may be wondering why we would have spent so much time on neural nets if we're just going to suddenly discard them and move on to CNNs. Well, understanding neural nets is critical to understanding CNNs, as they form the foundation for all types of deep learning networks. All of them require you to understand stochastic gradient descent, backpropagation, the training process, batches, iterations, all of those things. Basically, a CNN is just a different form of neural net, and you'll find out why and how.

So why CNNs? Mainly because plain neural networks don't scale well to image data.

Remember, in the intro slides we discussed how images are stored, which was basically this: you have a grid, let's say this is a 10 by 10 grid here. So we have 100 inputs here, technically, and each input has a different color value. That's a lot of data, and this is a reduced-resolution version as well. Our job would just be to process this.

So, as I said, neural nets don't scale well to image data. And why is that? Let's consider a small image, 64 by 64 pixels. So how many inputs is that? 64
by 64 times 3: that's about twelve thousand inputs already, for what is a tiny image. So every neuron in the first hidden layer will have at least twelve thousand weights, and that is large. And if we go even higher in resolution, we have some ninety-nine thousand weights per neuron in the first hidden layer, and that's not even considering how many more parameters we have in the hidden layers as well.

So we need to find a better approach. Now, a CNN doesn't actually reduce the weights at the very beginning and the end, but what it does do is find a representation internally, in the hidden layers, that takes advantage of how images are formed, so that we can make a neural net much more effective on image data. Let's see how.

So, introducing the CNN: this is effectively what it is here. I'm going to go through each of these in detail in the next slides, but for now, conceptually, this is what it is. Remember, in neural nets we had the input layer and hidden layers. Well, these are the hidden layers in a convolutional net. What happens first is that we have what is called a convolution, where a ReLU activation unit is applied to the convolution output. And this convolution here slides across the image, producing values here. Don't worry if you don't understand this just yet; I'm going to go into detail on each one later on.
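To make the scaling problem above concrete, here is a quick back-of-the-envelope calculation in Python. The hidden-layer size of 1,000 neurons is my own illustrative choice, not a figure from the course:

```python
# Number of inputs for a small 64x64 RGB image:
# one value per pixel per color channel.
height, width, channels = 64, 64, 3
n_inputs = height * width * channels
print(n_inputs)  # 12288 inputs for a "tiny" image

# A fully connected hidden layer with, say, 1000 neurons
# needs one weight per input for every neuron.
hidden_neurons = 1000
n_weights = n_inputs * hidden_neurons
print(n_weights)  # 12288000 weights in the first layer alone

# Scale the image up to 256x256 and it grows much further.
n_inputs_big = 256 * 256 * 3
print(n_inputs_big * hidden_neurons)  # 196608000 weights
```

This is exactly the explosion the lecture is describing: the weight count grows with the product of image area, channels, and hidden-layer width.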
So just feel free to look at this, get familiar with the terms and with how the diagram looks, but you don't actually have to know what these are yet. So this is what convolutional neural nets look like.

So why the 3D layers? I didn't mention 3D layers here before, but you can see that there are depths of layers stacked going this way, whereas a neural net is represented in a flat diagram here. This 3D structure allows us to use convolutions that learn image features, and by learning image features (you'll see what these are on real images very soon),

this allows us to use far fewer weights per layer in a deep network, allowing for significantly faster training and a lot fewer parameters too.

So this is an arrangement of how these networks use 3D volumes. When I say 3D volume, I'm referring to the input being in three dimensions, because we have height, width, and depth; the depth here is for a color image. So the input, going back to it here, is effectively a three-dimensional input that is fed in here, and these layers are all in three dimensions as well.

So, effectively, here is the usual flow for a CNN: we have the input layer, a convolutional layer, a ReLU layer, pooling, and a fully connected layer. That's it, basically. They can get more complex later on,
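The "sliding" of a convolution across an image, described above, can be sketched in plain Python. The filter values below are arbitrary, chosen only to show the mechanics of the weighted sum at each position:

```python
def convolve2d(image, kernel):
    """Slide a kernel over a 2D image (no padding, stride 1),
    computing a weighted sum at each position."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1  # output shrinks by kernel size - 1
    out = []
    for y in range(oh):
        row = []
        for x in range(ow):
            s = 0
            for ky in range(kh):
                for kx in range(kw):
                    s += image[y + ky][x + kx] * kernel[ky][kx]
            row.append(s)
        out.append(row)
    return out

# A 5x5 "image" and a 3x3 filter produce a 3x3 output grid:
image = [[1, 0, 1, 0, 1],
         [0, 1, 0, 1, 0],
         [1, 0, 1, 0, 1],
         [0, 1, 0, 1, 0],
         [1, 0, 1, 0, 1]]
kernel = [[1, 0, 0],
          [0, 1, 0],
          [0, 0, 1]]  # arbitrary example filter
result = convolve2d(image, kernel)
print(len(result), len(result[0]))  # 3 3
```

Real CNN layers do the same thing over a 3D volume with many filters at once, but this single-channel version is the core idea: the same small set of weights is reused at every spatial position, which is why a convolutional layer needs far fewer parameters than a fully connected one.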
but these are the core layers of a CNN. So why is it called a CNN? Well, it's all about the convolutional layer: that's this one here, and also this one here, and they come in sequences as well. This is how we adapt a CNN, by having multiple stacked convolutional layers. Now, convolution is what allows us to actually learn image features, and I will explain to you again what image features are. Basically, they are what a classifier uses to detect what's in an image. But what exactly is a convolution? Well, let's find out in chapter 7.2.
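One detail worth having in hand before the next chapter, when layers get stacked like this, is how the spatial size changes from layer to layer. The standard formula is output = (input - filter + 2 * padding) / stride + 1; the specific layer sizes below are my own illustrative choices, not the architecture from the slides:

```python
def conv_output_size(in_size, filter_size, padding=0, stride=1):
    """Spatial output size of a convolutional (or pooling) layer."""
    return (in_size - filter_size + 2 * padding) // stride + 1

size = 64                                    # 64x64 input image
size = conv_output_size(size, 3)             # 3x3 conv -> 62
size = conv_output_size(size, 3)             # second 3x3 conv -> 60
size = conv_output_size(size, 2, stride=2)   # 2x2 max pool -> 30
print(size)  # 30
```

Each stacked convolution trims the edges slightly, and pooling halves the grid, which is how the network gradually condenses the image before the fully connected layer at the end.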