1
00:00:00,570 --> 00:00:07,440
Hi, and welcome to chapter 7.1, where we introduce the concept of convolutional neural nets, basically

2
00:00:07,440 --> 00:00:14,590
CNNs. I'll be referring to them as CNNs throughout this course. So why are they needed?

3
00:00:14,620 --> 00:00:20,170
We spent a while discussing neural nets previously, and you may be wondering why we would have spent so much

4
00:00:20,170 --> 00:00:26,260
time on neural nets if we're just going to suddenly discard them and go into CNNs. Well, understanding

5
00:00:26,260 --> 00:00:32,140
neural nets is critical to understanding CNNs, as they basically form the foundation for all deep

6
00:00:32,140 --> 00:00:37,990
learning types of networks. All of them basically require you to understand stochastic gradient descent,

7
00:00:38,710 --> 00:00:43,880
backpropagation, the training process, batches, iterations, all of those things.

8
00:00:43,930 --> 00:00:50,250
Basically, a CNN is just a different form of neural net, and you'll find out why and how.

9
00:00:51,010 --> 00:00:57,400
So why CNNs? Mainly because neural networks don't scale well to image data.

10
00:00:59,550 --> 00:01:05,010
Remember, in the intro slides we discussed how images are stored, which was basically this.

11
00:01:05,010 --> 00:01:08,000
You have a grid. Let's say this is a 10 by 10 grid here.

12
00:01:08,280 --> 00:01:13,170
So we have 100 inputs here, technically, and each input has a different color.

13
00:01:13,200 --> 00:01:18,570
So that's a lot of data, and there's a grayscale version of this as well.

14
00:01:18,790 --> 00:01:22,220
Our job would just be this.

15
00:01:22,280 --> 00:01:25,730
So, as I said, neural nets don't scale well to image data.

16
00:01:25,740 --> 00:01:26,920
And why is that?

17
00:01:27,070 --> 00:01:32,660
Let's consider a small image, 64 by 64 pixels.

18
00:01:32,670 --> 00:01:35,620
So how many inputs is that? 64

19
00:01:35,630 --> 00:01:44,230
by 64 times 3. That's over twelve thousand inputs already, for what is a tiny image here.

20
00:01:44,410 --> 00:01:50,120
So each neuron in the first layer will have at least twelve thousand weights, and that is large.

21
00:01:50,380 --> 00:01:52,370
And basically, if we go even higher,

22
00:01:52,370 --> 00:01:58,920
we have ninety-some thousand weights in the hidden layer, and that's not even considering

23
00:01:59,020 --> 00:02:02,330
how many more parameters we have in the hidden layers as well.
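
A quick back-of-the-envelope check of those numbers (a hypothetical sketch; the hidden-layer width is an assumption, not a figure from the lecture):

```python
# Hypothetical sketch: weight count of a fully connected layer on a
# small colour image. The hidden-layer width (8) is an assumed example.
height, width, channels = 64, 64, 3
inputs = height * width * channels      # 12,288 inputs for a "tiny" image
hidden_neurons = 8                      # assumed small hidden layer
weights = inputs * hidden_neurons       # 98,304 weights, before biases
print(inputs, weights)
```

Even a handful of fully connected neurons already pushes the weight count into the tens of thousands.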

24
00:02:02,380 --> 00:02:08,150
So we need to find a better approach. Basically, a CNN doesn't actually reduce the weights at the beginning and the end.

25
00:02:08,160 --> 00:02:14,710
But what it does do is it finds a representation internally in the hidden layers to basically take



26
00:02:14,710 --> 00:02:22,660
advantage of how images are formed, so that we can actually make a neural net much more effective on

27
00:02:22,690 --> 00:02:23,390
image data.

28
00:02:23,700 --> 00:02:25,380
Let's see how this works.

29
00:02:25,780 --> 00:02:31,740
So, introducing the CNN: this is effectively what it is here.

30
00:02:32,200 --> 00:02:35,570
I'm going to go through each of these in detail in later slides.

31
00:02:35,590 --> 00:02:39,790
But for now, conceptually, this is what it is.

32
00:02:39,820 --> 00:02:42,790
Remember, in neural nets we had an input layer and hidden layers.

33
00:02:43,030 --> 00:02:49,330
Well, these are the hidden layers in a convolutional net. What happens first is that we have

34
00:02:49,330 --> 00:02:54,630
what is called a convolution, where a ReLU activation unit is applied to the convolution here.

35
00:02:55,210 --> 00:03:00,280
And this convolution here slides across the image, producing values here.
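
The sliding operation described here can be sketched as a plain NumPy loop (a minimal illustration assuming stride 1 and no padding; not code from the lecture):

```python
# Hypothetical sketch of a 2-D convolution "slide": a small filter moves
# across the image, and each position produces one output value.
import numpy as np

def convolve2d(image, kernel):
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1        # output height (valid padding)
    ow = image.shape[1] - kw + 1        # output width
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Elementwise multiply the window by the filter and sum.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16.0).reshape(4, 4)
kernel = np.ones((3, 3)) / 9.0          # a simple averaging filter
print(convolve2d(image, kernel))        # a 2x2 map of local averages
```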



36
00:03:00,490 --> 00:03:02,680
Don't worry if you don't understand this just yet.

37
00:03:02,680 --> 00:03:06,990
I'm going to go into detail on each one later on.

38
00:03:07,270 --> 00:03:10,150
So feel free to just look at this.

39
00:03:10,150 --> 00:03:13,050
Get familiar with the terms and how the diagram looks.

40
00:03:13,210 --> 00:03:15,420
But you don't actually have to know what these are yet.

41
00:03:15,460 --> 00:03:18,760
So this is what convolutional neural nets look like.

42
00:03:19,850 --> 00:03:22,170
So what are these volumes?

43
00:03:22,250 --> 00:03:25,000
I didn't mention volumes here before.

44
00:03:25,310 --> 00:03:30,710
But you can see here that there are depths of layers, stacked going this way.

45
00:03:30,980 --> 00:03:34,720
Whereas a neural net is represented in a flat diagram here.

46
00:03:35,090 --> 00:03:44,240
So volumes allow us to use convolutions that learn image features, and by learning image features you'll get

47
00:03:44,260 --> 00:03:46,810
an idea of what image features are very soon.

48
00:03:48,230 --> 00:03:53,450
And therefore this allows us to build networks that are very deep, allowing for significantly faster

49
00:03:53,450 --> 00:03:55,490
training and a lot fewer parameters too.
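
To see where the parameter savings come from, compare weight counts on the same image (hypothetical layer sizes; a sketch, not the lecture's numbers):

```python
# Hypothetical comparison of parameter counts on a 64x64x3 image:
# a fully connected layer wires every pixel to every neuron, while a
# convolutional filter is small and reused across the whole image.
h, w, d = 64, 64, 3
neurons = 8                           # assumed dense-layer width
dense_weights = h * w * d * neurons   # 98,304 weights
filters, k = 8, 3                     # assumed: eight 3x3 filters
conv_weights = filters * k * k * d    # 216 weights, shared spatially
print(dense_weights, conv_weights)
```

The convolutional layer's weights are shared across every position of the image, which is why the count drops by orders of magnitude.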



50
00:03:57,440 --> 00:04:03,890
So this is an arrangement of how CNNs use 3-D volumes, and when I say 3-D volume I'm referring

51
00:04:03,890 --> 00:04:11,360
to the input being in three dimensions here, because we have height, width, and depth here.

52
00:04:11,430 --> 00:04:12,820
This is for a color image here.

53
00:04:13,020 --> 00:04:20,050
So the input, going back to it here, is effectively a three-dimensional input that is fed in here.

54
00:04:20,390 --> 00:04:22,510
And these are all in different dimensions as well.

55
00:04:28,350 --> 00:04:32,590
So this is, effectively, a summary for a CNN.

56
00:04:32,760 --> 00:04:40,140
We have the input layer, a convolutional layer, a ReLU layer, a pooling layer, and a fully connected layer

57
00:04:40,320 --> 00:04:40,760
here.

58
00:04:41,130 --> 00:04:44,560
That's it, basically. They can get more complex later on,

59
00:04:44,790 --> 00:04:48,420
but these are the core layers of a CNN.
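
Those core layers can be sketched as a shape calculation through the stack (hypothetical filter and pooling sizes; only the 64x64x3 input size echoes the lecture):

```python
# Hypothetical sketch of the core CNN layer stack, tracking how each
# layer transforms the input volume's (height, width, depth) shape.
def conv_shape(h, w, d, filters, k, stride=1, pad=0):
    # Convolution: spatial size shrinks by the kernel, depth becomes #filters.
    return ((h - k + 2 * pad) // stride + 1,
            (w - k + 2 * pad) // stride + 1,
            filters)

def pool_shape(h, w, d, k=2, stride=2):
    # Pooling downsamples height and width; depth is unchanged.
    return ((h - k) // stride + 1, (w - k) // stride + 1, d)

shape = (64, 64, 3)                         # input layer: a colour image volume
shape = conv_shape(*shape, filters=8, k=3)  # conv layer (ReLU leaves shape alone)
shape = pool_shape(*shape)                  # pooling layer
flat = shape[0] * shape[1] * shape[2]       # flatten for the fully connected layer
print(shape, flat)
```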



60
00:04:48,450 --> 00:04:50,350
So why is it called a CNN?

61
00:04:50,430 --> 00:04:51,130
Well, it's from

62
00:04:51,310 --> 00:04:53,750
the convolutional layer here.

63
00:04:53,940 --> 00:04:59,100
That's this one here, and also this one here, which comes in sequences as well.

64
00:04:59,310 --> 00:05:04,750
This is how we adapt a CNN: by having multiple stacked convolutional layers.

65
00:05:05,070 --> 00:05:08,960
Now, convolution is what allows us to actually learn image features.

66
00:05:09,300 --> 00:05:12,540
And I will explain to you again what image features are.

67
00:05:13,160 --> 00:05:19,590
But basically, it's what a classifier uses to sort of detect what's in an image. But what exactly is a

68
00:05:19,590 --> 00:05:20,450
convolution?

69
00:05:20,730 --> 00:05:23,710
Well, let's find out in chapter 7.2.