AI_DL_Assignment / 15. Transfer Learning Build a Flower & Monkey Breed Classifier /2. What is Transfer Learning and Fine Tuning.srt
1
00:00:01,080 --> 00:00:06,390
Welcome to Chapter 15.1, where I explain to you what exactly is transfer learning and
2
00:00:06,390 --> 00:00:07,490
fine tuning.
3
00:00:07,980 --> 00:00:13,620
So as we know from before, training complicated and deep CNNs is very slow.
4
00:00:13,620 --> 00:00:21,120
AlexNet and VGG in particular are deep, parameter-laden networks. VGG alone has about 138 million parameters,
5
00:00:21,600 --> 00:00:26,600
and ResNet50 has 50 layers, despite having fewer parameters.
6
00:00:26,620 --> 00:00:29,180
Still, that many layers take some time to train.
7
00:00:29,700 --> 00:00:35,580
So while these networks attain relatively excellent performance on ImageNet, training them on your own is
8
00:00:35,730 --> 00:00:37,450
definitely not recommended.
9
00:00:37,650 --> 00:00:39,950
You are not going to get anywhere close to good results.
10
00:00:39,960 --> 00:00:47,310
even training for a month on the CPU. These CNNs are often trained for a couple of weeks or more
11
00:00:47,370 --> 00:00:49,160
using arrays of GPUs.
12
00:00:49,350 --> 00:00:54,390
That's to tell you how complicated it is and how long it takes to get good results on ImageNet.
13
00:00:54,480 --> 00:01:00,930
So what if there was a way we could reuse those pre-trained models to make our own classifiers? As we've
14
00:01:00,930 --> 00:01:06,270
seen, Keras actually ships with models pre-trained on ImageNet, and those models are the models we showed
15
00:01:06,270 --> 00:01:07,780
you before in chapter 14.
16
00:01:08,070 --> 00:01:12,530
And their weights are already tuned to detect dozens of low, mid and high level features.
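A minimal sketch (not the course's own code) of loading one of those ImageNet pre-trained models that ship with Keras, assuming a TensorFlow 2.x environment; the input shape here is just an illustrative choice.

from tensorflow.keras.applications import VGG16

# weights="imagenet" loads the pre-trained ImageNet weights;
# include_top=False drops the original 1000-class classifier head,
# keeping only the convolutional feature-extractor layers.
base_model = VGG16(weights="imagenet",
                   include_top=False,
                   input_shape=(224, 224, 3))
base_model.summary()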
17
00:01:12,570 --> 00:01:17,450
What if we could use these already trained networks now to build our own classifiers?
18
00:01:17,480 --> 00:01:24,780
Well, we can. Introducing transfer learning and fine tuning: this solves the problem
19
00:01:24,810 --> 00:01:26,450
we just explained.
20
00:01:27,110 --> 00:01:29,900
So let's talk a bit about transfer learning and fine tuning now.
21
00:01:30,700 --> 00:01:36,920
So, fine tuning: the concept of fine tuning is often, and justifiably, confused with transfer learning.
22
00:01:37,140 --> 00:01:39,880
However, it is merely a type of transfer learning.
23
00:01:40,200 --> 00:01:43,140
Fine tuning is where we take a pre-trained deep CNN.
24
00:01:43,140 --> 00:01:46,730
So one of those like ResNet or VGG, as we've seen before.
25
00:01:47,070 --> 00:01:54,840
And we use this model, already trained typically on ImageNet, to do a new image classification task. Typically,
26
00:01:54,840 --> 00:02:00,450
in fine tuning, we are taking an already trained CNN and training it on a new data set.
27
00:02:00,690 --> 00:02:01,090
OK.
28
00:02:01,350 --> 00:02:05,010
So basically what I'm saying, which I'll explain shortly, is
29
00:02:08,170 --> 00:02:08,670
start of
30
00:02:12,500 --> 00:02:14,640
so let's take a look at these concepts.
31
00:02:14,660 --> 00:02:20,210
Firstly, let's talk about fine tuning. Now, the concept of fine tuning is often, and very justifiably,
32
00:02:20,210 --> 00:02:25,530
confused with transfer learning, and that's because it's very similar and is merely a type of transfer learning.
33
00:02:25,790 --> 00:02:30,500
Fine tuning is where we take a pre-trained deep CNN and we use this model.
34
00:02:30,540 --> 00:02:35,240
It's already been trained, most likely on ImageNet, and we basically use it to assist us
35
00:02:35,330 --> 00:02:37,340
with a new image classification task.
36
00:02:37,340 --> 00:02:43,810
And typically in fine tuning, we are taking an already trained CNN and training it on a new data set.
37
00:02:43,820 --> 00:02:50,300
So what we do is we then freeze the lower layers of this model and I'll illustrate to you what this
38
00:02:50,300 --> 00:02:51,210
means shortly.
39
00:02:51,470 --> 00:02:54,920
And we train only the top or fully connected layers.
40
00:02:55,610 --> 00:02:57,480
And that's how we actually train
41
00:02:57,630 --> 00:02:58,910
a new model here.
42
00:02:59,270 --> 00:03:03,600
So effectively we're just replacing the classifier part of an already trained model.
43
00:03:04,070 --> 00:03:08,450
And sometimes you can actually go back and unfreeze some of the lower weights and train them again to get even
44
00:03:08,450 --> 00:03:10,300
better performance.
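A rough sketch of what freezing the convolutional layers and replacing the classifier part looks like in Keras. This is my own illustration rather than the course code; the layer size and the 10-class head are arbitrary assumptions.

from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

base_model = VGG16(weights="imagenet", include_top=False,
                   input_shape=(224, 224, 3))

# Freeze all the pre-trained convolutional layers so their weights stay fixed.
for layer in base_model.layers:
    layer.trainable = False

# Replace the classifier part with new fully connected layers for our classes.
model = models.Sequential([
    base_model,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),    # arbitrary size, just an example
    layers.Dense(10, activation="softmax"),  # 10 = hypothetical number of new classes
])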
45
00:03:10,310 --> 00:03:13,740
So let me explain this to you by illustrating what's happening.
46
00:03:13,890 --> 00:03:17,340
So imagine this is a deep CNN; real networks are much deeper than this.
47
00:03:17,510 --> 00:03:20,900
But imagine this is a deep CNN that's already been trained.
48
00:03:21,230 --> 00:03:21,960
All right.
49
00:03:22,040 --> 00:03:27,680
So when I say we freeze the layers, we're freezing all the convolutional layers here, between
50
00:03:27,680 --> 00:03:30,750
the input and up to the fully connected layer here.
51
00:03:30,980 --> 00:03:36,440
So we imagine these have already been trained and they're very good at picking up high, low
52
00:03:36,440 --> 00:03:38,120
and mid-level features.
53
00:03:38,120 --> 00:03:45,110
So what we do now is we just basically change the classes that we want for our model, and we basically
54
00:03:45,110 --> 00:03:49,390
just manipulate the top layer and train it on our data set now.
55
00:03:49,820 --> 00:03:56,390
So this is the frozen part here, and this is the part we have basically modified and unfrozen
56
00:03:56,510 --> 00:03:59,210
and are going to train separately.
57
00:03:59,840 --> 00:04:06,770
So in fine tuning, in most CNNs, the first few convolutional layers learn low level features, as explained.
58
00:04:07,080 --> 00:04:15,130
Those are things like edges, textures, color blobs and so on.
59
00:04:15,740 --> 00:04:20,010
And as we progress through the network, it learns more mid and high level features.
60
00:04:20,180 --> 00:04:26,750
So in fine tuning we just keep the low level layers frozen, and we can also retrain the high level features
61
00:04:26,750 --> 00:04:32,800
as well. So there are a few steps here; I pretty much just went through them for you.
62
00:04:33,050 --> 00:04:40,460
But basically we freeze layers, we add or modify the fully connected layer, and we use a very tiny
63
00:04:40,460 --> 00:04:43,520
learning rate, and we just initiate training again.
64
00:04:43,880 --> 00:04:49,530
It's quite easy to do in Keras; we'll get to that code shortly, and it's quite powerful.
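Continuing the sketch above, a hedged example of the steps just described: optionally unfreeze a few of the top convolutional layers, then retrain with a very small learning rate. The exact layer count and learning rate are illustrative assumptions, not the course's values.

from tensorflow.keras.optimizers import Adam

# Optionally unfreeze the last few convolutional layers for fine tuning.
for layer in base_model.layers[-4:]:
    layer.trainable = True

# A very tiny learning rate so the pre-trained weights are only nudged slightly.
model.compile(optimizer=Adam(learning_rate=1e-5),
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# train_data / val_data stand in for your own image generators or tf.data pipelines.
# model.fit(train_data, validation_data=val_data, epochs=5)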
65
00:04:49,540 --> 00:04:59,790
By using these already well-trained models, we can get superbly good accuracy on new image data sets.
66
00:04:59,810 --> 00:05:05,480
So what about transfer learning now? As you've seen, in fine tuning we have taken an already pre-trained network
67
00:05:06,080 --> 00:05:12,420
and trained it, or segments of it, on some new data for a new image classification task.
68
00:05:13,100 --> 00:05:13,940
All right.
69
00:05:13,940 --> 00:05:17,230
So transfer learning is pretty much the same thing.
70
00:05:17,540 --> 00:05:22,380
And a lot of researchers and a lot of people in the industry use these terms interchangeably.
71
00:05:22,480 --> 00:05:28,700
However, what transfer learning really implies is that we're taking the knowledge from a pre-trained
72
00:05:28,700 --> 00:05:34,490
network and basically applying it to a similar task, and therefore not really retraining much of the
73
00:05:34,490 --> 00:05:35,600
network.
74
00:05:35,600 --> 00:05:42,750
So what that means effectively is... let's go back to that. In fine tuning,
75
00:05:42,860 --> 00:05:49,600
the reason why we call it fine tuning is that we can actually retrain these layers here.
76
00:05:49,920 --> 00:05:56,160
So in both transfer learning and fine tuning, we're basically unfreezing the top layer here and modifying
77
00:05:56,160 --> 00:05:57,660
it for our classes.
78
00:05:57,930 --> 00:06:02,090
But in fine tuning we tend to go back and retrain these layers here.
79
00:06:02,450 --> 00:06:06,630
That's pretty much the core difference.
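To make that difference concrete, here is a small hedged sketch of the "pure" transfer learning side: keep the whole pre-trained network frozen and just use it as a feature extractor for a new classifier. The random images are only placeholders for a real data set.

import numpy as np
from tensorflow.keras.applications import VGG16

# Frozen feature extractor: no convolutional layers are retrained at all.
feature_extractor = VGG16(weights="imagenet", include_top=False, pooling="avg")
feature_extractor.trainable = False

# Placeholder batch of images; in practice these come from the new data set.
images = np.random.rand(8, 224, 224, 3).astype("float32")
features = feature_extractor.predict(images)  # shape (8, 512) for VGG16

# These features can now train a small classifier (e.g. one Dense softmax layer)
# on the new task: transfer learning without retraining the network itself.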
80
00:06:06,630 --> 00:06:09,000
So here's a quick quote from a deep learning book.
81
00:06:09,000 --> 00:06:13,860
I'm pretty sure you can click this link on the PDF slides that I give you. Basically: "Transfer learning
82
00:06:13,920 --> 00:06:19,680
and domain adaptation refer to the situation where what has been learned
83
00:06:19,710 --> 00:06:25,560
in one setting is exploited to improve generalization in another setting."
84
00:06:25,620 --> 00:06:28,600
That's effectively what transfer learning means.
85
00:06:28,800 --> 00:06:30,900
And we're going to do some practical examples now.
86
00:06:30,950 --> 00:06:35,290
We're going to use MobileNet to create a monkey breed classifier.
87
00:06:35,580 --> 00:06:39,160
And then we're going to use VGG to create a flower classifier.
88
00:06:39,540 --> 00:06:42,610
So stay tuned and we are going to have some fun with these models.
89
00:06:42,640 --> 00:06:43,140
I guarantee.