Hi, and welcome to Chapter 15.2, where we're going to build a monkey breed classifier and basically use the concept of transfer learning to get very high accuracy very quickly.

So let's take a look at this dataset. It was taken from a Kaggle project, and it has images of ten different species of monkeys. I said about 80 images of each, but actually it's more than that, up to 152 images per class. These are some sample images here, and you'll notice that some are quite small, with different aspect ratios, images of various sizes and quality as well. So it's pretty much like a dataset you might build yourself: not well standardized, not super neat, not super high quality, just random images taken from the Internet.

So now let's move on to our Jupyter notebook and begin creating this classifier. OK, so before we begin, I hope you've downloaded your resource file, the monkey breed dataset, and placed it inside the directory here. This is the training directory, so you have the monkey breed directory here with our training images, and each class is in its own folder. Hopefully that's set up correctly for you. So now we can go back to the folder and open up the notebook.
I already have it open right now, so I'm going to go through this step by step so you understand exactly how we can apply transfer learning. All right, so we're doing this with MobileNet, and the reason I chose MobileNet is that it actually trains quite quickly on CPUs. So let's import everything here.

Then let's define the image rows and columns: we're going to use uniform square images, 224 by 224 in size, and this is how we basically define that. When we load the model in, we want the ImageNet weights; we've seen this before with our pretrained models. However, we haven't seen these parameters here, so let me quickly discuss them with you. What we're going to do is set include_top to False. What this means is that the fully connected layers, the last layers on top of the model, are not included in the model. I'm going to show you what that looks like pretty soon. And input_shape is where we define what input shape we want this model to take; that's why we defined these parameters up here, and the 3 means a color depth of three, RGB.

So this is a cool thing we can do with the Keras models we load: we have a model here called MobileNet.
By accessing the layers within that model (the layers are stored as an array), we can basically loop through them and turn them off manually. The trainable parameter is a flag that controls whether a layer should be trainable or not. So what we do in these two lines of code is set all the layers in MobileNet to be non-trainable, basically fixed. This is how we freeze the weights right here.

So now we can actually print these layers. What we're printing is the layer number, as i goes through the loop, and the layer.trainable flag, whether it's trainable, True or False. So you get to see all the layers now, and there are quite a few in MobileNet, all set to False. This is pretty awesome already, and I hope you're following; it's simple code so far.

So what we're going to do now is create a simple function that basically adds a fully connected head back onto the model we loaded here, because remember, we loaded it, but we didn't get the top. So right now we have a model without any top. Actually, I want to show you something quickly: what if we set this to True? How would this model look? We saw we had 86 different layers, with the last ones being removed.
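The loading-and-freezing steps just described might look like the following sketch (it assumes the Keras MobileNet application; weights=None is used here instead of the lesson's weights='imagenet' so the snippet runs without downloading the pretrained weights):

```python
from tensorflow.keras.applications import MobileNet

img_rows, img_cols = 224, 224          # uniform square input size

# The lesson loads weights='imagenet'; weights=None here avoids the
# pretrained-weight download so the sketch runs offline.
mobilenet = MobileNet(weights=None,
                      include_top=False,                    # drop the FC head
                      input_shape=(img_rows, img_cols, 3))  # 3 = RGB depth

# The layers are exposed as a list, so we can loop through and freeze them.
for layer in mobilenet.layers:
    layer.trainable = False            # fix these weights during training

# Print each layer's index, name, and trainable flag, as in the lesson.
for i, layer in enumerate(mobilenet.layers):
    print(i, layer.name, layer.trainable)
```

Note that the freezing has to happen before the model is compiled for training; changing the trainable flag afterwards has no effect until you recompile.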
So let's now print this and see what it looks like; it takes about five to ten seconds to run. There we go. Oh good. So before we had up to 86 layers; now we can see this extra part, which is the top fully connected head. This is what we left out previously. So now let's put it back in, OK, because what we're going to do is add a head here: these are the layers we're going to add onto the model now.

So how do we use this function? This function takes a number of classes; depending on our dataset, we specify how many classes we want, so for the monkey breed dataset it's going to be 10. And the bottom model is basically this model here, the MobileNet we loaded, without its top.

So let's quickly see what this function does. It takes the bottom model, gets the output part of it here, and we create basically the top model from that. So we define a top model like this, and onto the top model we simply add these layers. It's just a different way of adding layers, in case you haven't seen this style before. So we add them one by one to the top model here.
So for us, we do a GlobalAveragePooling2D, then a Dense layer with 1024 nodes, then another Dense layer here, and then a final Dense layer with a softmax over the 10 classes we want. And then the function returns the top model back.

OK, so now what we do below is obviously load all the layers we need and define the number of classes, but now we can actually use our function: we enter the number of classes, we enter the MobileNet model that we loaded before, and we add the top we defined here onto this model. And what we do next is use Keras's Model function: we give it the inputs, which are defined as the MobileNet model's inputs, and the outputs, which are the head we're going to train. And basically this combines everything into one model, which looks like this when printed out.

So, a lot of layers: we saw 86 layers before, but now we have a few more, the ones we just defined here, and they show up right here. So this is pretty cool. And look at this: we have over five million parameters in total, but the trainable parameters are only about 2.6 million; the non-trainable parameters, which are the weights we froze, are about three million.
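Putting those pieces together, the head function and the combined model might look like this sketch. The exact hidden-layer sizes (1024, 1024, 512) are my assumption about the notebook, and weights=None again stands in for the lesson's ImageNet weights so nothing needs downloading:

```python
from tensorflow.keras.applications import MobileNet
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model

def add_top_model(bottom_model, num_classes):
    """Build a new fully connected head on top of a headless base model."""
    top = bottom_model.output
    top = GlobalAveragePooling2D()(top)    # collapse the 7x7x1024 maps to 1024
    top = Dense(1024, activation='relu')(top)
    top = Dense(1024, activation='relu')(top)
    top = Dense(512, activation='relu')(top)
    return Dense(num_classes, activation='softmax')(top)

# The lesson uses weights='imagenet'; weights=None keeps this sketch offline.
base = MobileNet(weights=None, include_top=False, input_shape=(224, 224, 3))
for layer in base.layers:
    layer.trainable = False                # freeze the convolutional base

head = add_top_model(base, num_classes=10) # 10 monkey species
model = Model(inputs=base.input, outputs=head)
model.summary()                            # only the head is trainable
```

With a head of these sizes, summary() reports roughly 2.6 million trainable parameters against roughly 3.2 million frozen ones, which matches the proportions quoted in the video.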
So effectively we've taken a model that was pretty complex, not super complex like VGG and a couple of others, but complex enough, and we've turned it into a much simpler model to train. So let's get to training our monkey breed classifier; it doesn't take long to train at all.

So we load our datasets using the ImageDataGenerator that you've seen before. We do the standard thing here, which you should be pretty familiar with by now, and then we define some checkpoints and callbacks, sorry. So we use early stopping and checkpointing here, and then we train for only five epochs for now; that's because we don't want it to take too long. And I'm actually training it separately in this window here, so I've already trained almost five epochs to save us some time.

So look at this: you can see that after this epoch, which took just under five minutes, our validation accuracy was 88 percent already. That is actually pretty damn good for such a short space of time. Now, in the second iteration, because it's still so early in the training, even though the training loss is much lower, the validation accuracy is a little bit less, 84 percent. That's OK; we can sort of live with it.
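The generators and callbacks mentioned above might be set up along these lines. This is only a sketch: the augmentation values and the 'monkey_breed/train' path are assumptions of mine, and the flow_from_directory and fit calls are left commented out since they need the dataset on disk:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping

# Augment the training images; only rescale the validation ones.
train_datagen = ImageDataGenerator(rescale=1.0 / 255, rotation_range=45,
                                   width_shift_range=0.3,
                                   height_shift_range=0.3,
                                   horizontal_flip=True, fill_mode='nearest')
validation_datagen = ImageDataGenerator(rescale=1.0 / 255)

# Keep only the best weights, and stop early if validation loss stalls.
checkpoint = ModelCheckpoint('monkey_breed_mobilenet.h5', monitor='val_loss',
                             save_best_only=True, verbose=1)
early_stop = EarlyStopping(monitor='val_loss', patience=3, verbose=1,
                           restore_best_weights=True)

# With the dataset unpacked as described in the video, training would be:
# train_generator = train_datagen.flow_from_directory(
#     'monkey_breed/train', target_size=(224, 224),
#     batch_size=32, class_mode='categorical')
# model.fit(train_generator, epochs=5,
#           validation_data=validation_generator,
#           callbacks=[checkpoint, early_stop])
```

A patience of 3 matches what the video reports the early-stopping callback doing a little later on.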
We'll let it train for a few more epochs and see how it evolves, because training these pretrained models, where part of the network is frozen, is a little bit different from how we train CNNs from scratch. They do effectively converge and reach a very high value; however, you do sometimes see some odd fluctuations like this. And look, we have it back up to 91 percent... 90 percent. If you wait a few minutes, sorry, about 20 seconds at least, we can actually see what the validation accuracy is at the end of the fifth epoch. So let's wait and see what it looks like.

One thing to note is that you can actually see our early stopping callback telling us that the validation loss did not improve. Even if we had left this running for 20 epochs, this would basically be the last epoch no matter what, because I'm pretty sure I set my patience to three. Yep, I usually always do.

So right now, the reason it's sitting there (and more than the two seconds I mentioned will have passed by the time I finish this sentence) is that it's predicting on the entire validation dataset. Now, that's something a lot of beginners don't know: you see this pause at the end of an epoch and it looks like it's stuck, but it isn't actually stuck; it's just waiting to run on the validation dataset.
So it takes a little while, honestly, because sometimes validation datasets are quite big. Ah, there we go. So look at this: we got 93 percent accuracy in such a short space of time. This is quite good.

So now let's actually go back to this main page here, and let's load our model, which takes about 10 seconds. And what are we going to do once this model is loaded? We're going to use OpenCV with a bit of code I wrote quickly: it loads the test images here, runs them through the predict function of the model we just loaded, and shows us the predicted monkey class, so we can see how accurate our classifier really is. Is it really 90 percent accurate? Let's find out.

There we go. So this is the true label versus the predicted one. Yes, that looks like a Japanese macaque. OK, so unfortunately it got this one wrong: the model predicted a white-headed capuchin, and no, it was not a white-headed capuchin. Let's see if it gets this one right. Yeah, it did, got this one right: a pygmy marmoset. Let's see what the others are. A langur, definitely right.
A pygmy marmoset again, got it right. Got it right. Got that one right. Got it right. Got it right again. Got it right. So it seems pretty good. Aside from the first one, the model got basically nine out of ten right, which roughly corresponds to the 90 percent accuracy we got here.

So you've just learnt to create, or basically to train, a model using transfer learning, and you've seen how simple it is. You basically load MobileNet with the weights frozen and the top not included. Then you build a function to add the top, whatever top you want to add (I added these layers here), making sure the last layer matches the number of classes in your dataset. Then you concatenate, or combine I should say, the models here; you define your image data generators, define your checkpoints and callbacks, compile, and away we go to train.

So it's really very simple, and I hope you found this tutorial quite useful. Thank you.