1 00:00:00,510 --> 00:00:00,940 Hey. 2 00:00:00,960 --> 00:00:05,520 And welcome back to chapter 11.1, where we go into the confusion matrix and then calculate 3 00:00:05,520 --> 00:00:07,180 precision and recall. 4 00:00:07,190 --> 00:00:08,530 So let's get started. 5 00:00:08,880 --> 00:00:15,480 So before I dive into the Python notebook and we actually do an example using scikit-learn's confusion matrix 6 00:00:15,480 --> 00:00:16,240 function, 7 00:00:16,530 --> 00:00:19,310 let's take a look at what the confusion matrix actually looks like. 8 00:00:19,350 --> 00:00:22,700 Now typically, as I've shown you before, 9 00:00:22,980 --> 00:00:29,490 you have basically true positives, true negatives, false positives and false negatives. 10 00:00:29,610 --> 00:00:31,510 And that was in a binary class analysis. 11 00:00:31,550 --> 00:00:37,400 Now I'm going to go a step further and help you understand this concept using a multiclass problem, and 12 00:00:37,400 --> 00:00:41,010 we're using actual real-world results for the MNIST data set. 13 00:00:41,250 --> 00:00:45,410 So this big array, this big matrix, looks a bit strange. 14 00:00:45,450 --> 00:00:48,720 However, you do pick up initially that there is a pattern right here. 15 00:00:48,900 --> 00:00:50,170 There is a diagonal where 16 00:00:50,190 --> 00:00:53,470 there are very large numbers, and then there are small numbers on the outskirts. 17 00:00:53,760 --> 00:00:55,740 What do these numbers actually mean? 18 00:00:56,250 --> 00:00:57,770 So let's take a look now. 19 00:00:58,230 --> 00:00:59,110 I've made it simple. 20 00:00:59,130 --> 00:01:00,820 We know we're looking at the MNIST data set, 21 00:01:00,840 --> 00:01:04,210 so we have 10 classes, that's 0 to 9. 22 00:01:04,290 --> 00:01:05,100 Likewise here. 23 00:01:05,190 --> 00:01:08,320 And what this column here is, is the predicted value.
24 00:01:08,340 --> 00:01:15,230 So these numbers here mean that the classifier predicted zero and the true value was actually zero. 25 00:01:15,240 --> 00:01:17,180 So that's why it is a big number here. 26 00:01:17,440 --> 00:01:20,170 Predicted one, the true value was actually 1. 27 00:01:20,370 --> 00:01:23,810 So having large numbers in this diagonal is a good thing. 28 00:01:24,000 --> 00:01:29,550 Having large numbers outside of it is a bad thing, so we can see we have some large numbers here. 29 00:01:29,550 --> 00:01:30,360 We have an 11. 30 00:01:30,360 --> 00:01:32,940 We have six to five here. 31 00:01:33,220 --> 00:01:36,950 So now let's take a look at these numbers, so I've highlighted them here. 32 00:01:37,220 --> 00:01:38,790 So what do they actually mean? 33 00:01:38,790 --> 00:01:44,350 It means that our classifier predicted 2 when it was actually 7. 34 00:01:44,460 --> 00:01:49,920 So the classifier is confusing sevens with twos: it is seeing a seven but it's classifying it as a two, which 35 00:01:49,920 --> 00:01:50,660 is wrong. 36 00:01:50,940 --> 00:01:56,070 And likewise for sixes and zeroes, you know, nines and fours here. 37 00:01:56,970 --> 00:01:58,680 Sorry, this one here. 38 00:01:59,120 --> 00:02:00,990 And it's as well. 39 00:02:00,990 --> 00:02:08,550 So what I mean is that, let's look at nines and fours: the classifier predicted four, but five times it was 40 00:02:08,550 --> 00:02:10,160 actually a 9. 41 00:02:10,260 --> 00:02:11,660 So we can see the biggest problem 42 00:02:11,670 --> 00:02:17,400 our MNIST classifier is facing is confusing twos and sevens, which is actually a real problem for a 43 00:02:17,400 --> 00:02:18,630 lot of humans. 44 00:02:18,660 --> 00:02:20,070 My handwriting isn't very good. 45 00:02:20,070 --> 00:02:21,690 I'll be the first one to admit that. 46 00:02:21,810 --> 00:02:25,700 And lots of times I'm looking at numbers I write and I'm like, is that a 2? 47 00:02:25,700 --> 00:02:26,550 Was it a 7?
48 00:02:26,550 --> 00:02:34,050 So we can see our classifier is sort of learning generally like a human would learn, and interpreting, or misinterpreting, 49 00:02:34,080 --> 00:02:36,840 results just like a human like ourselves would. 50 00:02:37,080 --> 00:02:41,860 So let's actually work out our recall value based on this real-world data. 51 00:02:42,210 --> 00:02:43,770 So let's take a look at the number seven. 52 00:02:43,860 --> 00:02:44,350 All right. 53 00:02:44,550 --> 00:02:47,040 So this is the true class for number 7 here. 54 00:02:47,040 --> 00:02:49,220 So we saw our classifier got it right 55 00:02:49,410 --> 00:02:52,800 one thousand and ten times; that's true positives here. 56 00:02:53,130 --> 00:02:55,020 So how do we get the number of false negatives? 57 00:02:55,020 --> 00:02:56,970 Now, the number of false negatives 58 00:02:56,970 --> 00:03:03,330 is basically how many times our classifier predicted a different number for a number that was supposed 59 00:03:03,330 --> 00:03:04,250 to be a 7. 60 00:03:04,500 --> 00:03:06,530 So you can see all these times here, 61 00:03:06,570 --> 00:03:11,730 numbers were supposed to be a seven but were actually predicted to be a different class, like 0, 1, 2 62 00:03:11,730 --> 00:03:12,960 especially. 63 00:03:12,960 --> 00:03:14,700 So let's sum this up. 64 00:03:14,700 --> 00:03:18,980 This row here, everything here except the 1010, will give you 18. 65 00:03:19,230 --> 00:03:21,090 And that's exactly what we calculate here. 66 00:03:21,100 --> 00:03:25,310 And we get 98.24 percent, and that's our recall. 67 00:03:25,370 --> 00:03:29,810 And now let's move on to precision. 68 00:03:29,830 --> 00:03:34,660 So looking at precision, we know it's the number of correct predictions over how many predictions of that 69 00:03:34,660 --> 00:03:36,650 class were made on the test data set. 70 00:03:36,730 --> 00:03:38,770 That's another way of saying what this is here.
71 00:03:38,980 --> 00:03:43,630 True positives over true positives plus false positives. 72 00:03:43,630 --> 00:03:45,230 So again let's look at number seven. 73 00:03:45,310 --> 00:03:45,870 OK. 74 00:03:46,030 --> 00:03:47,240 So now we go the other way. 75 00:03:47,290 --> 00:03:48,900 This is interesting. 76 00:03:48,970 --> 00:03:51,610 So our true positives again: 77 00:03:51,840 --> 00:03:52,980 it is 1010. 78 00:03:53,320 --> 00:03:57,400 And what about false positives? Our false positives here 79 00:03:57,400 --> 00:04:05,610 are basically all the times the classifier was predicting something to be a 7 when it was actually 0, 2, 80 00:04:05,650 --> 00:04:07,080 3, above or below. 81 00:04:07,450 --> 00:04:14,620 So those are the false positives for sevens, and we just sum this up; everything here gives us a thousand 82 00:04:14,620 --> 00:04:15,240 seventeen. 83 00:04:15,300 --> 00:04:18,060 One, two, three and then three plus four. 84 00:04:18,060 --> 00:04:25,560 And that's exactly how we get 99.1 percent. So basically we don't actually 85 00:04:25,560 --> 00:04:26,710 have to do it manually. 86 00:04:26,910 --> 00:04:32,370 scikit-learn actually generates a report for us automatically that gives us recall, 87 00:04:32,370 --> 00:04:35,000 precision, F1 and support. 88 00:04:35,010 --> 00:04:37,390 I think you remember and you guys know what F1 is. 89 00:04:37,560 --> 00:04:41,040 I haven't actually dealt with support but I'll talk about it now in the next slide. 90 00:04:41,340 --> 00:04:46,140 But you can see we have precision, we have recall and we have F1 score here. 91 00:04:46,190 --> 00:04:47,350 All right. 92 00:04:47,640 --> 00:04:49,500 Now actually I can talk about support right now. 93 00:04:49,500 --> 00:04:50,820 It's actually quite easy. 94 00:04:50,850 --> 00:04:53,180 Support is basically, you see the numbers here, 95 00:04:53,460 --> 00:04:54,530 ten twenty-eight.
96 00:04:54,540 --> 00:04:55,840 What was that? 97 00:04:55,980 --> 00:05:01,920 Go back to it here: 1028 is basically true positives plus false negatives here. 98 00:05:02,220 --> 00:05:05,430 So support just gives us that sum here in the column. 99 00:05:05,430 --> 00:05:09,630 So if you look at column 0, let's go back to it. 100 00:05:10,080 --> 00:05:12,760 Column zero would be everything here. 101 00:05:13,760 --> 00:05:14,130 Sorry. 102 00:05:14,170 --> 00:05:14,920 Everything in this row, 103 00:05:14,980 --> 00:05:15,970 added up here. 104 00:05:15,970 --> 00:05:20,200 So this is 977 plus 1, 2, 3: 980. 105 00:05:20,200 --> 00:05:23,550 And that gives us our support here. And support 106 00:05:23,550 --> 00:05:24,490 basically is useful. 107 00:05:24,490 --> 00:05:31,540 It helps show how many times our classifier is actually basically missing data, essentially missing classifications, because 108 00:05:31,630 --> 00:05:39,560 think about it intuitively: we have nine hundred and eighty zeroes represented here in this report. 109 00:05:39,560 --> 00:05:46,350 All right, now what it is telling us here is that we basically had, that's how many zeros, 110 00:05:46,390 --> 00:05:48,790 how many zeros were in our report. 111 00:05:48,790 --> 00:05:53,290 So now we can actually use that as a basis to gauge what is happening here. 112 00:05:53,320 --> 00:05:57,650 So you can also see if there are any class imbalances in the data here. 113 00:05:57,950 --> 00:06:03,430 Class imbalances are essentially something which you can usually check before you even reach 114 00:06:03,430 --> 00:06:03,870 this point. 115 00:06:03,880 --> 00:06:06,810 You can easily check it when you have your test and training data. 116 00:06:07,150 --> 00:06:15,520 Just check to see the quantities, or the counts I should say, of data of each class of 117 00:06:15,520 --> 00:06:15,780 data 118 00:06:15,820 --> 00:06:20,040 you have in your data set.
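A minimal sketch of the arithmetic described above, assuming a confusion matrix whose rows are the true classes and whose columns are the predicted classes. The 3-class matrix and the variable names are made up for illustration and are not the MNIST results from the slides; only the last calculation reuses the counts quoted for the digit 7 (1010 true positives, 18 false negatives).

```python
# Toy 3-class confusion matrix with made-up counts (NOT the MNIST results).
# Assumption: rows are true classes, columns are predicted classes.
cm = [
    [50, 2, 3],
    [4, 40, 6],
    [1, 5, 44],
]
n = len(cm)

# Support: how many samples truly belong to each class (row sum = TP + FN).
support = [sum(row) for row in cm]

# How many times each class was predicted (column sum = TP + FP).
predicted_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]

# Recall = TP / (TP + FN), precision = TP / (TP + FP), per class.
recall = [cm[k][k] / support[k] for k in range(n)]
precision = [cm[k][k] / predicted_totals[k] for k in range(n)]

for k in range(n):
    print(f"class {k}: precision={precision[k]:.3f} "
          f"recall={recall[k]:.3f} support={support[k]}")

# Recall for the digit 7, using the counts quoted in the video.
recall_7 = 1010 / (1010 + 18)
print(f"recall for 7: {recall_7:.4f}")
```

Reading along a row answers "of everything that really was this class, how much did we catch?" (recall and support), while reading down a column answers "of everything we called this class, how much was right?" (precision).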
119 00:06:20,060 --> 00:06:23,020 So how do we analyze our classification report? 120 00:06:23,030 --> 00:06:29,830 So basically we can just quickly interpret something here: high recall with low precision, that is bad. 121 00:06:30,080 --> 00:06:35,700 And let me tell you what this tells us: that most of the positive examples are being correctly recognized. 122 00:06:35,840 --> 00:06:40,350 That means few false negatives, but there are a lot of false positives. 123 00:06:40,670 --> 00:06:43,330 And that means a lot of the other classes are being predicted 124 00:06:43,400 --> 00:06:51,450 as the class in question. And alternatively we can have low recall with high precision. 125 00:06:51,720 --> 00:06:52,630 What does that mean? 126 00:06:52,650 --> 00:06:58,640 It means our classifier is missing a lot of the positive examples, has a high false negative rate, but 127 00:06:58,720 --> 00:07:03,760 those it does predict as positive are indeed positive, so few false positives. 128 00:07:04,110 --> 00:07:10,570 So we can use our precision, our classification report, to sort of gauge what's actually happening. 129 00:07:10,710 --> 00:07:13,220 In this example everything looks pretty good. 130 00:07:13,250 --> 00:07:13,740 All right. 131 00:07:13,920 --> 00:07:18,720 But later on you'll look at some examples where we generate these reports and you can actually analyze 132 00:07:18,720 --> 00:07:22,350 and figure out which class our classifier is having trouble with. 133 00:07:24,110 --> 00:07:28,360 So let's quickly take a look at the code to generate this confusion matrix. 134 00:07:28,360 --> 00:07:29,000 All right. 135 00:07:29,300 --> 00:07:35,760 And later on we'll actually see how we could look at misclassified data. For now this is the generic MNIST training 136 00:07:36,130 --> 00:07:36,470 code. 137 00:07:36,500 --> 00:07:40,480 You've seen it a few times before; it's used in a lot of my examples. 138 00:07:40,910 --> 00:07:43,700 There's one thing I wanted to show you here in this file.
139 00:07:43,790 --> 00:07:50,720 Basically when we save, we store history here. Some people have asked me, how can I, can I save my history 140 00:07:50,720 --> 00:07:51,200 file 141 00:07:51,500 --> 00:07:56,900 and look at it again? Because I've spent like hours, or maybe a week or days, training a classifier and 142 00:07:56,900 --> 00:08:01,090 I want to actually save the plots, well, not save the image, I should say save the file. 143 00:08:01,430 --> 00:08:04,330 Yes you can. You can use a Python module called pickle. 144 00:08:04,700 --> 00:08:09,980 And basically what it does here, it just stores a file as a pickled file. A pickle file is basically an array 145 00:08:09,980 --> 00:08:12,360 of data, as a method of storage. 146 00:08:12,410 --> 00:08:16,920 I'm not going to get into the detail now but just know it's a way we can store files. 147 00:08:17,570 --> 00:08:22,910 So what we do here, we just open, to create a file, in "wb" mode; it's basically us telling it we're 148 00:08:22,910 --> 00:08:27,260 going to create this pickle file, and then we just dump this file. 149 00:08:27,260 --> 00:08:28,880 This is the history, the history file 150 00:08:28,880 --> 00:08:33,240 we want to save, and then close the file and it's done. 151 00:08:33,620 --> 00:08:39,170 And similarly if we want to look at this file, we can simply just load it back here, and here we 152 00:08:39,170 --> 00:08:39,390 go. 153 00:08:39,420 --> 00:08:44,360 Well, we have values just for one epoch; if there was more than one epoch it would look a lot bigger. 154 00:08:44,690 --> 00:08:47,540 But basically it's a dictionary file, or just a JSON file, 155 00:08:47,540 --> 00:08:50,110 for those of you who come from a JavaScript background. 156 00:08:50,810 --> 00:08:52,760 And basically this is how it looks. 157 00:08:52,760 --> 00:08:57,460 We have loss, accuracy, validation accuracy, validation loss and values for it here.
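A minimal sketch of the save-and-reload steps described above, using Python's standard pickle module. The dictionary contents, its key names and the file name are hypothetical stand-ins for the training history from the notebook, which is not shown in this transcript.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the training history dictionary after one epoch.
# The key names and values here are made up for illustration.
history_dict = {
    "loss": [0.25],
    "acc": [0.92],
    "val_loss": [0.08],
    "val_acc": [0.97],
}

# Hypothetical file name; written to the temp directory so the sketch is harmless.
path = os.path.join(tempfile.gettempdir(), "mnist_history.pickle")

# "wb" opens the file for writing in binary mode; dump serializes the dictionary.
# The with-block closes the file for us.
with open(path, "wb") as pickle_out:
    pickle.dump(history_dict, pickle_out)

# Later, even in a different session: open in "rb" mode and load it back.
with open(path, "rb") as pickle_in:
    saved_history = pickle.load(pickle_in)

print(saved_history)
```

Because the reloaded object is an ordinary dictionary, the same plotting code that worked on the original history can be pointed at `saved_history` without retraining.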
158 00:08:58,500 --> 00:09:01,380 These are the keys and these are the values. 159 00:09:01,380 --> 00:09:08,370 So now we can get some plots, but these plots obviously for one epoch are pretty much not fun to look 160 00:09:08,370 --> 00:09:08,730 at. 161 00:09:08,760 --> 00:09:15,040 Just one point here on the accuracy chart, one point here and one point here, though one can actually see 162 00:09:15,040 --> 00:09:17,240 it's really quite good. 163 00:09:17,820 --> 00:09:19,050 This is what I wanted to show you. 164 00:09:19,230 --> 00:09:22,560 This here is the confusion matrix and classification report. 165 00:09:22,560 --> 00:09:25,400 So we import from sklearn.metrics 166 00:09:25,560 --> 00:09:27,330 both of these functions here. 167 00:09:27,600 --> 00:09:30,850 And basically what we do, we just get our predictions here. 168 00:09:30,960 --> 00:09:38,010 So we run x_test, which is our test data or validation data, through our model's predict_classes, and basically 169 00:09:38,010 --> 00:09:43,870 we just print out the classification report. classification_report just takes two arguments: 170 00:09:43,950 --> 00:09:49,490 the test labels, the y_test labels here, and the predictions. 171 00:09:49,500 --> 00:09:54,640 So basically what we're doing here, we're comparing labels to labels. 172 00:09:54,870 --> 00:10:00,570 And the reason we have to use the argmax function is because our labels before were one-hot encoded. 173 00:10:00,630 --> 00:10:06,420 So it's not like for like; we actually have to convert them back into basically a one-for-one type 174 00:10:06,420 --> 00:10:07,500 matching. 175 00:10:07,530 --> 00:10:11,010 So this is basically what our predictions give us. 176 00:10:11,130 --> 00:10:16,500 And basically this is what the classification report looks like, which we saw in our slides. 177 00:10:16,840 --> 00:10:17,890 Our averages here,
178 00:10:17,910 --> 00:10:23,310 they don't really tell us that much here, but you'll see maybe far more interesting datasets, and of 179 00:10:23,300 --> 00:10:28,560 course those values would differ. And the confusion matrix is done here. 180 00:10:28,980 --> 00:10:33,720 Basically the same thing, same exact arguments as above, and we get it here. 181 00:10:34,350 --> 00:10:38,940 So that's it for the confusion matrix and our classification report.
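A minimal sketch of the reporting steps walked through above, assuming scikit-learn and NumPy are installed. The tiny `y_test` and `y_pred` arrays are made-up stand-ins: in the notebook the predictions come from running the test data through the trained model, which is not reproduced here.

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# Hypothetical one-hot encoded test labels for a 3-class problem (made up).
y_test = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
])

# Hypothetical predicted class indices, standing in for the model's output.
y_pred = np.array([0, 1, 2, 2, 0, 2])

# argmax turns each one-hot row back into a class index,
# so labels are compared like for like with the predictions.
y_true = np.argmax(y_test, axis=1)

# Both functions take the true labels first, then the predictions.
print(classification_report(y_true, y_pred))

cm = confusion_matrix(y_true, y_pred)
print(cm)
```

In the matrix scikit-learn returns, row i holds the samples whose true class is i and column j holds the samples predicted as j, which matches the row and column sums used for recall and precision earlier in the video.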