1
00:00:00,510 --> 00:00:00,940
Hey.

2
00:00:00,960 --> 00:00:05,520
And welcome back to chapter 11.1, where we go into the confusion matrix and then calculate

3
00:00:05,520 --> 00:00:07,180
precision and recall.

4
00:00:07,190 --> 00:00:08,530
So let's get started.

5
00:00:08,880 --> 00:00:15,480
So before I dive into the IPython notebook and we actually do an example using scikit-learn's confusion matrix

6
00:00:15,480 --> 00:00:16,240
function.

7
00:00:16,530 --> 00:00:19,310
Let's take a look at what the confusion matrix actually looks like.

8
00:00:19,350 --> 00:00:22,700
Now typically, as I've shown you before,

9
00:00:22,980 --> 00:00:29,490
you basically have true positives, true negatives, false positives, and false negatives.

10
00:00:29,610 --> 00:00:31,510
And that was in a binary class analysis.

11
00:00:31,550 --> 00:00:37,400
Now I'm going to go a step further and make you understand this concept using a multiclass problem and

12
00:00:37,400 --> 00:00:41,010
we're using actual real-world results for the MNIST dataset.

13
00:00:41,250 --> 00:00:45,410
So this big array, this big matrix, looks a bit strange.

14
00:00:45,450 --> 00:00:48,720
However, you do pick up initially that there is a pattern right here.

15
00:00:48,900 --> 00:00:50,170
There is a diagonal.

16
00:00:50,190 --> 00:00:53,470
These are very large numbers, and then there are small numbers on the outskirts.

17
00:00:53,760 --> 00:00:55,740
What do these numbers actually mean?

18
00:00:56,250 --> 00:00:57,770
So let's take a look now.

19
00:00:58,230 --> 00:00:59,110
I've made it simple.

20
00:00:59,130 --> 00:01:00,820
We know we're looking at the MNIST dataset.

21
00:01:00,840 --> 00:01:04,210
So we have 10 classes; that's 0 to 9.

22
00:01:04,290 --> 00:01:05,100
Likewise here.

23
00:01:05,190 --> 00:01:08,320
And what this column here is, is the predicted value.

24
00:01:08,340 --> 00:01:15,230
So these numbers here mean that the classifier predicted zero and the true value was actually zero.

25
00:01:15,240 --> 00:01:17,180
So that's why it is a big number here.

26
00:01:17,440 --> 00:01:20,170
So predicted one, and the true value was actually 1.

27
00:01:20,370 --> 00:01:23,810
So having large numbers in this diagonal is a good thing.

28
00:01:24,000 --> 00:01:29,550
Having large numbers outside of it is bad. So we can see we have some large numbers here.

29
00:01:29,550 --> 00:01:30,360
We have an 11.

30
00:01:30,360 --> 00:01:32,940
We have six to five here.

31
00:01:33,220 --> 00:01:36,950
So now let's take a look at these numbers; I've highlighted them here.

32
00:01:37,220 --> 00:01:38,790
So what do they actually mean?

33
00:01:38,790 --> 00:01:44,350
It means that our classifier predicted 2 when it was actually 7.

34
00:01:44,460 --> 00:01:49,920
So the classifier is confusing sevens with twos. It's seeing a seven but it's classifying it as a two, which

35
00:01:49,920 --> 00:01:50,660
is wrong.

36
00:01:50,940 --> 00:01:56,070
And likewise for sixes and zeroes, and, you know, nines and fours here.

37
00:01:56,970 --> 00:01:58,680
Sorry this one here.

38
00:01:59,120 --> 00:02:00,990
And it's as well.

39
00:02:00,990 --> 00:02:08,550
So what I mean is, let's look at nines and fours: the classifier predicted four, but five times it was

40
00:02:08,550 --> 00:02:10,160
actually a 9.

41
00:02:10,260 --> 00:02:11,660
So we can see the biggest problem

42
00:02:11,670 --> 00:02:17,400
our MNIST classifier is facing is confusing twos and sevens, which is actually a real problem for a

43
00:02:17,400 --> 00:02:18,630
lot of humans.

44
00:02:18,660 --> 00:02:20,070
My handwriting isn't very good.

45
00:02:20,070 --> 00:02:21,690
I'll be the first one to admit that.

46
00:02:21,810 --> 00:02:25,700
And lots of times I'm looking at numbers I write and I'm like, is that a 2?

47
00:02:25,700 --> 00:02:26,550
Was it a 7?

48
00:02:26,550 --> 00:02:34,050
So we can see our classifier is sort of learning generally like a human would learn, and interpreting or misinterpreting

49
00:02:34,080 --> 00:02:36,840
results just like a human like ourselves would.

50
00:02:37,080 --> 00:02:41,860
So let's actually work out our recall value based on this real-world data.

51
00:02:42,210 --> 00:02:43,770
So let's take a look at the number seven.

52
00:02:43,860 --> 00:02:44,350
All right.

53
00:02:44,550 --> 00:02:47,040
So this is the true class for number 7 here.

54
00:02:47,040 --> 00:02:49,220
So we saw our classifier got it right

55
00:02:49,410 --> 00:02:52,800
1,010 times; those are the true positives here.

56
00:02:53,130 --> 00:02:55,020
So how do we get the number of false negatives?

57
00:02:55,020 --> 00:02:56,970
Now, the number of false negatives is

58
00:02:56,970 --> 00:03:03,330
basically how many times our classifier predicted a different number for a digit that was supposed

59
00:03:03,330 --> 00:03:04,250
to be a 7.

60
00:03:04,500 --> 00:03:06,530
So you can see all these times here.

61
00:03:06,570 --> 00:03:11,730
The numbers were supposed to be a seven but were actually predicted to be a different class, like 0, 1, and 2

62
00:03:11,730 --> 00:03:12,960
especially.

63
00:03:12,960 --> 00:03:14,700
So let's sum this up.

64
00:03:14,700 --> 00:03:18,980
This row here, everything here except the 1,010, will give you 18.

65
00:03:19,230 --> 00:03:21,090
And that's exactly what we calculate here.

66
00:03:21,100 --> 00:03:25,310
And we get 98.24 percent, and that's our recall.
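
The recall arithmetic just described can be sketched in a few lines. This is only an illustration using the two counts quoted in the narration (1,010 true positives and 18 false negatives for the digit 7); the variable names are made up for the example.

```python
# Recall for the digit 7, using the counts quoted in the narration.
true_positives = 1010   # sevens that were correctly predicted as 7
false_negatives = 18    # sevens that were predicted as some other digit

# Recall = TP / (TP + FN): the share of actual sevens the classifier found.
recall = true_positives / (true_positives + false_negatives)

print(f"Recall for class 7: {recall:.4f}")
```

The denominator, 1,028, is simply the total number of real sevens in the test data.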

67
00:03:25,370 --> 00:03:29,810
And now let's move on to precision.

68
00:03:29,830 --> 00:03:34,660
So looking at precision, we know it's the number of correct predictions over how many occurrences of that

69
00:03:34,660 --> 00:03:36,650
class were predicted in the test dataset.

70
00:03:36,730 --> 00:03:38,770
That's another way of saying what this is here.

71
00:03:38,980 --> 00:03:43,630
True positives over true positives plus false positives.

72
00:03:43,630 --> 00:03:45,230
So again let's look at number seven.

73
00:03:45,310 --> 00:03:45,870
OK.

74
00:03:46,030 --> 00:03:47,240
So now we go the other way.

75
00:03:47,290 --> 00:03:48,900
This is interesting.

76
00:03:48,970 --> 00:03:51,610
So our true positives again.

77
00:03:51,840 --> 00:03:52,980
It's 1,010.

78
00:03:53,320 --> 00:03:57,400
And what about false positives? Our false positives here.

79
00:03:57,400 --> 00:04:05,610
Basically, all the times the classifier was predicting something to be a 7 when it was actually a 0, a 2, a

80
00:04:05,650 --> 00:04:07,080
3, above or below.

81
00:04:07,450 --> 00:04:14,620
So those are the false positives for sevens, and we just sum this up. Everything here gives us a thousand

82
00:04:14,620 --> 00:04:15,240
seventeen.

83
00:04:15,300 --> 00:04:18,060
One, two, three, and then three plus four.

84
00:04:18,060 --> 00:04:25,560
And that's exactly how we get 99.1 percent. So basically, we don't actually

85
00:04:25,560 --> 00:04:26,710
have to do it manually.
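
For anyone who does want to do it by hand once, here is a rough sketch of both calculations on a full matrix. The 3-class matrix below is invented for illustration (it is not the MNIST result from the slides), and it assumes rows are true classes and columns are predicted classes, which is the layout scikit-learn uses.

```python
# Hypothetical 3-class confusion matrix (NOT the MNIST results from the slides).
# Rows are true classes, columns are predicted classes.
matrix = [
    [50, 2, 3],
    [4, 40, 1],
    [6, 0, 44],
]


def recall_for(matrix, k):
    """TP / (TP + FN): the diagonal entry over the sum of row k."""
    return matrix[k][k] / sum(matrix[k])


def precision_for(matrix, k):
    """TP / (TP + FP): the diagonal entry over the sum of column k."""
    column_total = sum(row[k] for row in matrix)
    return matrix[k][k] / column_total


for k in range(len(matrix)):
    print(f"class {k}: precision={precision_for(matrix, k):.3f} "
          f"recall={recall_for(matrix, k):.3f}")
```

So recall reads across a row and precision reads down a column, which is the "go the other way" step mentioned for precision.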

86
00:04:26,910 --> 00:04:32,370
So scikit-learn actually generates a report for us automatically that gives us recall,

87
00:04:32,370 --> 00:04:35,000
precision, F1, and support.

88
00:04:35,010 --> 00:04:37,390
I think you remember, and you guys know what F1 is.

89
00:04:37,560 --> 00:04:41,040
I haven't actually dealt with support but I'll talk about it now in the next slide.

90
00:04:41,340 --> 00:04:46,140
But you can see we have precision, we have recall, and we have our F1 score here.

91
00:04:46,190 --> 00:04:47,350
All right.

92
00:04:47,640 --> 00:04:49,500
Now actually I can talk about support right now.

93
00:04:49,500 --> 00:04:50,820
It's actually quite easy.

94
00:04:50,850 --> 00:04:53,180
Support is basically you see the numbers here.

95
00:04:53,460 --> 00:04:54,530
Ten twenty eight.

96
00:04:54,540 --> 00:04:55,840
What was that?

97
00:04:55,980 --> 00:05:01,920
Go back to it here: 1,028 is basically true positives plus false negatives here.

98
00:05:02,220 --> 00:05:05,430
So support just gives us that sum here in the column.

99
00:05:05,430 --> 00:05:09,630
So if you look at column 0, let's go back to it.

100
00:05:10,080 --> 00:05:12,760
Column zero would be everything here.

101
00:05:13,760 --> 00:05:14,130
Sorry.

102
00:05:14,170 --> 00:05:14,920
Everything in this row.

103
00:05:14,980 --> 00:05:15,970
Added up here.

104
00:05:15,970 --> 00:05:20,200
So this is 977 plus 1, 2, 3... that's 980.

105
00:05:20,200 --> 00:05:23,550
And that gives us our support here. And support

106
00:05:23,550 --> 00:05:24,490
basically is useful for seeing

107
00:05:24,490 --> 00:05:31,540
how many times our classifier is actually missing data, essentially missing classifications, because

108
00:05:31,630 --> 00:05:39,560
think about it intuitively: we have 980 zeroes represented here in this report.

109
00:05:39,560 --> 00:05:46,350
All right, now what this is telling us here is basically how many zeros,

110
00:05:46,390 --> 00:05:48,790
how many zeros were in our report.

111
00:05:48,790 --> 00:05:53,290
So now we can actually use that as a basis to gauge what is happening here.

112
00:05:53,320 --> 00:05:57,650
So you can also see if there are any class imbalances in the data here.

113
00:05:57,950 --> 00:06:03,430
So that's what class imbalances are, essentially, which you can usually check before you even reach

114
00:06:03,430 --> 00:06:03,870
this point.

115
00:06:03,880 --> 00:06:06,810
You can easily check it when you have your test and training data.

116
00:06:07,150 --> 00:06:15,520
Just check to see the quantities, or the counts I should say, of each class of

117
00:06:15,520 --> 00:06:15,780
data

118
00:06:15,820 --> 00:06:20,040
you have in your dataset.
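
One simple way to do that count check is sketched below. The label list is a made-up stand-in for a real set of test labels, and `y_test` is just an illustrative name.

```python
from collections import Counter

# Hypothetical label list standing in for the test labels of a dataset.
y_test = [0, 1, 1, 2, 2, 2, 0, 1, 2, 2]

# Count how many samples belong to each class.
class_counts = Counter(y_test)

for label in sorted(class_counts):
    share = class_counts[label] / len(y_test)
    print(f"class {label}: {class_counts[label]} samples ({share:.0%})")
```

These per-class counts are the same numbers that show up as support in the classification report when it is run on the same labels.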

119
00:06:20,060 --> 00:06:23,020
So how do we analyze our classification report?

120
00:06:23,030 --> 00:06:29,830
So basically we can just quickly interpret something here: high recall with low precision, that is bad.

121
00:06:30,080 --> 00:06:35,700
And let me tell you why: this tells us that most of the positive examples are being correctly recognized.

122
00:06:35,840 --> 00:06:40,350
That means a low number of false negatives, but there are a lot of false positives.

123
00:06:40,670 --> 00:06:43,330
And that means a lot of the other classes are being predicted

124
00:06:43,400 --> 00:06:51,450
as the class in question. Alternatively, we can have low recall with high precision.

125
00:06:51,720 --> 00:06:52,630
What does that mean?

126
00:06:52,650 --> 00:06:58,640
It means our classifier is missing a lot of the positive examples, so it has a high false negative rate, but

127
00:06:58,720 --> 00:07:03,760
those it predicts as positive are indeed positive, so it has few false positives.

128
00:07:04,110 --> 00:07:10,570
So we can use our classification report to sort of gauge what's actually happening.

129
00:07:10,710 --> 00:07:13,220
In this example everything looks pretty good.

130
00:07:13,250 --> 00:07:13,740
All right.

131
00:07:13,920 --> 00:07:18,720
But later on you'll look at some examples where we generate these reports, and you can actually analyze

132
00:07:18,720 --> 00:07:22,350
and figure out which class our classifier is having trouble with.

133
00:07:24,110 --> 00:07:28,360
So let's quickly take a look at the code to generate this confusion matrix.

134
00:07:28,360 --> 00:07:29,000
All right.

135
00:07:29,300 --> 00:07:35,760
And later on we'll actually see how to look at misclassified data. For now, this is the generic MNIST training

136
00:07:36,130 --> 00:07:36,470
code.

137
00:07:36,500 --> 00:07:40,480
You've seen it a few times before; it's used in a lot of my examples.

138
00:07:40,910 --> 00:07:43,700
There's one thing I wanted to show you here in this file.

139
00:07:43,790 --> 00:07:50,720
Basically, where we store the history here, some people have asked me: can I save my history

140
00:07:50,720 --> 00:07:51,200
file

141
00:07:51,500 --> 00:07:56,900
and look at it again? Because I've spent hours, or maybe days or a week, training a classifier, and

142
00:07:56,900 --> 00:08:01,090
I want to actually save the plots, not just see the image; I want to save the file.

143
00:08:01,430 --> 00:08:04,330
Yes, you can. You can use a Python module called pickle.

144
00:08:04,700 --> 00:08:09,980
And basically what it does here is just store an object as a pickle file. A pickle file is basically a serialized

145
00:08:09,980 --> 00:08:12,360
Python object written out as bytes, as a method of storage.

146
00:08:12,410 --> 00:08:16,920
I'm not going to get into the detail now but just know it's a way we can store files.

147
00:08:17,570 --> 00:08:22,910
So what we do here: we use pickle, we open a file with "wb", which is basically us saying we're

148
00:08:22,910 --> 00:08:27,260
going to create this pickle file, and then we just dump

149
00:08:27,260 --> 00:08:28,880
our history, the history file

150
00:08:28,880 --> 00:08:33,240
we want to save, and then close the file, and it's done.

151
00:08:33,620 --> 00:08:39,170
And similarly, if we want to look at this file, we simply just load it back here, and here we

152
00:08:39,170 --> 00:08:39,390
go.

153
00:08:39,420 --> 00:08:44,360
Well, we have it all just for one epoch; if there was more than one epoch it would look a lot bigger.

154
00:08:44,690 --> 00:08:47,540
But basically it's a dictionary, or like a JSON file,

155
00:08:47,540 --> 00:08:50,110
for those of you who come from a JavaScript background.

156
00:08:50,810 --> 00:08:52,760
And basically this is how it looks.

157
00:08:52,760 --> 00:08:57,460
We have loss, accuracy, validation accuracy, validation loss, and the values for them here.

158
00:08:58,500 --> 00:09:01,380
These are the keys and these are the values.
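
The save-and-load steps described above might look roughly like this. It's a sketch under assumptions: the dictionary, its key names and values, and the file path are all made up for illustration; in Keras a dictionary of this kind is typically available as `history.history` after `model.fit`.

```python
import os
import pickle
import tempfile

# Hypothetical one-epoch history dictionary (illustrative values only).
history_dict = {
    "loss": [0.25],
    "accuracy": [0.92],
    "val_loss": [0.08],
    "val_accuracy": [0.97],
}

# Illustrative path inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "history.pickle")

# "wb" opens the file for writing in binary mode, which pickle requires.
with open(path, "wb") as f:
    pickle.dump(history_dict, f)

# "rb" reads the bytes back and rebuilds the same dictionary.
with open(path, "rb") as f:
    loaded = pickle.load(f)

print(loaded.keys())
```

Using `with` closes the file automatically, so there is no separate close step to forget.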

159
00:09:01,380 --> 00:09:08,370
So now we can get some plots, but these plots, obviously, for one epoch are pretty much not fun to look

160
00:09:08,370 --> 00:09:08,730
at.

161
00:09:08,760 --> 00:09:15,040
Just one point here, and in the accuracy chart, one point here and one point here, though you can actually see

162
00:09:15,040 --> 00:09:17,240
it's really quite good.

163
00:09:17,820 --> 00:09:19,050
This is what I wanted to show you.

164
00:09:19,230 --> 00:09:22,560
This here is the confusion matrix and classification report.

165
00:09:22,560 --> 00:09:25,400
So we import from sklearn.metrics.

166
00:09:25,560 --> 00:09:27,330
Both of these functions here.

167
00:09:27,600 --> 00:09:30,850
And basically what we do we just get our predictions here.

168
00:09:30,960 --> 00:09:38,010
So we run x_test, which is our test data or validation data, through our model's predict_classes, and basically

169
00:09:38,010 --> 00:09:43,870
we just print out the classification report. classification_report just takes two arguments:

170
00:09:43,950 --> 00:09:49,490
the test labels, the y_test labels here, and the predictions.

171
00:09:49,500 --> 00:09:54,640
So basically what we're doing here we're comparing labels to labels.

172
00:09:54,870 --> 00:10:00,570
And the reason we have to use the argmax function is because our labels before were one-hot encoded.

173
00:10:00,630 --> 00:10:06,420
So it's not like for like; we actually have to convert it back into basically a one-for-one type

174
00:10:06,420 --> 00:10:07,500
matching.
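
Here is a small, self-contained sketch of that argmax step followed by both scikit-learn calls. The one-hot labels and the predictions are invented for illustration; in the lecture the predictions come from the trained model, which this sketch replaces with a hard-coded array.

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# Hypothetical one-hot encoded true labels for 6 samples and 3 classes.
y_test = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
])

# Hypothetical predicted class indices, standing in for the model's output.
y_pred = np.array([0, 1, 2, 1, 1, 0])

# argmax turns each one-hot row back into a single class index,
# so true labels and predictions are compared like for like.
y_true = np.argmax(y_test, axis=1)

print(classification_report(y_true, y_pred))

cm = confusion_matrix(y_true, y_pred)
print(cm)
```

Both functions take the true labels first and the predictions second, so the rows of the matrix are the true classes.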

175
00:10:07,530 --> 00:10:11,010
So this is basically what our predictions give us.

176
00:10:11,130 --> 00:10:16,500
And basically this is what the classification report looks like, which we saw in our slides.

177
00:10:16,840 --> 00:10:17,890
The averages here,

178
00:10:17,910 --> 00:10:23,310
they don't really tell us that much here, but they may be far more interesting on other datasets, and of

179
00:10:23,300 --> 00:10:28,560
course those values would differ. And the confusion matrix is done here.

180
00:10:28,980 --> 00:10:33,720
Basically the same thing, the same exact arguments as above, and we get it here.

181
00:10:34,350 --> 00:10:38,940
So that's it for the confusion matrix and our classification report.