1
00:00:01,360 --> 00:00:06,950
OK, so let's now talk about thresholding. Thresholding is actually a very important OpenCV function that

2
00:00:06,950 --> 00:00:12,100
is quite useful, and you'll find we use it in many projects going forward.

3
00:00:12,120 --> 00:00:17,190
So what exactly is thresholding? Thresholding is converting an image to its binary form.

4
00:00:17,420 --> 00:00:19,520
But what does that mean exactly?

5
00:00:19,520 --> 00:00:25,880
So if you look at the function, the cv2.threshold function, this gives you an idea of what to expect.

6
00:00:25,880 --> 00:00:28,640
So imagine we have the input image here.

7
00:00:29,050 --> 00:00:35,350
a threshold value, which is very important, a max value, and a threshold type.

8
00:00:35,440 --> 00:00:38,980
So I'm just going to quickly illustrate what happens in this function here.

9
00:00:39,250 --> 00:00:41,540
So let's look at this grayscale image here.

10
00:00:41,750 --> 00:00:46,760
And also, please note all threshold functions use grayscale images, so images need to be converted

11
00:00:46,810 --> 00:00:47,870
into grayscale before

12
00:00:47,890 --> 00:00:49,200
thresholding here.

13
00:00:49,750 --> 00:00:51,180
So this is a grayscale image.

14
00:00:51,190 --> 00:00:55,410
It goes all the way from 0 to 255 at the bottom here.

15
00:00:55,810 --> 00:01:00,840
Let's assume this is our midway point, 127. That's our threshold value.

16
00:01:01,210 --> 00:01:07,160
And the max value here is the value we want to set everything above that threshold to.

17
00:01:07,390 --> 00:01:16,800
So in this example, if this line here is 127, everything above it, going up to 255, becomes 255.

18
00:01:16,870 --> 00:01:22,850
So this portion here becomes white, and this other portion here becomes black, which is zero.
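
To make the rule from the slide concrete, here is a minimal pure-Python sketch of a binary threshold on a single row of pixel values. The name threshold_binary is made up for illustration; the lecture's own code calls cv2.threshold, and this sketch only imitates the rule it applies (values strictly above the threshold go to the max value, everything else goes to zero).

```python
def threshold_binary(pixels, thresh=127, max_value=255):
    """Set every pixel strictly above thresh to max_value, all others to 0."""
    return [max_value if p > thresh else 0 for p in pixels]


# One row of a grayscale gradient, from black (0) to white (255).
row = [0, 64, 127, 128, 200, 255]
binary_row = threshold_binary(row)
print(binary_row)  # [0, 0, 0, 255, 255, 255]
```

Note that 127 itself stays black here, because only values above the threshold are switched to the max value.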

19
00:01:23,380 --> 00:01:28,450
So there are different forms of thresholding, which we'll see in the code, but that just illustrates exactly what's

20
00:01:28,450 --> 00:01:29,410
happening here.

21
00:01:29,770 --> 00:01:35,140
There's also a type of thresholding called adaptive thresholding, which we'll come to in the code as well.

22
00:01:35,530 --> 00:01:40,780
And you may have noticed I used the word binarization here. Binarization, or thresholding, refers to

23
00:01:40,780 --> 00:01:48,720
the same thing: it means converting a full grayscale image into just two points, a max point and a low

24
00:01:48,740 --> 00:01:53,320
point. Traditionally it's zero for the low point and 255 for the max point.

25
00:01:53,320 --> 00:01:55,480
So white or black.

26
00:01:55,570 --> 00:01:55,860
OK.

27
00:01:55,870 --> 00:02:00,400
So let's take a look at implementing some of those threshold methods we just discussed.

28
00:02:00,400 --> 00:02:02,970
So as you can see, there are five different thresholding methods

29
00:02:02,970 --> 00:02:11,490
we're going to use here: binary, binary inverse, truncation, threshold to zero, and threshold to zero inverse.

30
00:02:11,500 --> 00:02:14,940
So let's run this code, and we can actually figure out exactly what each is doing.

31
00:02:17,260 --> 00:02:22,900
Oh, and just to note that I actually set up the windows specifically here. OpenCV isn't so nice

32
00:02:22,900 --> 00:02:24,910
as to automatically order those windows.

33
00:02:24,990 --> 00:02:28,430
There are some cv2 move-window functions that allow you to do that.

34
00:02:28,450 --> 00:02:30,560
But I'm not going to discuss that right now.

35
00:02:30,910 --> 00:02:32,470
So anyway back to thresholding.

36
00:02:32,560 --> 00:02:34,520
So this is our original image here.

37
00:02:35,140 --> 00:02:40,500
And remember from the slides, I said we set 127 to be the threshold value.

38
00:02:40,840 --> 00:02:46,230
So in THRESH_BINARY, everything that's below 127, which is the value we used in our code,

39
00:02:46,480 --> 00:02:51,480
hopefully you remember that, goes to black, and everything above that goes to white.

40
00:02:51,520 --> 00:02:57,160
So THRESH_BINARY_INV actually, and understandably, does the

41
00:02:57,200 --> 00:02:59,310
opposite, the inverse of that.

42
00:02:59,380 --> 00:03:06,670
So everything darker than 127 goes to white, and everything higher than 127 goes to black.

43
00:03:06,760 --> 00:03:10,650
Now THRESH_TRUNC does something a little bit different.

44
00:03:10,720 --> 00:03:16,360
What it does is that it takes everything that's actually brighter, or higher than 127, going this way,

45
00:03:17,080 --> 00:03:24,890
and caps it at 127, which is why everything just stops at this gray here and continues along the screen.

46
00:03:25,240 --> 00:03:30,520
So what about THRESH_TOZERO? THRESH_TOZERO actually does something of the opposite here.

47
00:03:30,880 --> 00:03:37,620
Everything that's below 127 gets set to black, similarly to binary here.

48
00:03:37,690 --> 00:03:40,770
However, it leaves everything above 127 alone.

49
00:03:41,260 --> 00:03:46,490
So it's a nice effect that can be useful sometimes. And THRESH_TOZERO_INV

50
00:03:46,550 --> 00:03:49,220
actually does the opposite.

51
00:03:49,690 --> 00:03:53,950
So as you can see it maintains this color here.

52
00:03:53,950 --> 00:04:01,330
This is the initial 0 to 127 part, but everything above 127, just like in THRESH_BINARY_INV, goes to

53
00:04:01,330 --> 00:04:03,430
black.
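
The five threshold types just walked through can be compared side by side with a small pure-Python sketch. The function apply_threshold and its string keys are hypothetical names for illustration only; in OpenCV the types are selected with the constants cv2.THRESH_BINARY, cv2.THRESH_BINARY_INV, cv2.THRESH_TRUNC, cv2.THRESH_TOZERO and cv2.THRESH_TOZERO_INV passed to cv2.threshold.

```python
def apply_threshold(pixels, thresh, max_value, kind):
    """Apply one of the five basic threshold rules to a list of pixel values."""
    rules = {
        "binary": lambda p: max_value if p > thresh else 0,      # above -> white, else black
        "binary_inv": lambda p: 0 if p > thresh else max_value,  # above -> black, else white
        "trunc": lambda p: thresh if p > thresh else p,          # cap bright values at thresh
        "tozero": lambda p: p if p > thresh else 0,              # keep bright values, zero the rest
        "tozero_inv": lambda p: 0 if p > thresh else p,          # zero bright values, keep the rest
    }
    rule = rules[kind]
    return [rule(p) for p in pixels]


# A tiny stand-in for the gradient image shown in the lecture.
gradient = [0, 100, 127, 128, 255]
kinds = ["binary", "binary_inv", "trunc", "tozero", "tozero_inv"]
results = {kind: apply_threshold(gradient, 127, 255, kind) for kind in kinds}

for kind in kinds:
    print(f"{kind:>10}: {results[kind]}")
```

Only the two binary types use the max value; truncation and the two to-zero types work from the threshold and the original pixel values.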

54
00:04:03,530 --> 00:04:05,280
So let's close these windows now.

55
00:04:05,560 --> 00:04:11,730
And we can actually just quickly look at the code. You remember the input image is first here.

56
00:04:12,080 --> 00:04:16,750
127 is the threshold value, and 255 is the max value set.

58
00:04:16,760 --> 00:04:20,450
That's the value it goes to when it's above that threshold, or vice versa.

59
00:04:20,900 --> 00:04:25,320
Zero is actually used in the opposite case for most of these threshold algorithms.

60
00:04:25,730 --> 00:04:29,650
And this here, THRESH_BINARY, is where we define the threshold type

61
00:04:29,650 --> 00:04:30,930
we want to use.

62
00:04:30,950 --> 00:04:34,760
So it's a simple function, cv2.threshold.

63
00:04:34,880 --> 00:04:36,400
Hope you have fun using it.

64
00:04:37,870 --> 00:04:43,750
So you may have found that one weakness of these threshold algorithms is that they all require us to set

65
00:04:43,750 --> 00:04:47,710
the value at 127 or whatever value we wish to use.

66
00:04:47,710 --> 00:04:54,370
Now that's not convenient when you're actually doing it with, let's say, scanned documents or anything of

67
00:04:54,370 --> 00:04:55,130
the sort.

68
00:04:55,420 --> 00:04:57,430
So is there a better way of doing this?

69
00:04:57,430 --> 00:04:58,320
And yes, there is.

70
00:04:58,330 --> 00:05:02,300
It's called adaptive thresholding, and I have a slide in the presentation.

71
00:05:02,310 --> 00:05:03,890
We'll discuss these methods here.

72
00:05:04,210 --> 00:05:06,840
But there are quite a few adaptive thresholding methods

73
00:05:06,850 --> 00:05:16,780
we can use in OpenCV: adaptive mean thresholding, Otsu's method, which is quite popular, and Gaussian Otsu's

74
00:05:16,780 --> 00:05:17,490
method.

75
00:05:17,860 --> 00:05:21,400
So let's run this code and we can actually see what's going on here.

76
00:05:22,210 --> 00:05:24,530
So this is the original image we're looking at.

77
00:05:24,530 --> 00:05:26,880
It's Charles Darwin's Origin of Species.

78
00:05:27,310 --> 00:05:29,790
So that's the original image.

79
00:05:29,830 --> 00:05:36,020
This here is the regular THRESH_BINARY threshold that we previously used. As you can see, it's good.

80
00:05:36,100 --> 00:05:42,640
It's not bad, but if we wanted to get this text back, we would have had to use maybe a lower or higher value

81
00:05:42,640 --> 00:05:45,690
than 127.

82
00:05:45,720 --> 00:05:50,880
So this is it here with adaptive mean thresholding. As you can see, you can compare directly to this image and

83
00:05:50,880 --> 00:05:53,140
you can see the text is a lot more clear.

84
00:05:53,460 --> 00:05:57,490
So without setting a threshold value of 127 or anything of the like,

85
00:05:57,600 --> 00:05:59,520
it actually does a better job already.

86
00:05:59,820 --> 00:06:02,300
So can Otsu's method do a better job?

87
00:06:02,850 --> 00:06:03,480
Perhaps it can.

88
00:06:03,480 --> 00:06:05,810
I mean there's not much difference separating these images.

89
00:06:06,060 --> 00:06:09,670
But in my experience, Otsu's method actually is quite handy.

90
00:06:09,690 --> 00:06:11,740
What about the Gaussian Otsu's method?

91
00:06:12,840 --> 00:06:17,280
Again, this actually looks very similar. It's hard to tell the differences between these images here.

92
00:06:17,490 --> 00:06:27,370
However, in your own image data set you may want to actually play with and try these different algorithms.

93
00:06:27,430 --> 00:06:31,170
So let's learn a bit more about the adaptive thresholding methods that we just saw.

94
00:06:31,450 --> 00:06:33,750
So the first one we saw was adaptive threshold.

95
00:06:34,120 --> 00:06:35,010
And as you can.

96
00:06:35,050 --> 00:06:39,520
As you remember, adaptive thresholding has a big advantage in that it reduces the uncertainty in setting

97
00:06:39,520 --> 00:06:40,630
a threshold value.

98
00:06:40,840 --> 00:06:46,600
So the input doesn't need a threshold value. The parameters here are a max value, which is 255, the adaptive

99
00:06:46,600 --> 00:06:50,700
type, the threshold type, the block size, which needs to be an odd number,

100
00:06:50,710 --> 00:06:57,390
one of those weird OpenCV quirks, and a constant, another OpenCV quirk, that needs to be set,

101
00:06:57,400 --> 00:07:02,230
well, traditionally to five. You can experiment with different values, and if you get better results,

102
00:07:02,410 --> 00:07:02,940
that's good.
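
As a rough illustration of how the block size and the constant work together, here is a pure-Python sketch of adaptive mean thresholding. It is not OpenCV's implementation: adaptive_mean_threshold is a made-up name, and repeating edge pixels at the image border is an assumption of this sketch. The idea it mirrors is the one behind cv2.adaptiveThreshold with the mean method: each pixel is compared against the mean of its own block_size by block_size neighborhood, minus the constant.

```python
def adaptive_mean_threshold(image, max_value=255, block_size=3, c=5):
    """Threshold each pixel against (mean of its local block) - c."""
    if block_size < 3 or block_size % 2 == 0:
        raise ValueError("block_size must be an odd number, 3 or larger")
    height, width = len(image), len(image[0])
    half = block_size // 2
    output = []
    for y in range(height):
        out_row = []
        for x in range(width):
            total = 0
            for dy in range(-half, half + 1):
                for dx in range(-half, half + 1):
                    # Clamp coordinates so border pixels reuse the nearest edge value.
                    ny = min(max(y + dy, 0), height - 1)
                    nx = min(max(x + dx, 0), width - 1)
                    total += image[ny][nx]
            local_threshold = total / (block_size * block_size) - c
            out_row.append(max_value if image[y][x] > local_threshold else 0)
        output.append(out_row)
    return output


# A toy "page": dim lighting on the left (paper 60, ink 20),
# bright lighting on the right (paper 200, ink 150).
page = [
    [60, 60, 60, 200, 200, 200],
    [60, 20, 60, 200, 150, 200],
    [60, 60, 60, 200, 200, 200],
]
adaptive = adaptive_mean_threshold(page, block_size=3, c=5)

# A single global threshold of 127 on the middle row, for comparison.
global_row = [255 if p > 127 else 0 for p in page[1]]
print(global_row)      # [0, 0, 0, 255, 255, 255]
print(adaptive[1][1])  # 0, the dim ink pixel is still detected as black
print(adaptive[1][4])  # 0, the bright ink pixel is detected as black too
```

With the global threshold the whole dim half turns black and the ink mark on the bright half disappears into white, while the local comparison picks out both ink pixels. Pixels right at the lighting boundary can still come out wrong with such a small block, which is one reason to experiment with the block size and the constant.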

103
00:07:03,900 --> 00:07:10,820
So let's quickly go into the code.

104
00:07:10,840 --> 00:07:11,950
So these are the values here.

105
00:07:11,950 --> 00:07:16,360
So this is the adaptive type, and I believe this is the only type available for this function right now,

106
00:07:17,140 --> 00:07:20,710
and this is the thresholding type we use here as well.

107
00:07:20,710 --> 00:07:25,800
And these are the block size and the constant here, and that gives us the results we saw earlier.

108
00:07:25,930 --> 00:07:27,480
So let's go back to the presentation now.

109
00:07:30,720 --> 00:07:35,490
And I actually just misspoke when I said ADAPTIVE_THRESH_MEAN_C was the only adaptive threshold

110
00:07:35,490 --> 00:07:42,690
type we could use in this function. There is actually also ADAPTIVE_THRESH_GAUSSIAN_C, which, as you may remember,

111
00:07:42,690 --> 00:07:46,930
is a weighted sum of neighborhood pixels under that Gaussian window.

112
00:07:46,940 --> 00:07:51,860
So let's discuss Otsu, which is actually one of the best adaptive thresholding methods.

113
00:07:51,890 --> 00:07:54,510
So what Otsu's method does is that it tries to find,

114
00:07:54,590 --> 00:07:59,360
Well it looks at the histogram of the range of intensities in the image here and it tries to find the

115
00:07:59,360 --> 00:08:05,930
peaks, or the modes, of the histogram here, and then it actually finds a value that separates these

116
00:08:05,930 --> 00:08:06,770
histograms.

117
00:08:06,770 --> 00:08:14,100
The value that best separates these two histograms is what gives you the best thresholding value here.
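
The histogram search described here can be sketched in pure Python as follows. This is an illustrative version, not OpenCV's code, and otsu_threshold is a made-up name. It tries every candidate value and keeps the one that maximizes the between-class variance of the two groups it creates, which is the usual way Otsu's criterion is stated. Note that the result is a single threshold for the whole image, chosen automatically from the histogram.

```python
def otsu_threshold(pixels):
    """Return the threshold (0-255) that best separates the histogram into two classes."""
    histogram = [0] * 256
    for p in pixels:
        histogram[p] += 1

    total = len(pixels)
    sum_all = sum(value * count for value, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t in range(256):
        # Background class: all pixels with value <= t.
        weight_bg += histogram[t]
        sum_bg += t * histogram[t]
        if weight_bg == 0:
            continue
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance (up to a constant factor of 1 / total**2).
        between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between > best_variance:
            best_variance = between
            best_threshold = t
    return best_threshold


# A bimodal set of intensities: a dark cluster (ink) and a bright cluster (paper).
pixels = [20] * 50 + [30] * 50 + [200] * 40 + [220] * 60
t = otsu_threshold(pixels)
binary = [255 if p > t else 0 for p in pixels]
print(t)  # a value between the dark and bright clusters
```

In OpenCV the same idea is requested by adding the cv2.THRESH_OTSU flag to the threshold type in cv2.threshold, in which case the threshold value passed in is ignored and the computed one is returned.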

118
00:08:14,750 --> 00:08:19,490
So in most cases where you have changes in light intensity I would encourage you to use one of these

119
00:08:19,490 --> 00:08:24,690
adaptive thresholding algorithms. Otsu's is actually quite good.

120
00:08:24,710 --> 00:08:30,470
However, if your application can use a simple thresholding algorithm that we saw before, those may be faster,

121
00:08:30,470 --> 00:08:32,790
so maybe you should just go for one of those.

122
00:08:32,810 --> 00:08:35,300
Again it depends on the use case and you'll see going forward.

123
00:08:35,300 --> 00:08:39,270
We actually do use different methods in many projects.