1
00:00:01,360 --> 00:00:06,950
OK, so let's now talk about thresholding. Thresholding is actually a very important OpenCV function that
2
00:00:06,950 --> 00:00:12,100
is quite useful, and you'll find we use it in many projects going forward.
3
00:00:12,120 --> 00:00:17,190
So what exactly is thresholding? Thresholding is converting an image to its binary form.
4
00:00:17,420 --> 00:00:19,520
But what does that mean exactly?
5
00:00:19,520 --> 00:00:25,880
So if you look at the function, the cv2.threshold function, this gives you an idea of what to expect.
6
00:00:25,880 --> 00:00:28,640
So imagine we have the input image here.
7
00:00:29,050 --> 00:00:35,350
A threshold value, which is very important, a max value, and a threshold type.
8
00:00:35,440 --> 00:00:38,980
So I'm just going to quickly illustrate what happens in this function here.
9
00:00:39,250 --> 00:00:41,540
So let's look at this grayscale image here.
10
00:00:41,750 --> 00:00:46,760
And also, please note all threshold functions use grayscale images, so images need to be converted
11
00:00:46,810 --> 00:00:47,870
to grayscale before
12
00:00:47,890 --> 00:00:49,200
thresholding here.
13
00:00:49,750 --> 00:00:51,180
So this is a grayscale image.
14
00:00:51,190 --> 00:00:55,410
It goes all the way from 0 to 255 at the bottom here.
15
00:00:55,810 --> 00:01:00,840
Let's assume this is the midway point, 127. That's our threshold value.
16
00:01:01,210 --> 00:01:07,160
So the max value here is the value we want to set everything above that threshold to.
17
00:01:07,390 --> 00:01:16,800
So in this example, if this line here is 127, everything above it, going up to 255, becomes 255.
18
00:01:16,870 --> 00:01:22,850
So this portion here becomes white, and the other portion here becomes black, which is zero.
19
00:01:23,380 --> 00:01:28,450
So there are different forms of thresholds, which we'll see in the code, but that just illustrates exactly what's
20
00:01:28,450 --> 00:01:29,410
happening here.
21
00:01:29,770 --> 00:01:35,140
There's also a type of thresholding called adaptive thresholding, which we'll come to in the code as well.
22
00:01:35,530 --> 00:01:40,780
And you may have noticed I used the word binarization here. Binarization, or thresholding, referring to
23
00:01:40,780 --> 00:01:48,720
the same thing, means converting a full grayscale image into just two points, a max point and a low
24
00:01:48,740 --> 00:01:53,320
point. Traditionally it's zero for the low point and 255 for the max point.
25
00:01:53,320 --> 00:01:55,480
So white or black.
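As a rough illustration of the rule just described, here is a minimal pure-Python sketch. The helper name `binarize` and the sample values are made up for illustration; the rule itself (strictly above the threshold goes to the max value, everything else to zero) is what cv2.threshold applies with cv2.THRESH_BINARY.

```python
# Hypothetical helper (not part of OpenCV): applies the binary threshold rule
# to a flat list of grayscale values in the range 0..255.
def binarize(pixels, thresh=127, max_value=255):
    """Pixels strictly above thresh become max_value; all others become 0."""
    return [max_value if p > thresh else 0 for p in pixels]


# A tiny grayscale ramp going from black to white.
ramp = [0, 64, 127, 128, 200, 255]
binary = binarize(ramp)
print(binary)
```

With OpenCV installed, the equivalent call on a grayscale image would be cv2.threshold(image, 127, 255, cv2.THRESH_BINARY), which returns the threshold used and the output image.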
26
00:01:55,570 --> 00:01:55,860
OK.
27
00:01:55,870 --> 00:02:00,400
So let's take a look at implementing some of those threshold methods we just discussed.
28
00:02:00,400 --> 00:02:02,970
So as you can see it has five different thresholding methods.
29
00:02:02,970 --> 00:02:11,490
We're going to use here: binary, binary inverse, truncation, thresh to zero, and thresh to zero inverse.
30
00:02:11,500 --> 00:02:14,940
So let's run this code and we can actually figure out exactly what each is doing.
31
00:02:17,260 --> 00:02:22,900
And just to note that I actually arranged the windows specifically here. OpenCV isn't so nice
32
00:02:22,900 --> 00:02:24,910
as to automatically order those windows.
33
00:02:24,990 --> 00:02:28,430
There are some cv2 move-window functions that allow you to do that.
34
00:02:28,450 --> 00:02:30,560
But I'm not going to discuss that right now.
35
00:02:30,910 --> 00:02:32,470
So anyway back to thresholding.
36
00:02:32,560 --> 00:02:34,520
So this is our original image here.
37
00:02:35,140 --> 00:02:40,500
And remember from the slides, I said we set 127 to be the threshold value.
38
00:02:40,840 --> 00:02:46,230
So in THRESH_BINARY, everything that's below 127, which is the value we used in our code,
39
00:02:46,480 --> 00:02:51,480
hopefully you remember that, goes to black, and everything above that goes to white.
40
00:02:51,520 --> 00:02:57,160
So THRESH_BINARY_INV actually, unsurprisingly, does the
41
00:02:57,200 --> 00:02:59,310
opposite, the inverse of that.
42
00:02:59,380 --> 00:03:06,670
So everything darker than 127 goes to white and everything higher than 127 goes to black.
43
00:03:06,760 --> 00:03:10,650
Now THRESH_TRUNC does something a little bit different.
44
00:03:10,720 --> 00:03:16,360
What it does is that it takes everything that's actually brighter, or higher than 127 going this way,
45
00:03:17,080 --> 00:03:24,890
and caps it at 127, which is why everything just stops at this gray here and continues along the screen.
46
00:03:25,240 --> 00:03:30,520
So what about THRESH_TOZERO? THRESH_TOZERO actually does something of the opposite here.
47
00:03:30,880 --> 00:03:37,620
Everything that's below 127 gets set to black, similarly to binary here.
48
00:03:37,690 --> 00:03:40,770
However, it leaves everything above 127 alone.
49
00:03:41,260 --> 00:03:46,490
So it's a nice effect that can be useful sometimes. And THRESH_TOZERO_INV
50
00:03:46,550 --> 00:03:49,220
actually does the opposite.
51
00:03:49,690 --> 00:03:53,950
So as you can see it maintains this color here.
52
00:03:53,950 --> 00:04:01,330
This is the initial 0 to 127 part, but everything above 127, just like in THRESH_BINARY_INV, goes to
53
00:04:01,330 --> 00:04:03,430
black.
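To make the five behaviours concrete, here is a small pure-Python sketch of the per-pixel rules. The function name `apply_threshold` and the string keys are assumptions made for illustration; in OpenCV these correspond to the cv2.THRESH_BINARY, cv2.THRESH_BINARY_INV, cv2.THRESH_TRUNC, cv2.THRESH_TOZERO and cv2.THRESH_TOZERO_INV flags passed to cv2.threshold.

```python
# Hypothetical helper (not part of OpenCV): one rule per threshold type,
# applied to a flat list of grayscale values.
def apply_threshold(pixels, kind, thresh=127, max_value=255):
    rules = {
        # above the threshold -> max_value, otherwise 0
        "binary": lambda p: max_value if p > thresh else 0,
        # the inverse: above the threshold -> 0, otherwise max_value
        "binary_inv": lambda p: 0 if p > thresh else max_value,
        # above the threshold is capped at the threshold, the rest is kept
        "trunc": lambda p: thresh if p > thresh else p,
        # above the threshold is kept, the rest goes to 0
        "tozero": lambda p: p if p > thresh else 0,
        # above the threshold goes to 0, the rest is kept
        "tozero_inv": lambda p: 0 if p > thresh else p,
    }
    rule = rules[kind]
    return [rule(p) for p in pixels]


ramp = [0, 100, 127, 128, 255]
for kind in ("binary", "binary_inv", "trunc", "tozero", "tozero_inv"):
    print(kind, apply_threshold(ramp, kind))
```

Running this on a small ramp shows the same pattern as the windows in the demo: binary and binary inverse produce only two values, while truncation and the two to-zero variants keep part of the original gradient.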
54
00:04:03,530 --> 00:04:05,280
So let's close these windows now.
55
00:04:05,560 --> 00:04:11,730
And we can actually just quickly look at the code. You remember the input image is first here.
56
00:04:12,080 --> 00:04:16,750
57
127 is the threshold value, 255 is the max value set.
58
00:04:16,760 --> 00:04:20,450
That's the value it goes to when it's above that threshold, or vice versa.
59
00:04:20,900 --> 00:04:25,320
Zero is actually used in the opposite case for most of these threshold algorithms.
60
00:04:25,730 --> 00:04:29,650
And this here, THRESH_BINARY, this is where we define the threshold type
61
00:04:29,650 --> 00:04:30,930
we want to use.
62
00:04:30,950 --> 00:04:34,760
So it's a simple function, cv2.threshold.
63
00:04:34,880 --> 00:04:36,400
Hope you have fun using it.
64
00:04:37,870 --> 00:04:43,750
So you may have found that one weakness of these threshold algorithms is that they all require us to set
65
00:04:43,750 --> 00:04:47,710
the value at 127 or whatever value we wish to use.
66
00:04:47,710 --> 00:04:54,370
Now that's not convenient when you're actually working with, let's say, scanned documents or anything of
67
00:04:54,370 --> 00:04:55,130
the sort.
68
00:04:55,420 --> 00:04:57,430
So is there a better way of doing this?
69
00:04:57,430 --> 00:04:58,320
And yes, there is.
70
00:04:58,330 --> 00:05:02,300
It's called adaptive thresholding, and I have a slide in the presentation.
71
00:05:02,310 --> 00:05:03,890
We'll discuss these methods here.
72
00:05:04,210 --> 00:05:06,840
But there are quite a few adaptive thresholding methods
73
00:05:06,850 --> 00:05:16,780
used in OpenCV: adaptive mean threshold, Otsu's method, which is quite popular, and Gaussian Otsu's
74
00:05:16,780 --> 00:05:17,490
method.
75
00:05:17,860 --> 00:05:21,400
So let's run this code and we can actually see what's going on here.
76
00:05:22,210 --> 00:05:24,530
So this is the original image we're looking at.
77
00:05:24,530 --> 00:05:26,880
It's Charles Darwin's Origin of Species.
78
00:05:27,310 --> 00:05:29,790
So that's the original image.
79
00:05:29,830 --> 00:05:36,020
This here is the regular THRESH_BINARY threshold that we previously used. As you can see, it's good.
80
00:05:36,100 --> 00:05:42,640
It's not bad, but if you wanted to get this text back, we would have had to use maybe a lower or higher value
81
00:05:42,640 --> 00:05:45,690
than 127.
82
00:05:45,720 --> 00:05:50,880
So this is it here with adaptive thresholding. As you can see, you can compare directly to this image and
83
00:05:50,880 --> 00:05:53,140
you can see the text is a lot more clear.
84
00:05:53,460 --> 00:05:57,490
So without setting a threshold value of 127 or anything of the like,
85
00:05:57,600 --> 00:05:59,520
it actually does a better job already.
86
00:05:59,820 --> 00:06:02,300
So can Otsu's method do a better job?
87
00:06:02,850 --> 00:06:03,480
Perhaps it can.
88
00:06:03,480 --> 00:06:05,810
I mean there's not much difference separating these images.
89
00:06:06,060 --> 00:06:09,670
But in my experience Otsu's method actually is quite handy.
90
00:06:09,690 --> 00:06:11,740
What about Otsu's Gaussian method?
91
00:06:12,840 --> 00:06:17,280
Again, this actually looks very similar. It's hard to tell the differences between these images here.
92
00:06:17,490 --> 00:06:27,370
However, in your own image data set you may want to actually play with and try these different algorithms.
93
00:06:27,430 --> 00:06:31,170
So let's learn a bit more about the adaptive thresholding methods that we just saw.
94
00:06:31,450 --> 00:06:33,750
So the first one we saw was adaptive threshold.
95
00:06:34,120 --> 00:06:35,010
And as you can.
96
00:06:35,050 --> 00:06:39,520
As you remember, adaptive thresholding has a big advantage in that it reduces the uncertainty in setting
97
00:06:39,520 --> 00:06:40,630
a threshold value.
98
00:06:40,840 --> 00:06:46,600
So the input, there's no threshold value in these parameters here. There's a max value, which is 255, the adaptive
99
00:06:46,600 --> 00:06:50,700
type, the threshold type, the block size, which needs to be an odd number,
100
00:06:50,710 --> 00:06:57,390
one of those weird OpenCV quirks, and a constant, another OpenCV quirk, that needs to be set
101
00:06:57,400 --> 00:07:02,230
to, well, traditionally five. You can experiment with different values, and if you get better results,
102
00:07:02,410 --> 00:07:02,940
that's good.
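To show what the block size and the constant do, here is a minimal pure-Python sketch of mean-based adaptive thresholding: each pixel is compared against the mean of its own neighbourhood minus the constant. The function name `adaptive_mean_threshold`, the tiny sample image and the clamped border handling are assumptions made for illustration, not OpenCV's exact implementation.

```python
# Hypothetical helper (not part of OpenCV): image is a list of rows of
# grayscale values. Each pixel gets its own threshold: the mean of the
# block_size x block_size neighbourhood around it, minus the constant c.
def adaptive_mean_threshold(image, block_size=3, c=5, max_value=255):
    if block_size < 3 or block_size % 2 == 0:
        raise ValueError("block_size must be an odd number, 3 or larger")
    height, width = len(image), len(image[0])
    half = block_size // 2
    output = []
    for y in range(height):
        row = []
        for x in range(width):
            total = 0
            for dy in range(-half, half + 1):
                for dx in range(-half, half + 1):
                    # Assumption: out-of-range neighbours reuse the nearest edge pixel.
                    ny = min(max(y + dy, 0), height - 1)
                    nx = min(max(x + dx, 0), width - 1)
                    total += image[ny][nx]
            local_thresh = total / (block_size * block_size) - c
            row.append(max_value if image[y][x] > local_thresh else 0)
        output.append(row)
    return output


# A dim patch: every value is below 127, so a fixed threshold of 127
# would turn the whole patch black and lose the dark centre pixel.
dim_patch = [
    [60, 60, 60],
    [60, 20, 60],
    [60, 60, 60],
]
print(adaptive_mean_threshold(dim_patch))
```

With OpenCV installed, the corresponding call would be cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, block_size, c) on a grayscale image.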
103
00:07:03,900 --> 00:07:10,820
So let's quickly go to the code.
104
00:07:10,840 --> 00:07:11,950
So these are the values here.
105
00:07:11,950 --> 00:07:16,360
So this is the adaptive type, and I believe this is the only type available for this function right now,
106
00:07:17,140 --> 00:07:20,710
and this is the thresholding type we use here as well.
107
00:07:20,710 --> 00:07:25,800
And these are the block size and the constant here, and that gave us the results we saw earlier.
108
00:07:25,930 --> 00:07:27,480
So let's go back to the presentation now.
109
00:07:30,720 --> 00:07:35,490
And I actually just misspoke when I just said ADAPTIVE_THRESH_MEAN_C was the only adaptive threshold
110
00:07:35,490 --> 00:07:42,690
type we could use in this function. There's actually ADAPTIVE_THRESH_GAUSSIAN_C, which, as you may remember,
111
00:07:42,690 --> 00:07:46,930
is a weighted sum of neighborhood pixels under that Gaussian window.
112
00:07:46,940 --> 00:07:51,860
So let's discuss Otsu, which is actually one of the best adaptive thresholding methods.
113
00:07:51,890 --> 00:07:54,510
So what Otsu does is that it tries to find,
114
00:07:54,590 --> 00:07:59,360
well, it looks at the histogram of the range of intensities in the image here, and it tries to find the
115
00:07:59,360 --> 00:08:05,930
peaks, or the modes, of the histogram here, and then it actually finds a value that separates these
116
00:08:05,930 --> 00:08:06,770
histograms.
117
00:08:06,770 --> 00:08:14,100
The value that best separates these two histograms actually gives you the best thresholding value here.
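The histogram search just described can be sketched in pure Python as follows. This is a simplified version of the usual formulation of Otsu's method, which picks the threshold that maximises the between-class variance of the two groups; the function name `otsu_threshold` and the sample data are made up for illustration.

```python
# Hypothetical helper (not part of OpenCV): pixels is a flat list of
# integer grayscale values in the range 0..255.
def otsu_threshold(pixels):
    # Build the 256-bin intensity histogram.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_thresh, best_variance = 0, -1.0
    weight_low = 0  # number of pixels at or below the candidate threshold
    sum_low = 0     # sum of their intensities
    for t in range(256):
        weight_low += hist[t]
        sum_low += t * hist[t]
        if weight_low == 0:
            continue  # no pixels in the lower group yet
        weight_high = total - weight_low
        if weight_high == 0:
            break     # every pixel is already in the lower group
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        # Between-class variance (up to a constant factor).
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_variance, best_thresh = variance, t
    return best_thresh


# Two clear peaks: dark pixels around 10 and bright pixels around 200.
samples = [10] * 50 + [200] * 50
t = otsu_threshold(samples)
print(t, [255 if p > t else 0 for p in samples[:1] + samples[-1:]])
```

With OpenCV installed, the same idea is available through cv2.threshold(image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU), where the threshold argument is ignored and the chosen value is returned alongside the output image.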
118
00:08:14,750 --> 00:08:19,490
So in most cases where you have changes in light intensity I would encourage you to use one of these
119
00:08:19,490 --> 00:08:24,690
adaptive thresholding algorithms. Otsu's method is actually quite good.
120
00:08:24,710 --> 00:08:30,470
However, if your application can use a simple thresholding algorithm that we saw before, those may be faster,
121
00:08:30,470 --> 00:08:32,790
so maybe you should just go for one of those.
122
00:08:32,810 --> 00:08:35,300
Again it depends on the use case and you'll see going forward.
123
00:08:35,300 --> 00:08:39,270
We actually do use different methods in many projects.