Index
For this problem set, we were to write a MATLAB program that used binary image processing methods to find instances of lower-case vowels in three supplied images. I've done this for each of the five standard vowels ("a", "e", "i", "o", and "u"), for each of the three images, for a grand total of fifteen result images.In each of the result images, I've shown the found vowels I found by using the color red.
The first thing I did was to create a bunch of sample images to use as "templates". I took sample image no. 2 (since it was the largest, and thus had the most letters to choose from), and turned it into a grayscale. After looking at its histogram, and trying various thresholds, I decided to make it into a binary picture by using a threshold of 0.8. With that binary image, I made five samples for each vowel, for a grand total of twenty five. I've included those samples at the bottom of the page.
As part of this, I wrote routines to:
Since I had five samples per vowel, the first run at the logic I did was this. For simplicity, I'm skipping all the image inversions from white-black to black-white, and vice versa, to get the MATLAB routines to work right:
The flaws in the approach showed up pretty quickly. By thinning the samples, I guaranteed a match inside of a genuine match, but it also matched even when the sample just happened to fit inside an arbitrary letter. For example, I got lots of matches of "i" inside of an "l", an "h", etc. All it had to do was fit, which didn't take the background into account at all.
I knew that I had to take the background pixels into account. The next thing that I tried to do was write my own eroder, using this logic for the samples, which I were using as Structuring Elements ("SE"s):
This gave better results, but it was SLOW, as in five minutes for each individual erosion, which obviously wasn't acceptable. Also, I noticed that the background logic still didn't work very well. If I was sloppy with how the sample was done (such as an "e" with lots of space around it), the number of background pixels was too high. This convinced me to write two more routines:
I had established two major things: One, I needed to match all of the foreground pixels using a normal erosion, otherwise the processing would take forever. Second, the background matching should only use the "halo" around a sample, to show what the natural border of the SE was. In other words, only the pixels immediately surrounding a sample, since every pixel further than that is just extraneous information. So, I made another attempt at the eroding logic:
I got the halo of an SE by using the following logic: Wherever there is a black pixel on the SE that has one or two white neighbors, then this is should be considered a point on the halo. Although this approach isn't ideal, the practical results work pretty well. A look at the halo image gives you a good idea of the SE that it originally contained.
The halo approach worked much better than the first two runs did. For example, even if I rotated an "i" by 45 degrees, I didn't have to worry about the large square's worth of background that would result from the rotation of such a thin SE. Only the edges of the "i" would matter.
To improve on the design I have now, I would add "priority" levels to the pixels in the SE. Therefore, not only could you take every pixel on the SE and label it "black", "white" or "doesn't matter", but you could also give each pixel a floating point rating from zero to one. That way, you could define the letter I so that the gap between the body of the I and the dot absolutely had to be there. For example, one possible way to accomplish this would be to make a custom eroder with five pixel types:
The Results.
The Samples That I Used For The Eroding.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Original: |
|
With "A": |
|
With "E": |
|
With "I": |
|
With "O": |
|
With "U": |
|
Original: |
|
With "A": |
|
With "E": |
|
With "I": |
|
With "O": |
|
With "U": |
|
Original: |
|
With "A": |
|
With "E": |
|
With "I": |
|
With "O": |
|
With "U": |
|
The MATLAB Code.
This is a list of links to the actual MATLAB source code that I used to
get the final image files.
model.m | This is the main script, mostly just a driver for calling model_create.m. |
model_create.m | This is the script used to generate each individual image. |
model_eroder.m | This is the custom eroder function. Given a binary source image, and a structuring element, it returns a resulting binary image, using the logic described in Third Attempt. |
model_halo.m | Given a small SE, this returns the "halo" of the SE in white. All the white pixels equal pixels just outside of the shape made by the original SE. |
model_match.m | This is the main engine for the program. It does the tallying of the runs of the five samples, including for each rotation if needed. |
model_readgray.m | A no brainer. This reads in a standard indexed TIFF file, turns it into a grayscaled image, and returns it. |
model_squeeze.m | Given an ST, this trims away background pixels at the border until only the foreground image is left. |
model_threshold.m | Another no brainer. Given a grayscale image and a threshold value, this turns it into a binary image (all pixels are ones or zeros). |
model_unsqueeze.m | This takes an ST and adds back one pixel's worth of background border. Written mainly for model_halo, which needs that border to work correctly. |