CS 7321 Winter 1998
PS#1 Solutions by Tahia Infantes Morris

Character Recognition Problem


How I solved it

        Ugh. That is what I will say. I solved this problem pretty much manually. First, I found an acceptable value to threshold the image to a binary format. This meant A LOT of trial and error because if the letters were too filled-in then it meant that the vowels probably would fit every character (like the 'o') and visa versa if the threshold value was too low.  Next,  I went pixel by pixel in Paint (Win95) and found the pixel locations of almost all the vowels.  For some reason I chose a 10 x 9 matrix to represents the vowels. I made templates of all the vowels and put them in a file.  I then AND'ed all the samples of the letters together to get an acceptable pattern to match all the vowels. This would seem simple enough, but it was time consuming.

        So, once I got a basic pattern for all the vowels, I found the region of the image where there was 'something' and basically started through that region pixel by pixel testing to see if  it matched. I defined areas to search for by going line by line (top to bottom) and checking if it was an empty line. I made sure that there was at least 'half the size of a vowel in  pixels' length before I marked it as an area I could ignore.  Next, and here is the kicker - I could not only just use the foreground pattern for the vowels, I also needed to match the background pattern as well (by inverting the image).  Once again, If I did not do this then I would have vowels all over the place. So, I basically had two representations for each vowel: the foreground and background pattern. Matching the vowel meant ANDing each 10x9 sample of the image to both patterns for each vowel and comparing foreground and background sums to the patterns. If I found a match, then I wrote the topmost x and rightmost y location to a table along with a number from 1 to 5 (representing which vowel had been found). Then I moved over 9 pixels in the image and started searching again.

        Once the entire 'searchable" area had been exhausted, I called a paint program which took the table listing the x, y, and vowel type and basically painted the vowels on top of the image. I got all the vowels to match 'perfectly' on the first and second image.. The third image was well...... a nightmare.

        As I had mentioned, I made my matrix 10x9 (a mistake). I did not realize how important the matrix size was until I had gotten all the characters recognized in the first two images and decided to try the rotate thingy. The problem was that rotating the matrix (by 6 degrees) created matrix dimension problems. It would have been easier if I had chosen 10x10. However, since I chose this 'manual method', it was much too late to go back and change it all. So, in the end, I wrote a routine that basically looks like the main routine, but takes into account the dimension size and rotates each vowel by 6 degrees (that is 60 times!) if a vowel had not been found previously, of course. I tried this ONCE. Why? Well, it took over 40 minutes to run with the rotate stuff in there. I would rather lose points on not getting that to work than to muddle through it again  (especially since it  is 4 am and I have a midterm tomorrow - er, today).

        So, why did I not try some other method? Well, I did. I pretty much implemented most of the algorithms in Chapter 1 and 2 of the book whenever I had a 'brain-storm' on how to solve the problem. I would get into it and then say: "now what?". I tried the "Connected Components" algorithm and got it to work, but this did not help me any as I still had to compare pixels. I then tried the erode and dilate. I guess I am a perfectionist, because I could not get the letters eroded and dilated enough so that each letter was separate from the other characters and still recognizable. In either case, I still had to go pixel by pixel again.  Then I thought about the moments stuff. Ok, so I found the area and the center of mass, I still needed to calculate the same moments for each sample image matrix. All the different attempts would still have me taking sample matrices of the image and processing them to find whether they matched or not. I decided that it would be "quicker' to only pattern match.

        Okay, so I did not completely think things through. I should have stayed with the moments notion so that I could at least identify the rotated characters by the second moment equation.

        So, there. That is how I solved it. The brute force method which took me a week to implement (staying up till mid-morning, believe it or not).  I just wish that I hadn't lost my time trying so many different ways of solving this just to make things "more efficient". <sigh>

        I also must mention the colormap. What I did was basically take the original gray value image and multiplied it by 0.88 so that each pixel value would be remapped and I would have the upper 230-255 region open to make my own colors. It turned out quite well.

        One last note: I am running this at home using Matlab 5.1 on the win95 <eyuck> platform and so I would expect this to run the same on the sgi's and suns, but who knows?


Assumptions and Weaknesses

I made the following assumptions
 
        I  decided that I would have control of the threshold value (which I chose between .73 and .8). If the threshold is changed, the algorithm starts to lose some letters here and there. I also assumed that this program was only to run on these three images. I doubt seriously that it would run successfully on another image, even if the characters were around the same size. I would expect it to find at least half of the characters, but not all of them.

 the major weakness of my solutions are: 

        Obviously, the major weakness in my program is the fact that it is not general enough to apply to other images. The pattern matching is an average of the first and second image. Next, it is time consuming because it is going pixel by pixel through the image. Well, not the entire image, but enough of it to make this program unbearable if it were an entire 8x11'' size image!  However, the run-time is nothing compared to the time it took to manually extract each letter from the images and average them for a pattern.

        I also had some problems with the colormap. Apparently, if the colormap already existed, it would spit some error back about it being a script. My workaround to this was to clear the colormap (my_map) and then run it and change the colormap. If you run the paint_vowels function and get the error, no biggie. Just run the paint_vowels again. It does not take more then 5-10 seconds.

      Oh yeah, another weakness. I never had time to clean up my code and make it more modular. I basically had a long file which did everything and so there were a lot of cut and pasted sections that should have been made into functions. This is not the way I usually program. I just did not have time to clean it up. There are also problems with the patterns as I had to keep tweaking them by adding 0 or 1, so the way I created each pattern has redundant stuff like placing a 1 in a certain location then later filling it in with a 0 and so on.

Improvements and Possible Future Work

I think that this can be improved by doing the following


Results

I know we were to find all the e's then find all the other vowels, but I decided to do it all at once in one program. Therefore, I am only going to show each of the images after my program ran through them.


Source Code