Computer Vision Project

Project 1: Image Filtering and Hybrid Images

Algorithm Implementation

The algorithm was implemented in these main steps:

Pad the image
Loop through each pixel of the image
- For each pixel, apply the box filter
Remove the padding

1) Padding the Image

Box filtering gets awkward at the corners and edges of an image because a pixel's new value depends on the pixels around it, and corner and edge pixels are not completely surrounded by neighbors. We add padding (currently zeros) so that every pixel in the original image is completely surrounded, which gives a reasonable approximation at the borders. The complication when coding this was that the filter size is not fixed, so the width and height of the padding vary with the filter.

To figure out the padding, we place the center pixel of the filter on a corner of the original image and work out how far the filter extends beyond the image. This is the filter's height and width, each minus 1 and divided by 2 (which assumes the filter dimensions are odd, so that the filter has a center pixel). We pad the image by this height at the top and bottom and by this width on the left and right.
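
The padding computation can be sketched as follows. This is a minimal illustration rather than the project's actual code: the variable names and the sample image and filter are assumptions, and it relies on the filter having odd dimensions. For a 3x5 filter it gives a pad of 1 row and 2 columns on each side.

```matlab
% Hypothetical inputs for illustration: a small RGB image and a 3x5 filter.
img    = rand(4, 6, 3);
filter = ones(3, 5) / 15;

% Half-extent of the filter on each side (assumes odd filter dimensions).
[filterRows, filterCols] = size(filter);
padRows = (filterRows - 1) / 2;   % 1 for a 3-row filter
padCols = (filterCols - 1) / 2;   % 2 for a 5-column filter

% Allocate a zero canvas and copy the image into its center.
[rows, cols, colors] = size(img);
padded = zeros(rows + 2 * padRows, cols + 2 * padCols, colors);
padded(padRows + 1 : padRows + rows, padCols + 1 : padCols + cols, :) = img;
```

Building the canvas with zeros and plain indexing keeps the sketch in base MATLAB, with no toolbox function needed.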

2) Process Filter on Pixels

We then loop through the image portion of the padded image. For each pixel we apply our custom boxfilter function, which takes three parameters: the image, the pixel coordinate, and the filter. The function lays the center of the filter over the pixel, multiplies the filter element-wise with the pixels underneath it, and returns the sum of the products. Because the image has a third dimension for RGB values, the function performs this operation on all three color channels and returns a 1x3 matrix of the new RGB values.

While implementing this boxfilter function, I came up with two different approaches. The first approach runs a nested for-loop over the submatrix and multiplies each image value by the matching filter value one coordinate at a time. The second approach extracts a submatrix of the padded image and applies element-wise multiplication between the submatrix and the filter in one step. The second approach worked, but it took longer than computing each product one by one. This may be due to the cost of either creating the submatrix or using the .* operator.


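        function val = boxfilter(image, coord, filter)      % Second Approach
            % image:  padded image, assumed to be of class double
            % coord:  [row, col] of the target pixel in the padded image
            % filter: 2-D filter with odd dimensions
            colors = size(image, 3);
            val = zeros(1, colors);

            % Top-left corner of the window centered on coord
            c = coord - (size(filter) - 1) / 2;
            c = num2cell(c);
            [rowCorner, colCorner] = c{:};
            [filterRows, filterCols] = size(filter);

            % Submatrix of the image covered by the filter
            box = image(rowCorner : rowCorner + filterRows - 1, ...
                colCorner : colCorner + filterCols - 1, ...
                :);

            % Weighted sum per color channel
            for d = 1 : colors
                ap = box(:,:,d) .* filter;
                val(d) = sum(ap(:));
            end
        end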
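
The first approach is described only in prose, so the following is a guess at how it might look rather than the code actually used. The name boxfilterLoop is hypothetical; the arguments are assumed to match those of boxfilter, with coord given as [row, col] in the padded image and a filter of odd dimensions.

```matlab
function val = boxfilterLoop(image, coord, filter)   % hypothetical name
    % Sketch of the nested-loop approach: accumulate the weighted sum one
    % filter element at a time instead of extracting a submatrix first.
    colors = size(image, 3);
    val = zeros(1, colors);

    % Top-left corner of the window centered on coord.
    [filterRows, filterCols] = size(filter);
    rowCorner = coord(1) - (filterRows - 1) / 2;
    colCorner = coord(2) - (filterCols - 1) / 2;

    for i = 1 : filterRows
        for j = 1 : filterCols
            for d = 1 : colors
                pixel = double(image(rowCorner + i - 1, colCorner + j - 1, d));
                val(d) = val(d) + pixel * filter(i, j);
            end
        end
    end
end
```

Timing both versions on the same image with tic and toc is one way to reproduce the speed comparison described above.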

3) Remove Padding

After applying the filter to the relevant pixels, we remove the padding by cropping the padded image using the precomputed pad width and height. We can then return this new image.
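
The crop can be sketched with plain indexing. The variable names and sizes below are assumptions chosen for illustration: a padded 6x10 RGB result from a 3x5 filter, so the pad is 1 row and 2 columns on each side.

```matlab
% Hypothetical values for illustration: a padded 6x10 RGB result produced
% with a 3x5 filter, so padRows = 1 and padCols = 2.
padded  = rand(6, 10, 3);
padRows = 1;
padCols = 2;

% Keep only the interior region that corresponds to the original image.
output = padded(padRows + 1 : end - padRows, padCols + 1 : end - padCols, :);
```

Here output is 4x6x3, the size of the image before padding.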

Results of the Algorithm

The results of the algorithm match those of MATLAB's imfilter. The hybrid images look fine at low frequencies, when viewing the small image or viewing from far away, but only a couple of them look normal at high frequencies. Images that are completely aligned and take up equal space seem to do well, such as the cat/dog and fish/submarine pairs. The large bird/plane image looks odd because we can see the plane's outline, but the colors of the bird's low-frequency image protrude beyond that outline, leaving a strange shape in the background. The same applies to the Obama image, since the high-frequency image doesn't align perfectly with the low-frequency image. The bike/motorbike close-up also looks strange because the motorbike has a lot of detail that doesn't line up with the colors of the low-frequency bike.
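
One way to check the match against imfilter is sketched below. It is not the project's test code: the data and variable names are made up, the filter is assumed to have odd dimensions, and for brevity the loop writes directly into an unpadded output instead of cropping afterwards. It needs imfilter from the Image Processing Toolbox.

```matlab
% Hypothetical test data: a random RGB image and a 3x5 averaging filter.
img    = rand(8, 9, 3);
filter = ones(3, 5) / 15;

% Pad with zeros (assumes odd filter dimensions).
[fr, fc] = size(filter);
pr = (fr - 1) / 2;
pc = (fc - 1) / 2;
[rows, cols, colors] = size(img);
padded = zeros(rows + 2 * pr, cols + 2 * pc, colors);
padded(pr + 1 : pr + rows, pc + 1 : pc + cols, :) = img;

% Filter every original pixel. Pixel (r, c) of the image sits at
% (r + pr, c + pc) in the padded image, so its window starts at (r, c).
out = zeros(rows, cols, colors);
for r = 1 : rows
    for c = 1 : cols
        box = padded(r : r + fr - 1, c : c + fc - 1, :);
        for d = 1 : colors
            ap = box(:, :, d) .* filter;
            out(r, c, d) = sum(ap(:));
        end
    end
end

% imfilter defaults to correlation with zero padding and a same-size
% output, which matches the scheme described in this report.
ref = imfilter(img, filter);
maxDiff = max(abs(out(:) - ref(:)));
fprintf('Largest absolute difference: %g\n', maxDiff);
```

A difference near floating-point precision would indicate that the two results agree.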

Based on some observations, I believe the images should ideally:

Be perfectly aligned on top of each other
Fill approximately the same space and have similar dimensions
Not contain too much detail in either the high-frequency or the low-frequency image

I also suspect that the image with more or fuller color should be the low-frequency image, because its color fills in the outline of the high-frequency image.
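
The roles of the two images can be made concrete with a short sketch of how a hybrid image is commonly assembled: the low-pass of one image plus the high-pass residual of the other. This report does not show its own construction, so everything here is an assumption: the random stand-in data, the hand-built Gaussian, the sigma value, and the use of conv2 in place of the custom filter.

```matlab
% Stand-in data; real use would load two aligned, same-size images.
lowSource  = rand(32, 32, 3);   % image that supplies the low frequencies
highSource = rand(32, 32, 3);   % image that supplies the high frequencies

% Build a normalized Gaussian kernel by hand (sigma is an assumed cutoff).
sigma = 3;
half  = 3 * sigma;
[x, y] = meshgrid(-half : half, -half : half);
gauss = exp(-(x .^ 2 + y .^ 2) / (2 * sigma ^ 2));
gauss = gauss / sum(gauss(:));

% Low-pass both images channel by channel (conv2 zero-pads the borders).
lowPart  = zeros(size(lowSource));
blurHigh = zeros(size(highSource));
for d = 1 : size(lowSource, 3)
    lowPart(:, :, d)  = conv2(lowSource(:, :, d),  gauss, 'same');
    blurHigh(:, :, d) = conv2(highSource(:, :, d), gauss, 'same');
end

% High frequencies are what remains after removing the blurred version.
highPart = highSource - blurHigh;
hybrid   = lowPart + highPart;
```

In this arrangement the blurred colors of lowSource sit behind the edges of highSource, which is why swapping the two inputs changes the result so noticeably.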

As possible extra credit, I also ran the algorithm on three more image pairs: a tree tunnel and an eye, Mario and Luigi, and a smiling Obama and a sad Obama.