Project 1: Image Filtering and Hybrid Images


Figure 1: Monroe to Einstein

Background Information:

Hybrid images are images that can be perceived in two different ways depending on viewing distance. The gif to the right appears to be actress Marilyn Monroe, but as the viewer moves closer the image appears to be physicist Albert Einstein. Aude Oliva of the Massachusetts Institute of Technology and Philippe G. Schyns of the University of Glasgow proposed the technique for creating hybrid images in 1994. In 2006, Aude Oliva, Antonio Torralba, and Philippe G. Schyns published the SIGGRAPH Hybrid Images paper, which is summarized in the following section.


SIGGRAPH Paper Summary:

Definition: A hybrid image is a static image that produces different interpretations when viewed from different distances. We can state that the interpretation of the image is a function of viewing distance.

Construction: To create a hybrid image, two images are superimposed at different spatial scales. One image is filtered with a low-pass filter, a blurring or smoothing filter that averages out rapid changes in intensity. The other image is filtered with a high-pass filter, also known as sharpening, which preserves the finer details in the image. Note that a high-pass filter is the complement of a low-pass filter. Every photograph contains some noise: the low-pass filter smooths this noise out, while the high-pass filter amplifies it. Once we have our low-pass-filtered image and high-pass-filtered image, we simply add the two images together.

Mathematical Construction: Let $H$ represent a hybrid image, $I_{1}$ represent image 1, $I_{2}$ represent image 2, $G_{1}$ represent a low-pass filter, and $(1-G_{2})$ represent a high-pass filter. We can then use the following equation to generate our hybrid image:

$H=I_{1}$ $\cdot$ $G_{1}+I_{2}$ $\cdot$ $(1-G_{2})$


Note: There is an additional parameter that adds a gain for each frequency channel.
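The construction above can be sketched in a few lines of code. Below is a minimal NumPy illustration (Python rather than MATLAB; the helper names `gaussian_blur` and `hybrid` and the σ values are assumptions for illustration, not from the paper):

```python
import numpy as np

def gaussian_blur(img, sigma):
    # Illustrative separable Gaussian blur: sample a 1-D Gaussian,
    # normalize it, and filter rows then columns of a reflect-padded image.
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    k /= k.sum()
    pad = np.pad(img, radius, mode='reflect')
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, 'valid'), 1, pad)
    return np.apply_along_axis(lambda c: np.convolve(c, k, 'valid'), 0, rows)

def hybrid(I1, I2, sigma1, sigma2):
    low = gaussian_blur(I1, sigma1)        # I1 . G1  (low frequencies)
    high = I2 - gaussian_blur(I2, sigma2)  # I2 . (1 - G2)  (high frequencies)
    return low + high                      # H = I1.G1 + I2.(1 - G2)
```

Here σ plays the role of the cut-off frequency: a larger σ for the low-pass image removes more detail, while a smaller σ for the high-pass image keeps more of its edges.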

Example: Let us dissect the example that Oliva, Torralba, and Schyns display.



$I_{1}$, Elephant


$G_{1}$, Low-Pass Filter


$I_{1}$ $\cdot$ $G_{1}$, Blurred Elephant



$+$


$I_{2}$, Jaguar


$(1-G_{2})$, High-Pass Filter


$I_{2}$ $\cdot$ $(1-G_{2})$, Sharpened Jaguar


$=$


$H=I_{1}$ $\cdot$ $G_{1}+I_{2}$ $\cdot$ $(1-G_{2})$, Resulting Hybrid Image

Figure 2: Elephant to Jaguar


Perception of Hybrid Images: Within 100 milliseconds, a human can look at an image and immediately understand it. Take a quick glance at the image below:

Figure 3: Apple Macbook Pro

You can easily describe this image as a computer, and you might even go so far as to identify it as an Apple MacBook Pro.

Schyns and Oliva tested the role of spatial frequency bands in human perception. They found that when the observer was given a quick glance (30 milliseconds) at a hybrid image, the observer identified the low spatial frequencies first. When the observer was given a longer glance (150 milliseconds), they identified the high spatial frequencies first. All in all, humans quickly settle on a low or high frequency band when interpreting an image.

Perceptual Grouping of Hybrid Images: There are a few rules for creating hybrid images that are pleasing to the eye:

  1. Arrangement: Choose an arrangement with fewer elements
  2. Symmetry: Prefer a symmetrical composition
  3. Proximity: Group nearby segments of images together, because they tend to be perceived as a group
  4. Similarity: When objects look similar to each other
  5. Continuation: When the eye is compelled to move through one object and continue to another
  6. Closure: Have enough shape so that people can perceive the whole by filling in the missing information
  7. Color: Color can provide a strong grouping cue
  8. Cut-off Frequencies: Choose the cut-off frequencies for the filters so that there is a clean transition between the two images
  9. Edge Correlation: Maximize the correlation between edges across the two scales so that they blend
  10. Residual Edges: Remaining edges that do not correlate with edges across scales can be perceived as noise

Note that rules 3, 4, 5, and 6 (Proximity, Similarity, Continuation, and Closure) are the Gestalt principles.

Following these rules will help ensure that one creates an aesthetically pleasing hybrid image.

Applications of Hybrid Images: Hybrid images have some interesting and useful applications:

  1. Private Fonts: Display text that is not legible to people standing beyond a certain distance.
  2. Hybrid Textures: Create textures that disappear as a function of viewing distance.
  3. Changing Faces: Change facial expressions, identity, or pose as a function of viewing distance.
  4. Time Changes: Show two different states of a scene, such as a house under construction and the finished house.

Image Filtering:

Image filtering is a fundamental image processing tool. Images are filtered to add a soft blur, sharpen details, accentuate edges, or remove noise. Linear filtering, the most common type of neighborhood operator, computes a weighted combination of pixels in a small neighborhood; the central operation here is convolution. To determine an output pixel's value, we take a weighted sum of the input pixel values: we look at a neighborhood of pixels, multiply each pixel by the corresponding weight, and sum to produce an output pixel, as shown in the diagram below, where $f(x,y)$ represents our image, $h(x,y)$ represents our weight kernel or mask, and $g(x,y)$ is our output pixel value:

Figure 4: Linear-Filter

The above image depicts neighborhood filtering, or convolution. We let $f(x,y)$ be our image and $h(x,y)$ be the filter mask. We place the mask on top of a given pixel: in the figure above, 0.2 is the center of the mask and we apply the convolution at the pixel with value 96. The weights in the mask sum to 1, and the neighbors, weighted 0.1 around the central 0.2, contribute less due to their lower weighting. To get our output pixel, we compute:

$65 \cdot 0.1 + 98 \cdot 0.1 + 127 \cdot 0.1 + 65 \cdot 0.1 + 96 \cdot 0.2 + 115 \cdot 0.1 + 63 \cdot 0.1 + 91 \cdot 0.1 + 107 \cdot 0.1 = 92.3 \approx 92$

The value 92 is represented by the green pixel in $g(x,y)$. We repeat this operation for every pixel in the image, and $g(x,y)$ is the resulting filtered image.
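As a quick sanity check, the weighted sum above can be reproduced directly. This small NumPy snippet (an illustration, not part of the project code) uses the pixel values and mask weights from Figure 4:

```python
import numpy as np

# 3x3 neighborhood around the pixel with value 96 (Figure 4)
patch = np.array([[65, 98, 127],
                  [65, 96, 115],
                  [63, 91, 107]], dtype=float)
# Mask: 0.2 at the center, 0.1 for each of the eight neighbors
mask = np.array([[0.1, 0.1, 0.1],
                 [0.1, 0.2, 0.1],
                 [0.1, 0.1, 0.1]])
out = np.sum(patch * mask)  # element-wise multiply, then sum
```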

It is also important to pad our image before we perform convolution. This ensures that we can perform the neighborhood operation on every pixel, including those at the borders, where the convolution kernel extends beyond the original image boundaries. A number of padding options have been developed, such as:

  • Zero: pad with a constant value of 0
  • Clamp (replicate): repeat the edge pixels outward
  • Wrap (circular): loop around to the opposite edge
  • Mirror (reflect or symmetric): reflect pixels across the border
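These padding options can be compared side by side with NumPy's `np.pad`, used here as a stand-in for MATLAB's `padarray`:

```python
import numpy as np

row = np.array([1, 2, 3, 4])
zero   = np.pad(row, 2, mode='constant')   # [0 0 1 2 3 4 0 0]
clamp  = np.pad(row, 2, mode='edge')       # [1 1 1 2 3 4 4 4]
wrap   = np.pad(row, 2, mode='wrap')       # [3 4 1 2 3 4 1 2]
mirror = np.pad(row, 2, mode='reflect')    # [3 2 1 2 3 4 3 2]
symm   = np.pad(row, 2, mode='symmetric')  # [2 1 1 2 3 4 4 3]
```

The `'symmetric'` mode (mirroring that repeats the edge pixel) is the one the filter implementation below relies on.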

There are many possible filters as shown below:

Figure 5: Filters

Above, we see the corresponding mask for each filter. Let us examine the Gaussian filter. The center weight (36) corresponds to the pixel at which we apply the convolution; the center pixel is more important than the neighbors around it. The four corner weights are 1/256, meaning those pixels are the least important. When we take the sum of every integer weight in the mask, we get 256, which is why the mask is divided by 256 to normalize it.

In Figure 5, the box filter (a), bilinear filter (b), and Gaussian filter (c) are examples of low-pass filters. They pass through lower frequencies while attenuating higher frequencies, and thus blur or smooth the image. One advantage of the Gaussian filter is that it has the same shape in the Fourier and spatial domains, because the Gaussian is its own Fourier transform. If we used a different low-pass filter, we might incur a ringing effect in the filtered image in the spatial domain. Gaussian blur is the convolution of the image with a Gaussian function, where the function is of the form:

$G(x)={\frac {1}{\sqrt {2\pi \sigma ^{2}}}}e^{-{\frac {x^{2}}{2\sigma ^{2}}}}$

in one dimension and

$G(x,y)={\frac {1}{2\pi \sigma ^{2}}}e^{-{\frac {x^{2}+y^{2}}{2\sigma ^{2}}}}$

in two dimensions.
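To see how the formula becomes a discrete mask like the one in Figure 5, we can sample $G(x,y)$ on a grid and renormalize. This NumPy sketch (illustrative; the helper name is an assumption) also shows that the 2-D kernel separates into two 1-D Gaussians:

```python
import numpy as np

def gaussian_kernel(size, sigma):
    # Sample G(x, y) on a size x size grid centered at the origin,
    # then renormalize so the sampled weights sum to exactly 1.
    r = size // 2
    x, y = np.meshgrid(np.arange(-r, r + 1), np.arange(-r, r + 1))
    G = np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) / (2 * np.pi * sigma**2)
    return G / G.sum()

k = gaussian_kernel(5, 1.0)
# k is symmetric, and it equals the outer product of the
# corresponding normalized 1-D Gaussians (separability).
```

Separability is what lets a 2-D Gaussian blur be applied as two cheap 1-D passes, one over rows and one over columns.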

So, the linear filter is a neighborhood operator in which the output pixel value is determined as a weighted sum of input pixel values. We can write this as:

$g(i,j)=$ $\sum_{k,l}f(i+k,j+l)h(k,l)$

where $f$ is our image, $h$ is our weight kernel or mask, and $g$ is our resultant image. (Strictly speaking, this sum is a correlation; true convolution flips the kernel, using $f(i-k, j-l)$, but for symmetric kernels the two coincide.) We can write this more compactly as:

$g = f*h$
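One subtlety worth checking: the sum above is, strictly, a correlation, while true convolution flips the kernel. For symmetric masks like the ones used in this project the two operations give identical results, as this small NumPy check (illustrative, not project code) confirms:

```python
import numpy as np

def correlate(f, h):
    # g(i,j) = sum_{k,l} f(i+k, j+l) h(k,l), computed on the "valid"
    # region only (no padding), by sliding h over f.
    fr, fc = f.shape
    hr, hc = h.shape
    g = np.zeros((fr - hr + 1, fc - hc + 1))
    for i in range(g.shape[0]):
        for j in range(g.shape[1]):
            g[i, j] = np.sum(f[i:i+hr, j:j+hc] * h)
    return g

f = np.arange(25, dtype=float).reshape(5, 5)
h = np.array([[0.1, 0.1, 0.1],
              [0.1, 0.2, 0.1],
              [0.1, 0.1, 0.1]])
# Flipping a symmetric kernel leaves it unchanged, so correlation
# and convolution agree here.
assert np.allclose(correlate(f, h), correlate(f, h[::-1, ::-1]))
```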


Matlab my_imfilter Implementation:

With all of the above in mind, let us now construct a linear filter function that acts like the imfilter function in MATLAB.

Parameters: Our function takes a parameter called image, which is a matrix representing an image, and a parameter called filter, which is an $n$ x $m$ matrix where $n$ and $m$ are both odd.

Handling Color: Next, we need to determine whether the image is a color image or a grayscale image. We can check this through the size of the matrix along its third dimension: if the image has 3 channels it is a color image, otherwise it is grayscale.

Padding: Now we need to pad our image. To do this, we look at the row and column size of our filter. We pad the rows (top and bottom) by the floor of the number of filter rows divided by 2, and the columns (left and right) by the floor of the number of filter columns divided by 2. We can then create our padded image with the MATLAB function padarray, passing the 'symmetric' option so that the array is padded with mirror reflections of itself. We use the following to construct our padded array:


padded_image = padarray(image,[lr_pad tb_pad],'symmetric');

Convolution: Next, we perform the convolution. We slide our filter over the padded image matrix and perform element-wise multiplication between the two, then take the sum of the products and place it in an output array the size of our original image. We create a nested for loop where $i$ indexes the rows and $j$ indexes the columns. The convolution step in code is as follows:


        for i = 1:pad_rows-fil_rows+1
            for j = 1:pad_cols-fil_cols+1
                arr(i,j) = sum(sum((padded_image(i:i+fil_rows-1, j:j+fil_cols-1).*filter)));
            end          
        end

Return Array: We then return the convolution array as our output and we have our filtered image which is the same resolution as the input image.

Code: From the above steps, two inner functions are created: one that checks the image's dimensions and dispatches each channel to the channel filter the appropriate number of times, and one that performs the channel filtering itself.


function output = my_imfilter(image, filter)
op = zeros(size(image));
op = dimension2Channel(image,filter,op);
output = op;

    %An inner function to determine whether an image is color or grayscale
    %and send each channel to channel_filter the appropriate number of times
    function op = dimension2Channel(image,filter,op)
        num_channels = size(image,3); %1 for grayscale, 3 for color
        for dim=1:num_channels
            op(:,:,dim) = channel_filter(image(:,:,dim),filter);
        end
    end


    %Inner function that contains the linear filter operation
    function arr = channel_filter(image, filter)
        arr = zeros(size(image));
        [fil_rows, fil_cols] = size(filter);
        lr_pad = floor(fil_rows/2); %pads the rows (top and bottom)
        tb_pad = floor(fil_cols/2); %pads the columns (left and right)
        padded_image = padarray(image,[lr_pad tb_pad],'symmetric');
        [pad_rows, pad_cols] = size(padded_image);
        for i = 1:pad_rows-fil_rows+1
            for j = 1:pad_cols-fil_cols+1
                arr(i,j) = sum(sum(padded_image(i:i+fil_rows-1, j:j+fil_cols-1).*filter));
            end
        end
    end
end

Comparison: Let us now compare our filter function with MATLAB's imfilter function:

Gaussian Filter, [25,1], 10



Original


my_imfilter


imfilter






Original


my_imfilter


imfilter

We see that our MATLAB function my_imfilter produces results very similar to those of the built-in MATLAB function imfilter.

Hybrid Image Construction:

Now that we have our image filter function, we can proceed to construct our hybrid image.

Image Size Check: If we pass two images of different sizes into our filter, we may incur an error stating "Array dimensions must match for binary array op." This means that we cannot sum the two matrices because they are of different sizes. The code below compares the two matrix sizes and resizes the larger matrix to the size of the smaller one, so that we have two images of equal size:


if numel(image1) > numel(image2)
    s = size(image2);
    rows = s(1);
    cols = s(2);
    image1 = imresize(image1, [rows cols]);
else
    s = size(image1);
    rows = s(1);
    cols = s(2);
    image2 = imresize(image2, [rows cols]);
end

Image Alignment: We also have to make sure that our images are aligned so that we can make a nice hybrid image. To do this, we can use an external tool such as Photoshop, or find two images that are already similar in alignment.

Filter Creation: Next, we need to create our filter, the matrix that we will convolve the images with. We construct a Gaussian filter of size [cutoff_frequency*4+1 cutoff_frequency*4+1] and standard deviation $\sigma =$ cutoff_frequency. The cutoff frequency is the standard deviation, in pixels, of the Gaussian blur that will remove the high frequencies from one image and the low frequencies from the other. A lower cutoff frequency makes the image blurrier, since only low frequency components are passed; similarly, a higher cutoff frequency leaves the edges in the image sharper. This parameter changes with the images chosen: sometimes we need a lower value, other times a higher one, and it takes some tuning to find a cutoff that works well with a given pair of images. To construct the filter we do the following:


cutoff_frequency = 6;
filter = fspecial('Gaussian', cutoff_frequency*4+1, cutoff_frequency);

Low-Frequency Image: To create our low-frequencies, we simply call our my_imfilter function on our image1 with our created filter. The code looks as follows:


low_frequencies = my_imfilter(image1,filter);

High-Frequency Image: To create our high-frequencies we blur our image2 and then take the original image2 and subtract off the low-frequencies. The code looks as follows:


blurred = my_imfilter(image2,filter);
high_frequencies = image2 - blurred;

Hybrid Image Creation: We now simply need to sum the low-frequency and high-frequency images together to get our hybrid image. The code looks as follows:


hybrid_image = low_frequencies + high_frequencies;

Examples:

Let's take a look at some examples of some hybrid images we can create with our newly created functions:

Example 1: Dog & Cat

cutoff_frequency=7



image1


image2


low frequencies


high frequencies


Hybrid Image



Hybrid Image Scale


Example 2: Motorcycle & Bike

cutoff_frequency=3



image1


image2


low frequencies


high frequencies


Hybrid Image



Hybrid Image Scale


Example 3: Fish & Submarine

cutoff_frequency=7



image1


image2


low frequencies


high frequencies


Hybrid Image



Hybrid Image Scale


Example 4: Bird & Plane

cutoff_frequency=4



image1


image2


low frequencies


high frequencies


Hybrid Image



Hybrid Image Scale


Example 5: Monroe & Einstein

cutoff_frequency=3



image1


image2


low frequencies


high frequencies


Hybrid Image



Hybrid Image Scale


Example 6: Drake and Bieber

cutoff_frequency=6



image1


image2


low frequencies


high frequencies


Hybrid Image



Hybrid Image Scale


Example 7: Kanye and Jay-Z

cutoff_frequency=8



image1


image2


low frequencies


high frequencies


Hybrid Image



Hybrid Image Scale


Example 8: Zain and Bieber

cutoff_frequency=6



image1


image2


low frequencies


high frequencies


Hybrid Image



Hybrid Image Scale


Example 9: Pharrell and Lil Wayne

cutoff_frequency=6



image1


image2


low frequencies


high frequencies


Hybrid Image



Hybrid Image Scale


Algorithm Results and Extras:

My personal imfilter function was very comparable to MATLAB's own imfilter function. Upon comparison, the results look almost identical; naturally there is some slight variation, but that can be attributed to rounding. The hybrid images also looked great. I used a wide variety of recognizable celebrities and tuned the cutoff frequency so that the images had a smooth transition from one to another. When constructing the hybrid images you need to add the images together, which cannot be done unless the images have the same dimensions. I therefore added a set of conditionals that makes the images the same size by scaling the larger image down to the size of the smaller one. This allows us to pass any two images into proj1.m and compute a hybrid image for them.

There were some complications when running the hybrid image generator. The biggest challenge was finding images that have similar colors and a nice overlap. Initially I had a hard time finding such images, and the hybrid images I computed looked subpar: there was no nice transition between the images. However, after I found celebrity photos and adjusted their alignment with the program GIMP (a free alternative to Photoshop), the transition was much smoother and the hybrid images looked pleasant. With this project you really notice how expensive convolution is: if the image and filter are large, convolution can take a long time to run, as millions of multiply-add operations need to be performed.

I added a few extras to my project. I not only used the stock photos in the data subdirectory, but also searched for my own images, adjusted their appearance with GIMP, and displayed their hybrid images on this HTML page. I also added a JavaScript feature where you can hover over the Hybrid Image Scale in the examples and it enlarges for easier viewing; when you move your mouse off the image, it returns to its original size. This feature is also implemented for the images in the Matlab my_imfilter Implementation section under Comparison. I thought this would make the images easier to see and keep the presentation of the page clean. Additionally, I provided a summary of the SIGGRAPH paper for anyone who views this webpage with no prior knowledge of hybrid images: background information, a high-level overview of their construction, a mathematical version of the construction, an example from the paper, guidelines for constructing nice-looking hybrid images, and some useful applications.

I had a great amount of fun with this project, creating and testing many different hybrid images of my friends, celebrities, famous paintings, and random pictures.

References

  • Images
  • Content