Computer Vision Project

Here we showcase some examples of linear filtering used to construct hybrid images. The idea behind this filtering process is to create a low-pass version of one image (keeping its low frequencies) and a high-pass version of another image (keeping its high frequencies). We can combine these two images to form a single image that resembles both of the original images, depending on the scale at which it is viewed. We create low frequency images by blurring with a Gaussian filter and high frequency images by subtracting a Gaussian-blurred version of the image from the original. From there, it is simple addition to create our hybrid image. We will explore multiple image pairs and try to understand what exactly makes a good hybrid image. The general goal is to come up with a hybrid image that appears to "morph" into the other image by the 2nd or 3rd smallest scaled image (we will be displaying the hybrid images as a series of scaled copies).
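The construction described above can be sketched in a few lines of MATLAB. This is only a minimal illustration, not the project's actual code: the variable names, the synthetic stand-in images, and the choice to truncate the Gaussian at 3 standard deviations are all assumptions. It uses only base functions (conv2), so the borders are zero padded here.

```matlab
% Sketch only: names, test images and the 3*c kernel radius are assumptions.
c = 7;                                   % Gaussian standard deviation in pixels
radius = ceil(3 * c);                    % assumed truncation of the kernel
n = -radius:radius;
g = exp(-n.^2 / (2 * c^2));
g = g / sum(g);                          % normalize so a flat image stays flat

% Synthetic stand-ins for two aligned grayscale images with values in [0, 1]
[u, v] = meshgrid(1:128, 1:128);
imageLow  = u / 128;                             % smooth horizontal ramp
imageHigh = mod(floor(u / 4) + floor(v / 4), 2); % fine checkerboard

% Separable Gaussian blur; 'same' keeps the size and zero pads the borders
blur = @(im) conv2(g, g, im, 'same');

lowFrequencies  = blur(imageLow);                % low-pass image
highFrequencies = imageHigh - blur(imageHigh);   % high-pass image
hybrid = lowFrequencies + highFrequencies;       % simple addition

% The sum can leave [0, 1], so clip before displaying
hybridDisplay = min(max(hybrid, 0), 1);
```

Swapping which image is blurred and which is high-passed only requires exchanging imageLow and imageHigh.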

One implementation decision for the filtering is the type of padding performed along the edges of the image. This padding is necessary so that we have a neighborhood to filter when working along the edges of the actual image. Below, I show two padding approaches: padding the image with 0s and padding the image with reflections of the image.


% Excerpt from code: padding options (alternatives; use one or the other)
image = padarray(image, [filterRowCenter-1 filterColCenter-1]); % zero padding (padarray's default pad value is 0)
image = padarray(image, [filterRowCenter-1 filterColCenter-1], 'symmetric'); % reflection padding

From left to right, we have the original image, a blurred version of the image after zero padding, and a blurred version of the image after padding with reflected pixels. The simple blur filters above clearly show the different behaviors of the two padding approaches: zero padding mixes black pixels into the neighborhoods along the border, which produces dark bands at the edges, while reflection padding does not. When constructing hybrid images, these differences are rather difficult to make out (although it certainly depends on the images being used). The banding is generally undesirable, so we will use the reflection technique for the rest of the report.
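To make the border behavior concrete, here is a small sketch that blurs a constant white image with both padding styles. It is an illustration under assumptions: the 3x3 box filter and the manual index-based padding are stand-ins chosen so the example runs without padarray, and the index vectors are meant to reproduce what 'symmetric' does (mirroring that repeats the edge pixel).

```matlab
% Sketch only: a 3x3 box blur on a constant image, padded two different ways.
img = ones(5, 5);            % constant white image
k = ones(3, 3) / 9;          % 3x3 box blur kernel
pad = 1;                     % pad by floor(kernelSize / 2)

% Zero padding: surround the image with a border of zeros
zeroPadded = zeros(size(img) + 2 * pad);
zeroPadded(pad+1:end-pad, pad+1:end-pad) = img;

% Reflection padding: mirror the image across its edges (edge pixel repeated)
rows = [pad:-1:1, 1:size(img, 1), size(img, 1):-1:size(img, 1)-pad+1];
cols = [pad:-1:1, 1:size(img, 2), size(img, 2):-1:size(img, 2)-pad+1];
reflPadded = img(rows, cols);

% 'valid' keeps only positions where the kernel fits inside the padded image
blurZero = conv2(zeroPadded, k, 'valid');
blurRefl = conv2(reflPadded, k, 'valid');

cornerZero = blurZero(1, 1);   % 4/9: five of the nine neighbors are padded zeros
cornerRefl = blurRefl(1, 1);   % stays at 1 (up to rounding): neighbors are mirrored white
```

With zero padding the corner and edge pixels are darkened even though the image is perfectly flat, which is the banding effect. With reflection padding the flat image stays flat.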

CatDog

These first two images above are our original images: a cat and a dog.

These four images include two high and low pass image pairs, each pair having our high frequency cat and our low frequency dog. We used a cutoff frequency of 14 for the first pair and 7 for the next pair to show how it alters the Gaussian blurs. This parameter is the standard deviation of the Gaussian filter, and we will refer to it as c from now on (strictly speaking, a larger standard deviation means a stronger blur and therefore a lower cutoff in the frequency domain). We will be tweaking c throughout the report to try to get an intuition for what the value alters in terms of the images. As shown above, a lower c lessens the blur of the low frequency image and increases the "fade" of the high frequency image. It makes sense that this relationship is opposite within the image pairs since we are subtracting the Gaussian blur to produce high frequency images.
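The opposite behavior of the two filtered images can be checked numerically. The sketch below filters a synthetic sinusoidal pattern with c = 7 and c = 14 and measures how much of the pattern ends up in the low-pass and high-pass images. The test pattern, its 32-pixel period, the helper names, and the 3*c kernel radius are assumptions made for illustration only.

```matlab
% Sketch only: how c splits one spatial frequency between low and high pass.
period = 32;                                   % assumed pattern period in pixels
u = repmat(1:256, 256, 1);                     % column index at every pixel
img = 0.5 + 0.5 * sin(2 * pi * u / period);    % sinusoid with amplitude 0.5

% Hypothetical helpers: normalized 1-D Gaussian and a separable 2-D blur
normalize = @(w) w / sum(w);
gauss1d = @(c) normalize(exp(-(-ceil(3*c):ceil(3*c)).^2 / (2 * c^2)));
blur = @(im, c) conv2(gauss1d(c), gauss1d(c), im, 'same');

cs = [7 14];
crop = 65:192;                                 % stay away from border effects
ampLow = zeros(size(cs));
ampHigh = zeros(size(cs));
for i = 1:numel(cs)
    low = blur(img, cs(i));
    high = img - low;
    lowRow = low(128, crop);
    highRow = high(128, crop);
    ampLow(i) = (max(lowRow) - min(lowRow)) / 2;     % pattern left in low pass
    ampHigh(i) = (max(highRow) - min(highRow)) / 2;  % pattern left in high pass
    fprintf('c = %2d: low-pass amplitude %.3f, high-pass amplitude %.3f\n', ...
            cs(i), ampLow(i), ampHigh(i));
end
```

The smaller c leaves more of the pattern in the low-pass image and less in the high-pass image, and the larger c does the reverse. The two amplitudes always add up to the original, because the high-pass image is simply the original minus the blur.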

Here, we have created hybrid images with c = 7 and c = 5, respectively. Our cat and dog image pair works extremely well! We can clearly see the transition from cat to dog at scaled image 3 for c = 7 and at scaled image 2 for c = 5. It appears that the lower the c, the quicker the scaled image morphs into the other layered image.
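The series of scaled copies used to judge these transitions can be produced roughly as follows. The report does not show its visualization code, so this is a guess at the idea rather than the actual routine: the number of scales, the factor of 2 between scales, the gap width, and the crude subsampling (taking every second pixel without any smoothing first) are all assumptions.

```matlab
% Sketch only: place progressively halved copies of an image side by side.
hybrid = repmat(linspace(0, 1, 48), 64, 1);  % stand-in for a 64x48 hybrid image
numScales = 4;                               % hypothetical number of copies
gap = 2;                                     % hypothetical spacing in pixels

canvas = hybrid;
cur = hybrid;
for s = 2:numScales
    cur = cur(1:2:end, 1:2:end);             % crude 2x downsampling
    spacer = ones(size(canvas, 1), gap);     % white gap between copies
    tile = ones(size(canvas, 1), size(cur, 2));
    tile(end-size(cur, 1)+1:end, :) = cur;   % bottom-align the smaller copy
    canvas = [canvas, spacer, tile];         % append to the right
end
```

Viewing the smaller copies stands in for viewing the full image from farther away, which is why the low frequency image takes over toward the right of the series.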

So what happens if we instead use a high frequency dog and a low frequency cat? It turns out our hybrid image is harder to make out. But why? There are a lot of factors that go into making an effective hybrid image, with one primary factor being image alignment, both in terms of the object and the color. For example, in the large scaled hybrid image above, we can make out the dog's facial features, but there is a distinct lack of coloring on the nose to indicate its existence. In the previous hybrid image, the dog's nose coloring simply appeared as a normal looking spot on the cat, giving us the perception that it could still indeed be a cat. Here we have lost the dog's coloring, which gave away its facial features, and are instead left with just the cat's colors (which does end up giving us an interesting view of this dog in the cat's colors). We have to be a bit clever (or lucky) when it comes to creating a successful hybrid image.

SubmarineFish

Our next example includes a submarine and a fish.

Our hybrid images above use c = 2 and c = 4, respectively. These lower c values cause the image to morph pretty well into the submarine (with lower c values morphing more quickly), but it is quite clear that the large hybrid image is relatively poor at depicting the fish, mostly due to the submarine's top extension. This is a good example of how important image alignment is.

Here are our high (fish) and low (submarine) frequency images from c = 2. From our previous cat and dog example, we can see that the high frequency cat retained a little bit of its coloring even for c = 7. Our low c value for the fish and submarine pair makes us lose most of the fish's coloring, causing our hybrid image to be quite dull in color (as dull as the submarine image).
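One way to see why the high frequency image loses its coloring: because the Gaussian kernel sums to 1, the blur keeps the average color of each region, so subtracting it leaves an image whose channels average out to roughly zero. The sketch below checks this on a synthetic color patch. The orange stand-in image, the stripe layout, and the 3*c kernel radius are assumptions for illustration only.

```matlab
% Sketch only: the high-pass image of a colored patch has no average color.
c = 2;                                   % small c, as in the fish example
n = -ceil(3 * c):ceil(3 * c);            % assumed kernel radius of 3*c
g = exp(-n.^2 / (2 * c^2));
g = g / sum(g);

% Stand-in image: flat orange patch with thin black vertical stripes
img = zeros(64, 64, 3);
img(:, :, 1) = 1.0;
img(:, :, 2) = 0.5;
img(:, :, 3) = 0.1;
img(:, 16:16:end, :) = 0;

% High-pass each color channel separately
high = zeros(size(img));
for ch = 1:3
    high(:, :, ch) = img(:, :, ch) - conv2(g, g, img(:, :, ch), 'same');
end

% Compare average color away from the borders
inner = 10:55;
meanOrig = squeeze(mean(mean(img(inner, inner, :), 1), 2));   % clearly orange
meanHigh = squeeze(mean(mean(high(inner, inner, :), 1), 2));  % about zero
```

Only the stripe edges survive in the high-pass image, so in a hybrid the overall coloring comes almost entirely from the low frequency image. A larger c pushes broader color variations into the high-pass image, which fits the cat retaining some coloring at c = 7.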

Swapping the images (and using c = 7) preserves the color of the original fish image. This actually causes an interesting effect; it almost seems that the fish (or something) is encased in the submarine. This works surprisingly well; the fish's coloring seems like a design on the submarine. Interestingly, we need a higher c value here because the submarine is so dull (its color already blends into the color of the water). Lower c values cause the submarine to be too transparent to see over the fish.

MotorBike

In this example, we attempt to combine a bicycle with a motorcycle.

Here we have c = 4 and c = 10, respectively. We seem to suffer a lot from the misalignment of the images; the large scaled image is pretty messy. Making out the motorcycle is difficult over the intense blur of the bicycle, which further emphasizes the misalignment of the object shapes. As expected, the lower c value causes the image morph to occur more quickly.

Swapping the images gives us a much more interesting result (c = 7). We get to see much more color due to the preservation of the motorcycle's colors, including a mysterious blue blur along the bicycle. However, we are still ultimately left with a weak result due to the difference in shapes between the bicycle and motorcycle. It is certainly better than our previous example, though.

Above, we have our low (motorcycle) and high (bicycle) frequency images for c = 7. We can see that the unique blue coloring comes from the high frequency filtering of the bicycle, and the orange color is retained from the motorcycle. The blue banding is the first time we have seen a new color emerge from the frequency pass.

MarilynEinstein

Just like the cat and dog example, our Marilyn and Einstein pair is very promising. The similarity in color and image alignment gives us great potential for hybrid imaging.

Above, we have c = 4 and c = 7. Our c = 4 example suffers mostly from premature Einstein morphing (especially note the tie forming over Marilyn's throat). Our c = 7 example blurs Einstein more, giving us less interference over Marilyn's image. Both examples clearly morph to Einstein by the 4th image at the latest. Since Marilyn's image is so white and Einstein's has a lot of black, there is some contamination, especially around the lower half of the image.

Here, we swap the images and use c = 5. This swap takes care of Einstein's black contamination and ends up producing a very clean result. This is probably the 2nd best hybrid image in this report, behind the cat and dog example. The third scaled image is a little disturbing as it appears as almost a 50/50 combo of Marilyn and Einstein. It appears as though either Marilyn has grown a mustache or Einstein is rocking some curls. One reason for this example's success seems to be the lack of color. Color seems to be perceptually distracting, often taking away from perceiving the two images within the hybrid image.

BirdPlane

Our next example consists of a bird and a plane. We can clearly see that the bird's head does not line up with the jet's nose. This misalignment makes it a bit more difficult to get a convincing hybrid image.

These images follow c = 5 and c = 9, respectively. With c = 5, we seem to effectively morph into a bird by the 4th image, but suffer from a slightly transparent jet in the large scaled hybrid image. Filling out the jet with c = 9 unfortunately makes it much harder to discern the bird. This is definitely one of the weaker examples we have encountered so far.

This is a c = 4 image swap. As is expected, this produces very poor results since the bird's head is shorter than the jet's nose. This makes it impossible to hide the jet's nose with the bird's head, causing a bit of a mess in the hybrid image.

LegendOfZelda

In trying to think of what custom images I could try, I realized that video game box arts could be something interesting to explore due to their identical shapes and shared icons. Here, I've taken box arts from my favorite video game series, The Legend of Zelda. Fortunately, their titles are similar, so hopefully we can get some good hybrid images.

Our high and low pass images at c = 14 appear to preserve a lot of the color. This is great, as we should see some interesting colorful displays.

Combining our c = 14 low and high pass images produces a very interesting blend of the two original images. We have effectively blended the red bar and gold background of the Ocarina of Time box with the Majora's Mask box. With such a high c value, our hybrid image does not necessarily morph images, but rather acts as a very interesting static result. Now we know that this hybrid filtering not only produces images with multiple interpretations; it also can produce some interesting art.

The image pair above had some variation in the box art layout; let's try to get a morphing hybrid image with the same red bar layout.

This result comes from c = 6. Our lower c value doesn't give as clean a result as the example above, but it is more effective at morphing between the original images. The last two scaled images clearly resemble the Ocarina of Time box art. The misalignment of the sword is the main inhibitor of the hybrid image; the blur is quite jarring.

Here we swap the images with c = 13. The vivid colors of the Majora's Mask box are evident throughout this hybrid image, giving an aesthetically pleasing result. Surprisingly, we can actually see the mask start to appear in the two smallest scaled hybrid images, even with such a high c value. I could see this hybrid image process as an artistic tool in exploring more combinations of artistic pieces.

It is clear from experimentation that some pairs of images simply form "better" hybrid images. Important parameters include image colors, image alignment, which image is the high/low pass image, and the cutoff frequency. In some situations, the padding method could also influence the "goodness" of the hybrid image. It is clear that this simple approach to hybrid image creation even has potential to create new works of art if enough exploration and tweaking takes place. Although a lot of things need to be "just right" to produce an effective hybrid image from a pair of images, a good result is highly aesthetic and satisfying.