Problem Set 2 
Stereo Matching Cost Computation

Track 2: CS 4495

CS 4495/7495
Computer Vision

Due by midnight Sept. 19, 2003

In this assignment you will implement a function to compute the costs involved in stereo matching. This is the first part of a three-part stereo assignment. Code produced during this assignment will be used in future ones (solutions will also be provided). You may find the supplemental readings on stereo listed on the course web site to be helpful, but they are not required. In this problem set it may be helpful to use some of the image processing functions in the OpenCV library.

Grading: This is the undergraduate level version. The answers from the students in 4495 will determine the mean for this assignment track. Any students from 7495 who do this version of the assignment will be graded on a much tougher curve (i.e. we will assume that you are going to get everything right).  The majority of points will go towards implementing and testing the function that computes the SSD.

1. A fundamental issue in stereo matching is selecting the matching cost function which measures the similarity between two N by N windows of pixels (where for convenience, N is an odd number). A popular choice is the normalized correlation cost function NC(IL,IR), which can be viewed as an inner product between two vectors of pixels, IL and IR. Prove that if IL and IR are two image patches related by IR = a * IL + b for any two scalar constants a and b, then NC(IL,IR) = 1.

2. An alternative to NC is a raw (un-normalized) SSD cost function, which computes the sum of squared intensity differences between two windows using the original image pixels. One potential advantage of this cost function is that it can be computed very efficiently (see the Track 1 problem set for the details if you are interested). Let the function SSD(x, y, d) denote the SSD value at pixel (x, y) for disparity d (and a window size of N).

Consider the following stereo pair:

Left:

187

200

220

180

190

200

220

180

190

206

220

180

190

206

194

180

190

206

194

158

190

206

194

158

180

 

 

 

Right:

198

217

176

185

200

217

176

185

200

187

176

185

200

187

150

185

200

187

150

171

200

187

150

171

201

 

 

.

(a) With a 3 x 3 matching window and a disparity of one pixel, compute the corresponding SSD image for the image pair above. Complete the table below with your answer (some of the values have already been provided)

B

B

B

B

 

B

B

27

36

 

B

B

36

   
         
         

 

 

,

where B denotes unknown border pixels whose values cannot be computed. 

(b) In general, how will the number of B values at the borders of the SSD image depend upon the window size N and the disparity d?

3. Write a function compute_ssd that takes as its input a stereo pair of images and a disparity value and outputs an image corresponding to SSD(x, y, d). It's easiest if this is a floating point image so there are no issues in representing the full range of possible SSD values. The OpenCV image class supports the creation of these types of multi-band, floating point images and provides some useful image processing functions. You can use any correct method for calculating the SSD values. The straight-forward approach of looping over all windows of pixels in all positions can be expensive for large images, so you should test your code on small ones during debugging. If you are interested in exploring more efficient algorithms, you might try the box filter technique from the Track 1 version of the problem set.

You will have to address the issue of boundary conditions (i.e. near the image boundary the window would extend outside the image). This can be a problem particularly when searching over large values of disparity. The easiest solution is to limit the range of x, y values you process, effectively truncating the output image as in question 2(a). The output image will then contain a border of pixels for which no answer was computed. Set these to some default value (e.g. 0).

4. Write a routine to display an image in a window by first mapping it into a range of grey scale values. The routine should work for pixel values that are floating point, negative, etc. You will need this to display SSD images and disparity maps so that you can visualize them. Test your display code on a known image and then use it to visualize SSD images and test your code from part 4. You may want to use the test pair: [test-left.bmp, test-right.bmp]. Correct disparity images are provided for d=5 and d=15 for testing purposes.

5. You will run your SSD code on the following stereo pair: [sawtooth-left-small.bmp, sawtooth-right-small.bmp]. The sawtooth image is a "synthetic" image which was taken from the Middlebury Stereo Vision Page and is described in the paper by Scharstein and Szeliski. You may also want to examine the pair [tsukuba-left.bmp, tsukuba-right.bmp], which comes from Prof. Ohta's lab at U. Tsukuba in Japan. (Be warned that this image pair is much larger and should probably be downsampled).

(a) Identify the approximate range of disparity values that is present in the sawtooth stereo pair. (The easiest way to do this is to open them in an image editing program that displays the position of the mouse in image coordinates, and examine corresponding features manually.)

(b) Using a window size of 7 (N = 7) and a disparity of 14, compute an SSD image for the sawtooth pair. Save your SSD image as a  gray-scale .bmp file (you can use your display code to do this). Submit this file along with executable for your program (The executable should compute and display the sawtooth SSD output.)

Instructions for submitting part 1: Make a zip file with the suffix '-track2' containing your source code, executable, and the SSD image from question 5(b), encoded as gray-scale .bmp image file. You will also need to answer the questions above. This part does not need to be submitted electronically. Hand-written solutions are fine as long as they are legible. Follow the instructions on the course web page for submitting the zip file.