Problem Set 4 
Stereo Reconstruction

CS 4495/7495
Computer Vision

Due by midnight Oct. 16, 2003

In this assignment you will complete the implementation of your stereo system and test its performance. You will exploit the SSD computation and DSI algorithm from the previous two assignments. This is the final part of a three-part stereo assignment. 

Be sure to do the following when you turn in the assignment:

Grading: Note that there are not two tracks for this assignment, since everyone needs a chance to try the full algorithm on real data. However, students in 4495 are only required to submit results for the sawtooth stereo pair in part 4 below. This PS will be graded on two separate curves.

1. [3%] Write a function DSI_forward which takes as inputs the SSD images from PS 2, a y coordinate, and an occlusion_cost, and returns DSI_cost and DSI_ptr for scanline y. As you will recall from PS 3, the DSI_cost is a 3-band D by W image, where D is the number of discrete disparities and W is the width of the scanline. Cell (d, x, s) in DSI_cost contains the cost of being in state s at location (d, x) in the DSI image. There is one band (a complete D by W image) for each of the three states M (s = 0), L (s = 1), and R (s = 2). DSI_ptr is a second 3-band D by W image containing the backpointers. Cell (d, x, s) in DSI_ptr holds an index value between 0 and 6, corresponding to the seven valid DSI transitions listed in figure (g) of PS 3. Due to boundary conditions, not all pixels in the input image will have SSD values. Your function should start at the first pixel with an SSD value and end with the last. Thus W above will equal the number of pixels in the left scanline for which SSD scores are available. 

2. [2%] Write a function DSI_backward which has inputs DSI_cost, DSI_ptr, y coordinate, and a global disparity_map image. The function should chain backwards from the lowest cost terminal state to find the optimal path. As it computes the optimal path, it should substitute the optimal disparity values into scanline y of disparity_map. When the optimal state for a given cell is L, there is no direct measurement of disparity available. A reasonable procedure would be to linearly interpolate the disparities from adjacent match states along the scanline. An approximation to this would be to report the disparity associated with L. You may do it either way. Use the match costs and occlusion_cost you designed in part (f) of PS 3 as the input to your program and compare your output to the result given in figure (f) of PS 3 to verify the correctness of your implementation.

3. [1%] Write a function compute_disparities that takes the SSD images and the occlusion_cost, calls DSI_forward and DSI_backward for each scanline, and outputs a complete disparity_map. The disparity_map is an image, defined with respect to the left image in the stereo pair, that contains a disparity estimate for each pixel. Due to boundary conditions, not all pixels will have disparity estimates. You should use a distinguished value for the disparity in these cases so they can be easily recognized. You may want to debug your code on reduced size images so the runtime is faster: [sawtooth-left-small.bmp, sawtooth-right-small.bmp].

4. [4%] Run compute_disparities on the two stereo pairs [sawtooth-left.bmp, sawtooth-right.bmp] and [tsukuba-left.bmp, tsukuba-right.bmp] which you examined in PS 2. [Note: Students in 4495 are only required to submit results for the sawtooth pair]. Experiment with different disparity ranges for these images. The visualization function you used to visualize the SSD values can also be used to visualize the disparity_map. Ground truth disparity images are available for these two cases: sawtooth-truedisp.bmp and tsukuba-truedisp.bmp. Arrange to view your output and the ground-truth side by side. Comment on your output. Where are you making mistakes? What types of mistakes are they? Where are you successful? What disparity ranges gave the best results? What choice of occlusion_cost gave the best results?

5. Extra Credit (If you do any of these, please provide instructions on how to run them as part of your executable)

(a) [+2%] Modify your program to compute the naive disparity estimate described in problem 2 of PS 2. This can be easily obtained from the DSI image by picking the disparity with the lowest match score for each pixel. Modify your program to display this new disparity estimate. Compare this result to the result with DP. What different properties do these two solutions have?

(b) [up to +4%] Implement one of the evaluation metrics from Section 5.1 of Scharstein and Szeliski (available from class web site). The full ground truth data is available here. Describe your implementation and report the results.

Instructions for submitting: Make a zip file containing your source code, your executable, and two disparity images in .bmp format named sawtooth-disp.bmp and tsukuba-disp.bmp, which correspond to your program output for the two stereo pairs. Your executable should compute and display the disparity image and the ground truth for the sawtooth case. You will also need to answer the questions above. Your answers do not need to be submitted electronically. Hand-written solutions are fine as long as they are legible. Follow the instructions on the course web page for submitting the zip file.