This chapter is about how to match the pixels in two images generated by a stereo system. In chapter 7, the author discusses how to measure the depth of a point in 3D space from a pair of corresponding pixels in two images. This chapter is about how to find those pixel pairs.
In a stereo system, or more precisely a canonical stereo system, the same point in 3D space is mapped to pixels in the same row of the left and right images. Thus all we need to do is match pixels within the same row. In the book, this is treated as determining the disparity between the two images.
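The row-wise search described above can be sketched as a brute-force matcher. This is only an illustration under my own assumptions, not the book's algorithm: the function names `window_sad` and `match_row`, the window half-width `half`, and the convention that a left pixel at column x corresponds to the right pixel at column x - d are all hypothetical choices.

```python
def window_sad(left_row, right_row, x, d, half):
    """Sum of absolute differences between the window centred at x in the
    left row and the window centred at x - d in the right row."""
    return sum(
        abs(left_row[x + k] - right_row[x - d + k])
        for k in range(-half, half + 1)
    )


def match_row(left_row, right_row, max_disp, half=1):
    """For each pixel of one row, pick the disparity with the lowest cost.

    Assumes a canonical stereo pair, so candidates lie in the same row.
    Border pixels that cannot hold a full window keep disparity 0.
    """
    n = len(left_row)
    disparities = [0] * n
    for x in range(half, n - half):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disp + 1):
            if x - d - half < 0:
                break  # the shifted window would leave the image
            cost = window_sad(left_row, right_row, x, d, half)
            if cost < best_cost:
                best_d, best_cost = d, cost
        disparities[x] = best_d
    return disparities
```

Each pixel is decided independently here, which is exactly what the energy-based methods later in the chapter try to improve on.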
The general framework of stereo matching is again formulated as a labeling problem, because in effect we are labeling each pixel in one image with its corresponding pixel in the other image. Once again, we have an energy function to minimize. In this chapter the minimization is achieved through two techniques: dynamic programming and belief propagation.
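For orientation, a generic form of such a labeling energy can be written as follows. The notation is my own assumption and may differ from the book's: f assigns a label (disparity) f_p to every pixel p in the image domain Omega, and A is the set of pairs of adjacent pixels.

```latex
E(f) = \sum_{p \in \Omega} E_{\mathrm{data}}(p, f_p)
     + \sum_{(p,q) \in A} E_{\mathrm{smooth}}(f_p, f_q)
```

The data term measures how well pixel p matches the pixel selected by its label, and the smoothness term penalizes label differences between neighbors.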
To minimize the objective function through dynamic programming, we need a constraint. In stereo matching, this is the ordering constraint: the labels assigned to the pixels in the same row must follow a monotonic order, that is, matched pixels must appear in the same order along the row in both images. I think this constraint also helps to reduce the search space. The stages of the DPM defined here, I think, follow the scan order. For example, if the scan order is from left to right, labels are assigned from left to right, and the assignment for the nth pixel takes the n-1 previously assigned labels into consideration.
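A minimal sketch of dynamic programming along one row with an ordering constraint could look like this. It rests on my own assumptions rather than the book's exact formulation: a left pixel x with disparity d matches right pixel x - d, the ordering constraint is read as "x - d must not decrease along the row" (so the previous disparity must be at least d - 1), and the data term is a plain absolute difference. The name `dp_scanline` is hypothetical.

```python
def dp_scanline(left_row, right_row, max_disp):
    """Label one row with disparities by dynamic programming.

    Only a data term is minimised; the ordering constraint is enforced
    as a hard restriction on which predecessor labels are allowed.
    """
    n = len(left_row)
    inf = float("inf")

    def data_cost(x, d):
        if x - d < 0:
            return inf  # the match would fall outside the right image
        return abs(left_row[x] - right_row[x - d])

    # cost[x][d]: cheapest labelling of pixels 0..x that ends with label d
    cost = [[inf] * (max_disp + 1) for _ in range(n)]
    back = [[0] * (max_disp + 1) for _ in range(n)]
    for d in range(max_disp + 1):
        cost[0][d] = data_cost(0, d)

    for x in range(1, n):  # stages follow the left-to-right scan order
        for d in range(max_disp + 1):
            dc = data_cost(x, d)
            if dc == inf:
                continue
            best_prev, best = None, inf
            # ordering: x - d >= (x - 1) - d_prev  <=>  d_prev >= d - 1
            for d_prev in range(max(0, d - 1), max_disp + 1):
                if cost[x - 1][d_prev] < best:
                    best_prev, best = d_prev, cost[x - 1][d_prev]
            if best_prev is not None:
                cost[x][d] = dc + best
                back[x][d] = best_prev

    # backtrack from the cheapest label of the last pixel
    d = min(range(max_disp + 1), key=lambda k: cost[n - 1][k])
    labels = [0] * n
    for x in range(n - 1, -1, -1):
        labels[x] = d
        d = back[x][d]
    return labels
```

Because the allowed predecessors shrink as d grows, the constraint also prunes part of the search, which fits the remark above about a reduced search space.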
When the ordering constraint is applied, the objective function is reduced to a form that contains only the data term, and the matching uses only information along the epipolar line. A better constraint is the smoothness constraint, which takes the neighborhood of a pixel into consideration. The smoothness term defined here is a two-step Potts model, which penalizes the disparity difference according to its magnitude. The optimization process is similar. The advantage of the smoothness constraint is that it allows scan lines in multiple directions: we can conduct matching in the horizontal, vertical, or diagonal direction. Matching along multiple directions can help to improve accuracy.
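To make the smoothness term concrete, here is a small sketch of a two-step Potts penalty and of cost accumulation along a scan line in two opposite directions. The penalty values `c1` and `c2`, the function names, and the decision to simply add up the accumulated costs of all directions (in the spirit of semi-global matching, so the data term is counted once per direction) are my assumptions, not values or steps taken from the book. Vertical and diagonal directions would be added the same way on a 2D cost volume.

```python
def two_step_potts(d_a, d_b, c1=1.0, c2=4.0):
    """No penalty for equal labels, a small one for a difference of 1,
    and a larger one for any bigger jump (assumes c1 < c2)."""
    diff = abs(d_a - d_b)
    if diff == 0:
        return 0.0
    if diff == 1:
        return c1
    return c2


def aggregate(data, reverse=False, c1=1.0, c2=4.0):
    """Accumulate costs along one scan line in one direction.

    data[x][d] is the data cost of label d at pixel x.
    """
    n, m = len(data), len(data[0])
    order = range(n - 1, -1, -1) if reverse else range(n)
    acc = [None] * n
    prev = None
    for x in order:
        if prev is None:
            acc[x] = list(data[x])  # first pixel of the scan: data only
        else:
            acc[x] = [
                data[x][d]
                + min(prev[e] + two_step_potts(d, e, c1, c2) for e in range(m))
                for d in range(m)
            ]
        prev = acc[x]
    return acc


def multi_direction_labels(data):
    """Sum the accumulated costs of both directions and take the best label."""
    fwd = aggregate(data)
    bwd = aggregate(data, reverse=True)
    labels = []
    for x in range(len(data)):
        total = [f + b for f, b in zip(fwd[x], bwd[x])]
        labels.append(min(range(len(total)), key=total.__getitem__))
    return labels
```

With data costs in which a single pixel weakly prefers a different label than its neighbors, the accumulated costs pull that pixel back to the neighbors' label, which is the intended effect of the smoothness constraint.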
Since the labeling problem can also be solved with belief propagation, stereo matching can be achieved through this technique as well. The difficulty lies in the message passing equation; however, I am not going into much detail about it. In general, messages are passed through the message board. A property of the message passing equation is that a message passes more easily from a high-contrast area to a low-contrast area than in the reverse direction, and message passing is easily blocked by a step edge.
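For readers who want a rough idea of what a message looks like, the following is a sketch of the common min-sum message update used in belief propagation for labeling problems. It is a generic textbook form under my own assumptions, not necessarily the equation given in the book, and it does not model the contrast-dependent behavior mentioned above. The names `send_message` and `best_label` are hypothetical.

```python
def send_message(data_p, incoming, smooth, num_labels):
    """Min-sum message from pixel p to a neighbour q.

    data_p[h]  : data cost of label h at p
    incoming   : messages (one list per neighbour, indexed by label) that p
                 received from its neighbours other than q
    smooth(h,l): smoothness cost between label h at p and label l at q
    """
    message = []
    for l in range(num_labels):
        best = min(
            data_p[h] + smooth(h, l) + sum(m[h] for m in incoming)
            for h in range(num_labels)
        )
        message.append(best)
    lowest = min(message)
    # subtracting the minimum keeps values bounded over many iterations
    return [v - lowest for v in message]


def best_label(data_p, all_incoming):
    """Label with the lowest total of data cost and all incoming messages."""
    totals = [
        data_p[h] + sum(m[h] for m in all_incoming)
        for h in range(len(data_p))
    ]
    return min(range(len(totals)), key=totals.__getitem__)
```

A pixel that strongly prefers one label sends a message that is low for that label and higher for the others, so its preference spreads to its neighbors over the iterations.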
At the end of this chapter, the author introduces the third-eye technique. It adds a third camera that records the same scene. Then, through a coordinate transformation and the disparities calculated by stereo matching, we can generate a virtual image for the viewpoint of the third camera. We can then compare the virtual image with the image actually recorded by the third camera to evaluate the stereo matching result. Thus the third-eye technique is for evaluating and improving the results of stereo matching. To calculate the similarity between the two images, we usually compute the NCC (normalized cross-correlation).
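As an illustration of the similarity measure, here is a minimal sketch of NCC in its zero-mean form, applied to two images given as flat lists of intensities. Whether the book uses exactly this variant is an assumption on my part, and the function name `ncc` and the choice to return 0.0 for a constant image are hypothetical.

```python
import math


def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equally long sequences.

    Returns a value in [-1, 1]; 1 means identical up to gain and offset.
    Returns 0.0 if either input is constant (undefined correlation).
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    if den == 0:
        return 0.0
    return num / den
```

Because the means are subtracted and the result is divided by the standard deviations, a brighter or higher-contrast copy of the same image still scores close to 1.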
In this chapter, the author also discusses why the SSD or SAD metrics are sometimes not suitable for stereo matching. There are five reasons: invalidity of the intensity constancy assumption (ICA), local reflectance differences, differences between the cameras, perspective distortion, and the lack of a unique minimum.
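Two of these reasons are easy to reproduce with toy numbers. The sketch below uses made-up intensity values of my own: a brightness offset between the cameras makes the true match look worse than a wrong one, and a textureless region gives the same cost for every candidate, so there is no unique minimum.

```python
def sad(a, b):
    """Sum of absolute differences of two equally long windows."""
    return sum(abs(x - y) for x, y in zip(a, b))


def ssd(a, b):
    """Sum of squared differences of two equally long windows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Brightness offset: the true match has the same pattern but is
# 20 grey levels brighter; the wrong candidate is a lower-contrast patch.
left_window = [10, 50, 10]
true_match = [30, 70, 30]
wrong_match = [20, 40, 20]
offset_costs = {
    "sad_true": sad(left_window, true_match),
    "sad_wrong": sad(left_window, wrong_match),
    "ssd_true": ssd(left_window, true_match),
    "ssd_wrong": ssd(left_window, wrong_match),
}

# Textureless row: every candidate position has the same cost.
flat_left = [5, 5, 5]
flat_right_row = [5] * 8
flat_costs = [sad(flat_left, flat_right_row[s:s + 3]) for s in range(6)]
```

In the first case both SAD and SSD prefer the wrong candidate, and in the second case all six candidates tie.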