Concise Computer Vision

Reading Notes: Feature Detection and Tracking

This chapter is about feature detection and how to track features in video sequences. In my view, the chapter focuses more on tracking than on detection. This is reasonable, because the detection techniques in this book focus on geometric or intensity-related features.

The chapter begins with the definition of a feature. A feature is defined by a key point and a descriptor. In my understanding, the key point is used for locating the feature and acts as an anchor. Usually, the key point is defined together with a descriptor. Three definitions are introduced in this chapter: the key point can be defined through phase congruency, through the LoG or DoG scale space, or through 3D flow vectors. In all these definitions, a pixel is the key point and the corresponding function applied to it is the descriptor.

The invariance of a feature is important for detection and tracking in image sequences or videos. If a feature varies across the images of a sequence or the frames of a video, we cannot tell whether the feature has disappeared or has merely been transformed. The chapter mentions two kinds of invariance: invariance with respect to the scene and invariance with respect to the camera used.

The invariance of a key point can be evaluated through RANSAC (Random Sample Consensus). In this book, RANSAC is used to test the correspondence of descriptors, that is, whether the same key point can be found in two subsequent images. The idea of RANSAC is to use a part of the data to estimate the parameters of a model, and to use the remaining data to test the accuracy of the model. The sample used to parameterize the model is selected at random.
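To make the RANSAC idea concrete, here is a minimal sketch on a toy problem of my own choosing (fitting a line to points with gross outliers), not the descriptor-matching setup from the book. The function names, the tolerance `tol` and the synthetic data are all assumptions made for illustration.

```python
import random

def fit_line(p, q):
    """Line y = a*x + b through two points; None if the line is vertical."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return None
    a = (y2 - y1) / (x2 - x1)
    return a, y1 - a * x1

def ransac_line(points, iterations=200, tol=0.5, seed=0):
    """Repeatedly fit a model to a random minimal sample and count its inliers."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        p, q = rng.sample(points, 2)      # random sample that parameterizes the model
        model = fit_line(p, q)
        if model is None:
            continue
        a, b = model
        # The whole data set is used to test how well the model is supported.
        inliers = [(x, y) for (x, y) in points if abs(y - (a * x + b)) <= tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers

# Synthetic data (assumed): 30 points near y = 2x + 1, plus 10 gross outliers.
data_rng = random.Random(1)
points = [(float(x), 2 * x + 1 + data_rng.uniform(-0.1, 0.1)) for x in range(30)]
points += [(data_rng.uniform(0, 30), data_rng.uniform(-50, 100)) for _ in range(10)]

model, inliers = ransac_line(points)
print("slope, intercept:", model, "inliers:", len(inliers))
```

In a feature-matching setting, the "model" would typically be a geometric transform between two frames and the "points" would be candidate key point correspondences, but the sample-fit-verify loop stays the same.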

In this chapter, three different features are introduced: SIFT, SURF and ORB.

SIFT stands for Scale-Invariant Feature Transform. It is defined based on the LoG or DoG scale space. It is invariant to rotation, scale and brightness. It calculates vectors within the disk of influence of a key point, and all these vectors together define the feature of the disk.
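As a small illustration of the DoG idea only (not of SIFT itself), the following sketch computes a difference of Gaussians on a 1-D signal and locates its strongest response. The signal, the scale `sigma = 1.6` and the scale ratio `k = 1.6` are assumptions for this toy example.

```python
import math

def gaussian_kernel(sigma):
    """Normalized, truncated 1-D Gaussian kernel (radius of about 3 sigma)."""
    radius = int(math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma):
    """Convolve the signal with a Gaussian, replicating the border samples."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for x in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(x + j - radius, 0), n - 1)
            acc += w * signal[idx]
        out.append(acc)
    return out

def difference_of_gaussians(signal, sigma, k=1.6):
    """DoG: the signal smoothed at scale k*sigma minus the signal smoothed at sigma."""
    narrow = smooth(signal, sigma)
    wide = smooth(signal, k * sigma)
    return [b - a for a, b in zip(narrow, wide)]

# Toy signal (assumed): a bright blob of width 5 centered at index 20.
signal = [0.0] * 41
for x in range(18, 23):
    signal[x] = 1.0

dog = difference_of_gaussians(signal, sigma=1.6)
extremum = max(range(len(dog)), key=lambda x: abs(dog[x]))
print("strongest DoG response at index", extremum)
```

The strongest response sits at the blob center, which is the kind of location a scale-space detector would propose as a key point candidate.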

SURF stands for Speeded-Up Robust Features. Its idea is similar to SIFT, but it has better time performance. It is also defined on the LoG or DoG scale space, but its calculation takes advantage of integral images, which were introduced in a previous chapter, and therefore has a lower computational cost.
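The speed-up from integral images comes from the fact that the sum over any axis-aligned box costs only four lookups. A minimal sketch, with function names of my own choosing:

```python
def integral_image(img):
    """ii[r][c] holds the sum of img over all rows < r and all columns < c."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii

def box_sum(ii, top, left, bottom, right):
    """Sum of img over rows top..bottom-1 and columns left..right-1."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
print(box_sum(ii, 0, 0, 3, 3))  # whole image
print(box_sum(ii, 1, 1, 3, 3))  # lower-right 2x2 block: 5 + 6 + 8 + 9
```

Once the integral image is built, the cost of a box sum does not depend on the size of the box, which is what makes box-filter approximations of Gaussian derivatives cheap.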

ORB stands for oriented robust binary features. To understand ORB, we need some knowledge of binary patterns and BRIEF. The binary pattern of a key point is like a polynomial function which sums up the intensity differences between the center pixel and the pixels on the boundary of the disk of influence. BRIEF is similar, but the terms summed up in the calculation are the differences between random pixel pairs rather than center-boundary pixel pairs. ORB is a combination of an oriented FAST (the corner detector) and a rotated BRIEF.
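The two binary descriptors can be sketched as follows. This is a simplified illustration under my own assumptions: an 8-neighbour pattern for the binary pattern, uniformly random pixel pairs for BRIEF, and the comparison conventions (`>=` and `<`) chosen here; the book's exact definitions may differ.

```python
import random

def lbp_code(img, r, c):
    """Binary pattern: one bit per boundary pixel, weighted by a power of two."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    center = img[r][c]
    code = 0
    for i, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:   # center-boundary comparison
            code += 2 ** i
    return code

def make_pairs(n_bits, radius, seed=0):
    """Random pixel-pair offsets inside a (2*radius+1)^2 window (assumed sampling)."""
    rng = random.Random(seed)
    def pick():
        return rng.randint(-radius, radius), rng.randint(-radius, radius)
    return [(pick(), pick()) for _ in range(n_bits)]

def brief_descriptor(img, r, c, pairs):
    """BRIEF-style bit string: one intensity comparison per random pixel pair."""
    return [1 if img[r + dr1][c + dc1] < img[r + dr2][c + dc2] else 0
            for (dr1, dc1), (dr2, dc2) in pairs]

def hamming(a, b):
    """Number of differing bits; the usual distance for binary descriptors."""
    return sum(x != y for x, y in zip(a, b))

tiny = [[9, 1, 9],
        [1, 5, 1],
        [9, 1, 9]]
print(lbp_code(tiny, 1, 1))  # corners are brighter than the center: 1 + 4 + 16 + 64

# A 7x7 test image (arbitrary values) and a uniformly brighter copy of it.
img = [[(3 * r * r + 5 * c + r * c) % 17 * 10 for c in range(7)] for r in range(7)]
brighter = [[v + 40 for v in row] for row in img]

pairs = make_pairs(n_bits=32, radius=2)
d1 = brief_descriptor(img, 3, 3, pairs)
d2 = brief_descriptor(brighter, 3, 3, pairs)
print(hamming(d1, d2))  # comparisons ignore a constant brightness offset
```

Because both descriptors depend only on comparisons between pixel values, adding a constant to every pixel leaves them unchanged (as long as no value is clipped).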

To test the invariance of the features above, we can try to detect the identical key points in a transformed image (rotated, resized, blurred or modified in brightness). By comparing the key points in the original image with those in the transformed image, we can evaluate whether a feature is invariant.
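Such an invariance test can be sketched as a repeatability score: the fraction of original key points that are found again at the expected position after the transform. The detector below is a deliberately simple stand-in (strict local intensity maxima), not SIFT, SURF or ORB, and all names are hypothetical.

```python
def local_maxima(img):
    """Toy key point detector: pixels strictly brighter than all 8 neighbours."""
    h, w = len(img), len(img[0])
    points = set()
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            neighbours = [img[r + dr][c + dc]
                          for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                          if (dr, dc) != (0, 0)]
            if img[r][c] > max(neighbours):
                points.add((r, c))
    return points

def rotate90(img):
    """Rotate a square image clockwise by 90 degrees."""
    n = len(img)
    return [[img[n - 1 - c][r] for c in range(n)] for r in range(n)]

def repeatability(original_points, transformed_points, mapping):
    """Fraction of original key points found again at their mapped position."""
    if not original_points:
        return 0.0
    hits = sum(1 for p in original_points if mapping(p) in transformed_points)
    return hits / len(original_points)

# Toy image (assumed): dark background with three isolated bright pixels.
n = 9
img = [[0] * n for _ in range(n)]
img[2][3], img[6][5], img[4][6] = 50, 80, 30

points = local_maxima(img)
rotated_points = local_maxima(rotate90(img))
brighter_points = local_maxima([[2 * v + 10 for v in row] for row in img])

# Under a clockwise 90 degree rotation, pixel (r, c) moves to (c, n - 1 - r).
rot_score = repeatability(points, rotated_points, lambda p: (p[1], n - 1 - p[0]))
bright_score = repeatability(points, brighter_points, lambda p: p)
print(rot_score, bright_score)
```

A score of 1.0 means every key point survived the transform; for this toy detector that holds for 90 degree rotations and brightness changes, while blurring or resizing would generally lower it.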

After discussing feature detection, the chapter moves on to feature tracking. Feature tracking is a sparse correspondence problem. On this topic, three different trackers are introduced: the Lucas-Kanade tracker, the particle tracker and the Kalman filter.

The Lucas-Kanade tracker is based on the Newton method. The tracker uses an objective function that measures the dissimilarity between the features of the same key point in two subsequent frames. The objective function is minimized through Newton iterations, which move toward the optimum along the gradient lines.
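The iteration can be illustrated in one dimension. The sketch below estimates the shift between two "frames" by repeatedly linearizing the second frame and minimizing the sum of squared differences (a Gauss-Newton style step, used here as a simplification of the Newton scheme described above). The frames are analytic Gaussian bumps so that no interpolation is needed; all names and values are assumptions.

```python
import math

def bump(x, center=10.0, sigma=2.0):
    """A smooth 1-D intensity profile standing in for an image patch."""
    return math.exp(-((x - center) ** 2) / (2 * sigma ** 2))

def bump_dx(x, center=10.0, sigma=2.0):
    """Derivative of the profile with respect to x."""
    return -(x - center) / sigma ** 2 * bump(x, center, sigma)

TRUE_SHIFT = 1.5  # the second frame is the first frame moved by 1.5 pixels

def frame1(x):
    return bump(x)

def frame2(x):
    return bump(x - TRUE_SHIFT)

def frame2_dx(x):
    return bump_dx(x - TRUE_SHIFT)

def track_shift(xs, iterations=10):
    """Find d minimizing sum over x of (frame1(x) - frame2(x + d))^2."""
    d = 0.0
    for _ in range(iterations):
        num = den = 0.0
        for x in xs:
            g = frame2_dx(x + d)               # gradient of frame 2 at the warped position
            r = frame1(x) - frame2(x + d)      # residual (dissimilarity term)
            num += g * r
            den += g * g
        if den == 0.0:
            break
        d += num / den                         # update from the linearized problem
    return d

xs = [float(x) for x in range(21)]
estimated_shift = track_shift(xs)
print(estimated_shift)
```

The same update generalizes to 2-D patches, where the scalar `den` becomes a 2x2 matrix built from image gradients and the step is the solution of a small linear system.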

The particle tracker is based on the particle filter algorithm. The particle space is model dependent, that is, the dimension of a particle depends on how many parameters (related to the feature) we use to define a particle. Each particle has its own weight, which reflects its importance and decides whether it survives into the next iteration. By repeatedly updating the weights of the particles and condensing the particles, we can track a feature or an object.
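A minimal sketch of the predict, weight and condense (resample) loop, assuming the simplest possible particle space: each particle is a single number, the position of a feature along one axis. The motion model, the noise levels and the function name are assumptions for illustration.

```python
import math
import random

def particle_track(measurements, n=500, motion=1.0,
                   motion_sigma=0.5, meas_sigma=0.5, seed=0):
    """Track a 1-D position; returns the weighted-mean estimate per time step."""
    rng = random.Random(seed)
    particles = [rng.uniform(-5.0, 5.0) for _ in range(n)]   # initial guesses
    estimates = []
    for z in measurements:
        # Predict: move every particle according to the (assumed) motion model.
        particles = [p + motion + rng.gauss(0.0, motion_sigma) for p in particles]
        # Weight: particles that explain the measurement well get a high weight.
        weights = [math.exp(-0.5 * ((z - p) / meas_sigma) ** 2) for p in particles]
        total = sum(weights)
        if total == 0.0:
            weights = [1.0 / n] * n          # degenerate case: fall back to uniform
        else:
            weights = [w / total for w in weights]
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # Condense: resample so that high-weight particles survive and multiply.
        particles = rng.choices(particles, weights=weights, k=n)
    return estimates

# Synthetic track (assumed): the feature moves one unit per frame.
data_rng = random.Random(42)
true_positions = [float(t) for t in range(1, 21)]
measurements = [x + data_rng.gauss(0.0, 0.5) for x in true_positions]

estimates = particle_track(measurements)
print(estimates[-1], true_positions[-1])
```

With more parameters per particle (for example position, velocity and scale), each particle becomes a tuple and the same three steps apply unchanged.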

The Kalman filter is a powerful tool. I have heard about it in many applications. Under the topic of feature tracking, the Kalman filter is more like a statistical tool. The changes in the images or features are described by a discrete dynamic system. We look for the solution of the system at time t based on the information available at time t-1. The solution relies on the Kalman gain, a matrix chosen so that the squared error of the estimate is minimized. This is where the Kalman filter comes from. The idea behind the Kalman filter is "noisy data in, less noisy data out".
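In the scalar case the Kalman gain is a single number rather than a matrix, which makes the predict and update steps easy to follow. The sketch below filters noisy measurements of a constant value; the noise variances `q` and `r`, the initial state and the function name are assumptions for this example.

```python
import random

def kalman_1d(measurements, q=1e-4, r=1.0, x0=0.0, p0=100.0):
    """Scalar Kalman filter for the model x_t = x_{t-1} + w_t, z_t = x_t + v_t."""
    x, p = x0, p0                 # state estimate and its error variance
    estimates, gains = [], []
    for z in measurements:
        # Predict from t-1 to t: the state stays put, the uncertainty grows by q.
        p = p + q
        # Update: this gain minimizes the expected squared error of the estimate.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
        gains.append(k)
    return estimates, gains

# Synthetic data (assumed): a constant value of 5.0 observed with unit-variance noise.
rng = random.Random(0)
true_value = 5.0
measurements = [true_value + rng.gauss(0.0, 1.0) for _ in range(50)]

estimates, gains = kalman_1d(measurements)
print("last measurement:", measurements[-1], "last estimate:", estimates[-1])
print("first gain:", gains[0], "last gain:", gains[-1])
```

The gain starts close to 1 (trust the measurement, because the initial estimate is very uncertain) and shrinks as the estimate becomes more reliable, which is the "noisy data in, less noisy data out" behaviour in miniature.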
