2009年3月3日

Distinctive Image Featuresfrom Scale-Invariant Keypoints

Title:Distinctive Image Featuresfrom Scale-Invariant Keypoints
Author:David G. Lowe
Date of Publication: January 5, 2004

This paper describes an approach, Scale Invariant Feature Transform(SIFT), which transforms image data into scale-invariant coordinates relative to local features. There are several major stages of computation.

(1) The scale-space extrema detection stage searches all scales and image location and uses a cascade filtering approach to identify potential interest points that are invariant to scale and orientation. That is, compute the scale space of an image, a function produced from convolution of a variable-scale Gaussian with an input image, for scale feature description across all possible scales, and then use the function to compute the difference-of-Gaussian and find the maxima and minima of the DOG which can produce the most stable image features.

(2) The keypoint localization stage fits detailed model to determine location and scale at each candidate location and selects keypoints based on measures of their stability.

(3) The orientation assignment stage assigns one or more orientations to each keypoint location based on local image gradient directions; therefore, achieve invariance to image rotation.

(4) The keypoint descriptor stage measures the local image gradients at the selected scale in the region around each keypoint, which allows for significant levels of local shape distortion and change in illumination. In other words, a keypoint descriptor is created by computing the gradient magnitude and orientation at each image sample in a region. After weighted by a Gaussian window, the samples are accumulated into orientation histograms summarizing the contents over subregions. To improve the effects of shift and illumination change, trilinear interpolation, vector normalization and thresholding are taken into consideration.

In experiment on object recognition, an approximation algorithm, Best-Bin-First(BBF), is used for efficient nearest neighbor indexing to find minimum Euclidean distance in matching. Take Hough transform to cluster features. After solving affine parameters by least-square, accept a model if final probability is higher than a threshold.

0 comments: