March 22, 2009

Shape Matching and Object Recognition Using Shape Contexts

Title: Shape Matching and Object Recognition Using Shape Contexts
Authors: Serge Belongie, Jitendra Malik, Jan Puzicha
Publisher: IEEE
Month of Publication: April 2002

The paper proposes a simple and robust algorithm for finding correspondences between shapes. The authors introduce a shape descriptor, the shape context, and recover correspondences by minimizing the total matching cost of a bipartite graph matching. Experiments on 2D objects (e.g., handwritten digits, silhouettes, and trademarks) and on 3D objects from the Columbia COIL data set demonstrate improved performance.

First, a rich local descriptor, the shape context, is proposed to make matching easier. It considers the set of vectors from a point to all other sample points on the shape. For each point, a coarse histogram of the relative coordinates of the remaining points is computed over bins that are uniform in log-polar space, so that nearby points are weighted more heavily than distant ones. The chi-square test statistic between two histograms serves as the matching cost C and measures their similarity. Second, when minimizing the total cost of the bipartite graph matching, scale invariance is obtained by normalizing all radial distances by the mean pairwise distance, and rotation invariance by measuring angles relative to the tangent direction at each point. Moreover, "dummy" nodes can be added for robust handling of outliers. Third, to model the transformation, they use the thin plate spline (TPS) model, which includes the affine model as a special case, so transformations can be estimated in a few iterations. Finally, the shape distance is estimated as a weighted sum of three terms: shape context distance, image appearance distance, and bending energy. A prototype-based approach is then applied: a variant of K-means, K-medoids, selects representative examples for each category, and a query shape is classified according to the prototype with the minimal cost.
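To make the descriptor and matching cost more concrete, here is a minimal NumPy sketch of the log-polar shape context histogram and the chi-square cost. The function names, bin counts, and inner/outer radii are illustrative assumptions rather than a reproduction of the paper's implementation, and rotation invariance via the tangent angle is omitted for brevity.

```python
import numpy as np

def shape_context(points, n_r=5, n_theta=12, r_inner=0.125, r_outer=2.0):
    """Log-polar shape context histogram for every sampled contour point.

    points : (N, 2) array of sample coordinates on the shape.
    Returns an (N, n_r * n_theta) array of normalized histograms.
    """
    n = len(points)
    diff = points[None, :, :] - points[:, None, :]        # vectors point i -> point j
    dists = np.linalg.norm(diff, axis=2)
    angles = np.arctan2(diff[..., 1], diff[..., 0]) % (2 * np.pi)

    # Scale invariance: normalize radial distances by the mean pairwise distance.
    dists = dists / dists[~np.eye(n, dtype=bool)].mean()

    r_edges = np.logspace(np.log10(r_inner), np.log10(r_outer), n_r + 1)
    t_edges = np.linspace(0.0, 2 * np.pi, n_theta + 1)

    hists = np.zeros((n, n_r * n_theta))
    for i in range(n):
        mask = np.arange(n) != i                           # exclude the point itself
        r_bin = np.clip(np.digitize(dists[i, mask], r_edges) - 1, 0, n_r)
        t_bin = np.clip(np.digitize(angles[i, mask], t_edges) - 1, 0, n_theta - 1)
        keep = r_bin < n_r                                 # drop points beyond r_outer
        flat = r_bin[keep] * n_theta + t_bin[keep]
        h = np.bincount(flat, minlength=n_r * n_theta).astype(float)
        hists[i] = h / max(h.sum(), 1.0)                   # normalize to a distribution
    return hists

def chi2_cost(h1, h2, eps=1e-10):
    """Chi-square matching cost C_ij between two sets of shape contexts."""
    num = (h1[:, None, :] - h2[None, :, :]) ** 2
    den = h1[:, None, :] + h2[None, :, :] + eps
    return 0.5 * (num / den).sum(axis=2)
```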
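The matching step itself can be sketched with the Hungarian algorithm from SciPy (scipy.optimize.linear_sum_assignment). Padding the cost matrix with dummy nodes at a constant cost is one standard way to realize the robust outlier handling described above; the dummy cost value here is an assumed parameter, not the paper's setting.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_with_dummies(cost, dummy_cost=0.25):
    """Bipartite matching on a chi-square cost matrix padded with dummy nodes.

    cost : (n1, n2) matching costs between two sets of sample points.
    Surplus points (when n1 != n2) and points without a cheap real match
    tend to be assigned to dummies, which handles outliers robustly.
    Returns only the real (i, j) correspondences.
    """
    n1, n2 = cost.shape
    size = n1 + n2
    padded = np.full((size, size), dummy_cost)
    padded[:n1, :n2] = cost        # real-to-real matching costs
    padded[n1:, n2:] = 0.0         # dummy-to-dummy assignments are free
    rows, cols = linear_sum_assignment(padded)
    return [(i, j) for i, j in zip(rows, cols) if i < n1 and j < n2]
```

In the paper's pipeline, the recovered correspondences would then be used to estimate the TPS transformation, and matching and transformation estimation are alternated for a few iterations.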

In conclusion, the method is able to retrieve objects whose shapes are similar to the query, and the performance is clearly improved.

In my opinion, it is thoughtful that they consider several key points and construct the distance function as a weighted sum of three terms. However, the additional parameters may bias the query results. Perhaps they should report individual statistics for each term to convince us that the weighting function is really reliable. Besides, I think the modified K-means (K-medoids) may be helpful in some cases. The outlier removal and warped transformation seem useful for removing noise, as shown in Figure 4.
