Object recognition has been one of the major problems in computer vision.
There are several approaches to solving object recognition problems in real environments. One of the most common approaches for recognizing an object from a measured scene is the model-based recognition method. It recognizes objects by matching features extracted from the scene with stored features of the object. The model-based recognition method was introduced in an article by M. F. S. Farias et al., entitled “Multi-view Technique For 3D Polyhedral Object Recognition Using Surface Representation”, Revista Controle & Automacao, pp. 107-117, 1999, in a book by Y. Shirai, entitled “Three-Dimensional Computer Vision”, New York: Springer Verlag, and in an article by J. Ben-Arie et al., entitled “Iconic recognition with affine-invariant spectral”, In Proc. IAPR-IEEE International Conference on Pattern Recognition, volume 1, pp. 672-676, 1996. Furthermore, several methods have been introduced to recognize objects using predefined model information.
Fischler and Bolles introduced a method for recognizing an object using RANSAC in an article entitled “Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography” in Comm. Assoc. Comp. Mach., 24(6):381-395, 1981. In this method, all points are projected onto the scene, and it is determined whether the projected points are close to the points detected in the scene. An object is then recognized based on the result of this determination. This method is not very efficient because it requires iterative hypothesis and verification tasks. Olson proposed a pose clustering method for object recognition in an article entitled “Efficient pose clustering using a randomized algorithm” in IJCV, 23(2):131-147, June 1997. The disadvantages of this method are that the data size is quite large because the pose space is six-dimensional, and that a pose cluster can be detected only when sufficiently accurate poses are generated. David et al. also proposed a recognition method in an article entitled “Softposit: Simultaneous pose and correspondence determination”, 7th ECCV, volume III, pages 698-703, Copenhagen, Denmark, May 2002. In David's recognition method, matching and pose estimation are solved simultaneously by minimizing an energy function. However, the functional minimization may not converge to the minimum value because the cost function is highly non-linear.
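The iterative hypothesis and verification loop of the RANSAC approach described above can be illustrated with a minimal sketch. It is not the implementation of Fischler and Bolles. As simplifying assumptions, it estimates a 2D similarity transform instead of a full object pose, represents points as complex numbers, and uses a hypothetical function name `ransac_similarity` with illustrative values for `iters` and `tol`.

```python
import random

def ransac_similarity(model, scene, iters=2000, tol=0.1, seed=0):
    """Hypothesize-and-verify search for a 2D similarity transform z -> a*z + b.

    Points are complex numbers (x + y*1j). Returns (inlier_count, a, b).
    All names and default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    best = (0, None, None)
    for _ in range(iters):
        # Hypothesis: guess that two random model points match two random scene points.
        m0, m1 = rng.sample(model, 2)
        s0, s1 = rng.sample(scene, 2)
        a = (s1 - s0) / (m1 - m0)  # rotation and scale
        b = s0 - a * m0            # translation
        # Verification: project every model point and count those near a scene point.
        inliers = sum(
            1 for m in model
            if min(abs(a * m + b - s) for s in scene) < tol
        )
        if inliers > best[0]:
            best = (inliers, a, b)
    return best

# Toy data: the scene holds a rotated, shifted copy of the model plus two clutter points.
model = [0 + 0j, 4 + 0j, 4 + 3j, 1 + 5j, -2 + 2j]
true_a, true_b = 1j, 10 + 5j
scene = [true_a * m + true_b for m in model] + [20 + 1j, 2 + 14j]

count, a, b = ransac_similarity(model, scene)
print(count, "of", len(model), "model points explained by the best hypothesis")
```

Because correspondences are guessed at random, most iterations test a wrong hypothesis, which is the source of the inefficiency noted above.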
In addition, Johnson and Hebert proposed a spin-image based recognition algorithm for cluttered 3D scenes in an article entitled “Using spin images for efficient object recognition in cluttered 3D scenes”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, May 1999. Furthermore, Andrea Frome et al. compared the performance of 3D shape contexts with that of spin images in an article entitled “Recognizing Objects in Range Data Using Regional Point Descriptors”, European Conference on Computer Vision, Prague, Czech Republic, 2004. Jean Ponce et al. introduced a 3D object recognition approach using affine-invariant patches in an article entitled “3D Object Modeling and Recognition Using Affine-Invariant Patches and Multi-View Spatial Constraints”, CVPR, volume 2, pp. 272-280, 2003. Most recently, several authors have proposed the use of descriptors computed on image patches, for example, in an article by D. Lowe, entitled “Object recognition from local scale invariant features”, Proc. 7th International Conf. Computer Vision (ICCV '99), pp. 1150-1157, Kerkyra, Greece, September 1999.
Another approach to recognizing an object is a method based on local shape features, which is inspired by the shape context of Belongie et al. in an article entitled “Shape matching and object recognition using shape contexts”, IEEE Trans. on Pattern Analysis and Machine Intelligence, 24(4):509-522, April 2002. Owen Carmichael et al. introduced another recognition method in an article entitled “Shape-Based Recognition of Wiry Objects”, IEEE PAMI, May 2004. In this method, a histogram, or shape context, is calculated at each edge pixel in an image. Each bin in the histogram counts the number of edge pixels in a neighborhood near the pixel. After searching for the nearest neighbor and measuring the histogram distance, the method determines correspondences between shape contexts from a test image and shape contexts from model images. However, this method may not be effective when background clutter is present. To address this problem, shape context matching in highly cluttered scenes has been studied by A. Thayananthan et al. in an article entitled “Shape context and chamfer matching in cluttered scenes”, Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2003.
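The histogram computation and nearest-neighbor matching described above can be sketched as follows. This is a minimal illustration, not the implementation of the cited articles. The log-polar binning, the bin counts `n_r` and `n_theta`, the radius `r_max`, the chi-square distance, and the function names `shape_context`, `chi2`, and `match` are all assumptions made for this sketch.

```python
import math

def shape_context(points, index, n_r=3, n_theta=8, r_max=10.0):
    """Log-polar histogram of the other edge pixels as seen from points[index]."""
    px, py = points[index]
    hist = [0] * (n_r * n_theta)
    for j, (x, y) in enumerate(points):
        if j == index:
            continue
        dx, dy = x - px, y - py
        r = math.hypot(dx, dy)
        if r == 0 or r >= r_max:
            continue  # ignore coincident pixels and pixels outside the neighborhood
        # Radial bin on a logarithmic scale, angular bin on a uniform scale.
        r_bin = min(n_r - 1, int(n_r * math.log(1 + r) / math.log(1 + r_max)))
        theta = math.atan2(dy, dx) % (2 * math.pi)
        t_bin = int(theta / (2 * math.pi) * n_theta) % n_theta
        hist[r_bin * n_theta + t_bin] += 1
    return hist

def chi2(h1, h2):
    """Chi-square distance between two histograms (0 when they are identical)."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)

def match(test_pts, model_pts):
    """For each test pixel, return (index of nearest model histogram, distance)."""
    model_hists = [shape_context(model_pts, i) for i in range(len(model_pts))]
    result = []
    for i in range(len(test_pts)):
        h = shape_context(test_pts, i)
        dists = [chi2(h, mh) for mh in model_hists]
        j = min(range(len(dists)), key=dists.__getitem__)
        result.append((j, dists[j]))
    return result

# Toy data: the test "image" is the model edge set shifted by (100, 50).
model_pts = [(0, 0), (4, 0), (4, 3), (1, 5), (-2, 2), (2, 2)]
test_pts = [(x + 100, y + 50) for x, y in model_pts]
matches = match(test_pts, model_pts)
print(matches)
```

Each histogram depends only on the positions of the other edge pixels relative to the reference pixel, so a pure translation leaves it unchanged. Background edge pixels falling inside the neighborhood would add counts to the bins, which is consistent with the sensitivity to clutter noted above.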
Apart from the above methods, many other object recognition approaches have been introduced. However, most of these methods work well only when accurate 3D data or a fully textured environment is available, and they rely on single-scene information with limited features.