1. Field of the Invention
The present invention relates to a learning device, a learning method, an identification device, an identification method, and a program, and more particularly to a learning device, a learning method, an identification device, an identification method, and a program which can improve both discrimination and invariance when identifying whether or not a subject appearing in an image is a predetermined identification object.
2. Description of the Related Art
One method of identifying, from an image captured by a camera, an object located within the image as an identification object is to perform matching using a template where identification objects are broadly described.
In other words, the identification method in the related art prepares a template where identification objects are broadly described, that is, a template of textures of all identification objects, and matches an image of an object to be identified (an object to be processed) with the template.
However, in a matching process using a template where identification objects are broadly described, it is difficult to handle a hidden or distorted part of the identification object appearing in the image to be processed.
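The template matching described above, and its sensitivity to a hidden part, can be illustrated with a minimal Python sketch. The function name `match_template`, the sum-of-squared-differences score, and the toy pixel values are assumptions chosen for illustration and are not taken from the related art.

```python
def match_template(image, template):
    """Return (row, col, score) of the best match by sum of squared differences."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best = None
    # Slide the whole-object template over every position where it fits.
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            ssd = sum(
                (image[r + i][c + j] - template[i][j]) ** 2
                for i in range(th)
                for j in range(tw)
            )
            if best is None or ssd < best[2]:
                best = (r, c, ssd)
    return best


# Hypothetical 4x4 grayscale image containing the object at row 1, column 1.
image = [
    [0, 0, 0, 0],
    [0, 9, 8, 0],
    [0, 7, 9, 0],
    [0, 0, 0, 0],
]
template = [[9, 8], [7, 9]]

# Copy of the image with one pixel of the object hidden.
occluded = [row[:] for row in image]
occluded[1][1] = 0

clean_match = match_template(image, template)
occluded_match = match_template(occluded, template)
```

In this toy case the match score for the clean image is 0, while hiding a single pixel raises the best score to 81. Because the template describes the whole object, a single hidden part degrades the score for the entire object.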
There is also a method of observing local areas of an image to be processed, extracting feature quantities from each local area, and performing an identification process by employing a combination of the feature quantities of the local areas (a set of the feature quantities of the local areas), that is, a vector whose components are the feature quantities of each local area.
When a set of feature quantities of local areas is used, a high-precision identification process may be performed, because this partially solves the problem of a hidden or distorted part of an identification object, which is difficult to handle in the method using a template where identification objects are broadly described.
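As a rough sketch of the vector of local-area feature quantities described above, the following Python example divides an image into non-overlapping patches and concatenates two simple per-patch values. The function name `local_feature_vector`, the patch size, and the choice of mean and intensity range as feature quantities are all assumptions for illustration; the related art does not specify them.

```python
def local_feature_vector(image, patch=2):
    """Concatenate per-patch features (mean, max - min) into one flat vector."""
    vector = []
    for r in range(0, len(image) - patch + 1, patch):
        for c in range(0, len(image[0]) - patch + 1, patch):
            pixels = [image[r + i][c + j] for i in range(patch) for j in range(patch)]
            vector.append(sum(pixels) / len(pixels))  # mean intensity of the patch
            vector.append(max(pixels) - min(pixels))  # intensity range of the patch
    return vector


# Hypothetical 4x4 grayscale image, split into four 2x2 local areas.
image = [
    [1, 3, 5, 5],
    [1, 3, 5, 5],
    [2, 2, 0, 8],
    [2, 2, 0, 8],
]
features = local_feature_vector(image)

# Hide the top-left local area only.
hidden = [row[:] for row in image]
for i in range(2):
    for j in range(2):
        hidden[i][j] = 0
hidden_features = local_feature_vector(hidden)
```

Here only the first two components of the vector change when the top-left area is hidden; the components from the other local areas are unaffected, which is why a hidden part can be tolerated to some extent.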
A feature quantity of a local area is used for object category identification as well as individual object identification. For example, a method of identifying a specific category, such as the face of a person, using a feature quantity of a local area has been proposed (for example, see P. Viola and M. Jones, “Robust Real-time Face Detection”, CVPR 2001).
Various frameworks for category identification have been proposed. Examples include a framework using a histogram of BoF (Bag of Features) (for example, see G. Csurka, C. Bray, C. Dance, and L. Fan, “Visual Categorization with Bags of Keypoints”, ECCV 2004) and a framework using a correlation of feature quantities (for example, see Japanese Unexamined Patent Application Publication No. 2007-128195).
For example, a SIFT feature quantity (for example, see D. Lowe, “Object Recognition from Local Scale-Invariant Features”, ICCV 1999) and an output (response) of a steerable filter (for example, see J. J. Yokono and T. Poggio, “Oriented Filters for Object Recognition: an empirical study”, FG 2004) have been proposed as feature quantities of a local area for use in identification.
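The BoF histogram mentioned above can be sketched in a few lines of Python. This is a simplified illustration, not the method of the cited reference: the function name `bof_histogram`, the two-dimensional descriptors, and the fixed three-word codebook are assumptions, and in practice the codebook would typically be obtained by clustering descriptors from training images.

```python
def bof_histogram(descriptors, codebook):
    """Count nearest-codeword assignments and normalize to a histogram."""
    hist = [0] * len(codebook)
    for d in descriptors:
        # Squared Euclidean distance from this descriptor to every codeword.
        distances = [sum((a - b) ** 2 for a, b in zip(d, word)) for word in codebook]
        hist[distances.index(min(distances))] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Hypothetical codebook of three visual words (assumed to be given).
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

# Hypothetical local descriptors extracted from one image.
descriptors = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.8), (0.1, 0.9)]

histogram = bof_histogram(descriptors, codebook)
```

The resulting histogram records how often each visual word occurs and discards where in the image each local area was observed.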