1. Field of the Invention
This application is based upon and claims the benefit of priority from Japanese patent application No. 2006-233016, filed on Aug. 30, 2006, the disclosure of which is incorporated herein in its entirety by reference.
The present invention relates to a system, a method, and a program for learning a parameter for identifying the type of an object in a video. More specifically, the present invention relates to an object identification parameter learning system, an object identification parameter learning method, and an object identification parameter learning program, which can identify the type of the object highly efficiently without being easily affected by the background.
2. Description of the Related Art
A method for specifying the area of a video that contains an object, which serves as preprocessing for identifying the object, has already been proposed. For example, the method described in Non-patent Document 1 (“A system for Video Surveillance and Monitoring: VSAM Final Report” by Collins, Lipton, Kanade, Fujiyoshi, Duggins, Tsin, Tolliver, Enomoto, and Hasegawa, Technical report CMU-RI-TR-00-12, Robotics Institute, Carnegie Mellon University, March 2000) can obtain a silhouette and a circumscribed rectangle of an object on an image based on changes in pixel luminance across time-series images.
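As an illustration of this kind of preprocessing, the following is a minimal sketch that assumes simple differencing of two grayscale frames. It is not the procedure of Non-patent Document 1; the function name `silhouette_and_box` and the threshold value are hypothetical.

```python
def silhouette_and_box(prev_frame, curr_frame, threshold=30):
    """Mark pixels whose luminance changed by more than `threshold`
    (an assumed value) and return the circumscribed rectangle.

    Frames are lists of rows of luminance values. The rectangle is
    (top, left, bottom, right), inclusive, or None if nothing changed.
    """
    # Silhouette: True where the luminance change exceeds the threshold.
    mask = [
        [abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for row in mask for x, changed in enumerate(row) if changed]
    if not rows:
        return mask, None
    return mask, (min(rows), min(cols), max(rows), max(cols))


# Usage: a small bright object appears in an otherwise unchanged scene.
previous = [[0] * 5 for _ in range(4)]
current = [row[:] for row in previous]
current[1][2] = current[2][2] = current[2][3] = 200
mask, box = silhouette_and_box(previous, current)
print(box)
```

The returned rectangle is the specific area that the identification methods discussed below take as their input.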
Japanese Unexamined Patent Publication 2005-285011 (Patent Document 1) and Non-patent Document 1 disclose simple conventional methods for identifying the type of an object when it is already known that an object is present within a specific area, as mentioned above. The method of Patent Document 1 identifies the type based on the size of the object on the image. The method of Non-patent Document 1 identifies the type by using the aspect ratio of the area of the object on the image. Although the processing of these two methods is simple, their results depend largely on the installation conditions of the image input device, such as a camera. In addition, size or aspect ratio alone does not provide enough information to identify the type clearly. Therefore, a high identification performance cannot be achieved.
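The aspect-ratio approach can be sketched as follows. The function name, the class labels, and both ratio thresholds are assumptions chosen for illustration; the cited documents do not give these values.

```python
def classify_by_aspect_ratio(box, tall_ratio=1.5, wide_ratio=1.5):
    """Guess the object type from its circumscribed rectangle alone.

    `box` is (top, left, bottom, right), inclusive. The thresholds are
    hypothetical and would have to be tuned for each camera placement.
    """
    top, left, bottom, right = box
    height = bottom - top + 1
    width = right - left + 1
    if height >= tall_ratio * width:
        return "pedestrian"  # clearly taller than wide
    if width >= wide_ratio * height:
        return "vehicle"     # clearly wider than tall
    return "other"


print(classify_by_aspect_ratio((0, 0, 39, 14)))  # tall, narrow rectangle
print(classify_by_aspect_ratio((0, 0, 19, 49)))  # wide, low rectangle
```

Because the rule uses only two numbers, a change in camera height or tilt that alters the projected shape of the object can change the result, which is the dependence on installation conditions noted above.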
Incidentally, Non-patent Document 1 also discloses an identifying method that uses the silhouette of an object. This method finds the centroid of the silhouette and the distance from the centroid to each point of a dot sequence on the boundary line between the object and the background. The method then identifies whether the object is a pedestrian, a vehicle, or something else, based on how the distance value increases and decreases as the dot sequence is traversed once in the clockwise direction. This method can achieve a high identification performance when the silhouette shape, such as the head and the limbs of a person, can be obtained clearly. However, it is difficult with this method to identify a target whose silhouette shape is unclear, such as a person performing a motion other than walking.
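The distance sequence used by such a silhouette method can be sketched as follows. The sketch assumes that the centroid and the clockwise-ordered boundary points are already available. Counting local maxima is only one hypothetical way to summarize the increase and decrease of the distance value; it is not the decision rule of Non-patent Document 1.

```python
import math


def distance_signature(centroid, boundary):
    """Distance from the centroid to each boundary point, in order.

    `centroid` is (y, x); `boundary` is a list of (y, x) points assumed
    to be ordered clockwise around the silhouette.
    """
    cy, cx = centroid
    return [math.hypot(y - cy, x - cx) for y, x in boundary]


def count_peaks(signature):
    """Count local maxima, treating the sequence as circular."""
    n = len(signature)
    peaks = 0
    for i in range(n):
        before = signature[i - 1]        # index -1 wraps to the last point
        after = signature[(i + 1) % n]
        if signature[i] > before and signature[i] > after:
            peaks += 1
    return peaks


# Usage: corners and edge midpoints of a square centered on the origin.
square = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
          (1, 1), (1, 0), (1, -1), (0, -1)]
signature = distance_signature((0, 0), square)
print(count_peaks(signature))
```

Protrusions such as a head or limbs appear as peaks in the sequence, so a blurred or merged silhouette removes exactly the features this approach relies on.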
Japanese Unexamined Patent Publication H11-203481 (Patent Document 2) discloses a method for identifying an object based on the motion vectors of the object. This method finds a motion vector for each small area within the area of the object, and identifies whether the object is a person or a vehicle based on the uniformity of the group of motion vectors. The method exploits the characteristic that the motion vectors within the area are not uniform when a person moves, whereas they are uniform in the case of a vehicle. Because it relies on this distinction between persons and vehicles, the method is not suitable for identifying other types of objects. A further issue is that the object cannot be identified while it temporarily stands still in the video, since no motion vector can be obtained.
Non-patent Document 2 (“A Statistical Method for 3D Object Detection Applied to Faces and Cars” by H. Schneiderman, T. Kanade, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Volume 1, p. 1746, 2000) discloses a method for identifying the type of an object through a statistical pattern recognition technique based on the texture of the object on an image, that is, based on the appearance of the object. This method can achieve a high identification performance. However, the identification performance is affected by fluctuations in the background, since what the method identifies is not the object alone but the image including the background.
Typical examples of statistical pattern recognition techniques are learning vector quantization, neural networks, support vector machines, the subspace method, optimization of identification functions, the k-nearest neighbor method, decision trees, hidden Markov models, and boosting.
Among those, learning vector quantization is described in detail in Non-patent Document 3 (“Character Recognition using Generalized Learning Vector Quantization” by Atsushi Sato, IEICE Technical Report, PRU95-219, 1996). In a statistical pattern recognition method, a parameter used for identification is obtained by using a large number of learning samples and teacher data that expresses the types of the learning samples. Hereinafter, the process of obtaining this identification parameter is referred to as “learning”.
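What “learning” means here can be illustrated with a minimal sketch. It assumes the basic LVQ1 update rule, not the generalized variant of Non-patent Document 3, and uses two-dimensional toy feature vectors. The identification parameter is the set of labeled reference vectors; all names, labels, and values are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest(prototypes, x):
    """Index of the reference vector closest to sample x."""
    return min(range(len(prototypes)),
               key=lambda i: sq_dist(prototypes[i][0], x))


def train_lvq1(samples, labels, prototypes, rate=0.1, epochs=20):
    """Learn reference vectors from samples and teacher data (labels).

    The nearest reference vector moves toward the sample when its label
    matches the teacher label and away from it otherwise.
    """
    protos = [(list(vec), lab) for vec, lab in prototypes]
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            i = nearest(protos, x)
            vec, lab = protos[i]
            sign = 1.0 if lab == y else -1.0
            protos[i] = ([v + sign * rate * (xi - v)
                          for v, xi in zip(vec, x)], lab)
    return protos


def predict(protos, x):
    """Identify x with the label of the nearest learned reference vector."""
    return protos[nearest(protos, x)][1]


# Toy learning samples with teacher data (hypothetical two-value features).
samples = [(1.0, 3.0), (1.2, 2.8), (0.8, 3.2),
           (3.0, 1.0), (2.8, 1.2), (3.2, 0.8)]
labels = ["person"] * 3 + ["vehicle"] * 3
initial = [([2.0, 2.2], "person"), ([2.0, 1.8], "vehicle")]

learned = train_lvq1(samples, labels, initial)
print(predict(learned, (1.1, 2.9)))
print(predict(learned, (2.9, 1.1)))
```

After training, identification only requires comparing a new sample with the learned reference vectors, which is why the quality of the learning samples directly determines identification performance.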
The conventional object identifying methods have at least one of the three issues described below.
The first issue is that the identification performance achieved thereby is not high.
The reason for this is that the overly simple methods evidently use too little information for identification.
The second issue is that the targets to be identified are limited.
The reason for this is that those methods utilize characteristics that are specific to pedestrians and vehicles.
The third issue is that the identification performance thereof is easily affected by the backgrounds.
The reason for this is that those methods, when identifying the object, also use background information that is irrelevant to the object.