Estimating a specific parameter from an input image is a common task in pattern information processing. Examples include extracting the positions of the eyes and nose from an image of a person's face, and extracting the position of a number plate from an image of a car.
Conventionally, the most popular method for such processing is the so-called matched filter method, in which the input is compared and collated against a template, and many examples using this method have been proposed. A method of extracting features of a face based on this approach is described in detail in R. Brunelli, T. Poggio, “Face Recognition: Features versus Template”, IEEE Trans. Patt. Anal. Machine Intell., vol. PAMI-8, pp. 34-43, 1993.
Such a conventional method has the problem that the processing time is long and the processing cost increases accordingly. When normalized correlation is employed as the similarity measure, the number of pixels of the input image is denoted by S, that of the template is denoted by T, and a multiplication is taken as the unit operation, 2×T×S operations are needed. When this is applied to the extraction of features from a face image with S = 150×150 = 22500 (pel) and T = 50×20 = 1000 (pel), 2×1000×22500 = 45 million multiplications are needed. Although the operating speed of computers has certainly improved, this remains an enormous computational cost.
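The cost estimate above can be illustrated with a minimal sketch of brute-force template collation. The sketch makes several assumptions that are not stated in the text: the function names are hypothetical, the similarity is an uncentered normalized correlation (no mean subtraction), the template norm is computed once and not counted, and only placements where the template fits entirely inside the image are evaluated. Under those assumptions each placement costs 2×T multiplications (T for the cross term and T for the window energy), so the counted total is somewhat below the 2×T×S estimate, which treats every pixel as a placement.

```python
import math


def estimated_multiplications(image_pixels, template_pixels):
    """Cost model from the text: 2 x T x S multiplications."""
    return 2 * template_pixels * image_pixels


def match_template(image, template):
    """Brute-force normalized correlation (illustrative sketch).

    image and template are lists of rows of numbers.
    Returns (best_row, best_col, best_score, multiplications).
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    t_flat = [v for row in template for v in row]
    # Template norm is computed once; it is not included in the count.
    t_norm = math.sqrt(sum(v * v for v in t_flat))
    best_row, best_col, best_score = 0, 0, -1.0
    mults = 0
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            cross = 0.0   # sum of pixel * template value
            energy = 0.0  # sum of pixel * pixel inside the window
            k = 0
            for dr in range(th):
                row = image[r + dr]
                for dc in range(tw):
                    p = row[c + dc]
                    cross += p * t_flat[k]  # one multiplication
                    energy += p * p         # one multiplication
                    k += 1
            mults += 2 * len(t_flat)
            denom = math.sqrt(energy) * t_norm
            score = cross / denom if denom > 0 else 0.0
            if score > best_score:
                best_row, best_col, best_score = r, c, score
    return best_row, best_col, best_score, mults


if __name__ == "__main__":
    # Figures from the text: S = 150 x 150, T = 50 x 20.
    print(estimated_multiplications(150 * 150, 50 * 20))
```

With the figures from the text, `estimated_multiplications(22500, 1000)` evaluates to 45,000,000. The inner loops show where the cost comes from: every candidate placement touches every template pixel twice.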
In many cases, data such as the average of all learning data are used as the template for collation. As a result, collation may fail depending on the environment. To address this, there is a technique of preparing a plurality of templates corresponding to different input images and calculating the similarity for each of them. However, since the amount of processing increases with the number of templates, the processing time on a computer increases further.
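The growth in cost with the number of templates can be made concrete with a short worked calculation. This is only a sketch under stated assumptions: the function name is hypothetical, all templates are assumed to have the same number of pixels T, each template is assumed to be collated independently at the 2×T×S cost given earlier, and the template counts 5 and 10 are illustrative values that do not appear in the text.

```python
def multi_template_multiplications(image_pixels, template_pixels, num_templates):
    """Total multiplications when each of num_templates templates
    is collated independently at a cost of 2 x T x S."""
    return num_templates * 2 * template_pixels * image_pixels


if __name__ == "__main__":
    S = 150 * 150  # input image pixels, from the text
    T = 50 * 20    # template pixels, from the text
    for n in (1, 5, 10):  # hypothetical template counts
        print(n, multi_template_multiplications(S, T, n))
```

Under these assumptions the cost is linear in the number of templates: ten templates of this size would require 450 million multiplications for a single input image.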