A category is a term used in the field of pattern recognition and indicates a classification of patterns. It may also be referred to as a class. In general terms, it is referred to as a “kind” or “group”. For example, when an image is identified as “face” or “not-face”, there are two categories, “face” and “not-face”. In the case of “child's face”, “adult face”, “old person's face” and “not-face”, the number of categories is four. A pattern indicates any data such as an image, a sound or a character.
Hierarchical categories, namely categories having a hierarchical structure, are categories arranged in a hierarchical tree structure having one or more roots, or categories regarded as having such a structure. A category near the root is referred to as high order. A category near a leaf of the tree is referred to as low order. FIG. 1 is a tree diagram showing an example of the hierarchical categories according to conventional art. The figure shows a tree structure having three roots, “hand”, “face” and “other”.
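As an illustration only, a hierarchy of this kind could be held as a map from each category to its parent. In the sketch below, only the three roots “hand”, “face” and “other” come from the description of FIG. 1. The lower order category names and the helper names `PARENT`, `path_to_root` and `root_of` are hypothetical.

```python
# Hypothetical hierarchy: only the three roots ("hand", "face", "other")
# come from the text; the child category names are invented for illustration.
PARENT = {
    "hand": None,
    "face": None,
    "other": None,
    "open hand": "hand",
    "fist": "hand",
    "child's face": "face",
    "adult face": "face",
}


def path_to_root(category):
    """Return the categories from `category` up to its root (low order first)."""
    path = [category]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path


def root_of(category):
    """Return the highest order category to which `category` belongs."""
    return path_to_root(category)[-1]
```

With such a map, an identification result for a low order category can be converted to the corresponding high order category by following the parent links.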
Japanese Laid Open Patent Application (JP-P 2000-76382A) discloses one example of a learning method for identifying hierarchical categories. The dictionaries according to the method are hierarchical dictionaries consisting of low order dictionaries of detailed classification and high order dictionaries of classification. The dictionaries of detailed classification are constructed by using a learning database. The high order dictionaries of classification consist of templates classified based on the features of the templates of the low order dictionaries of detailed classification. Here, a template is a pattern to be referred to in identification. A synonym is the term reference vector. FIG. 2 shows a configuration of dictionary data according to conventional art. A learning method having such a dictionary configuration has a problem that the number of templates is large, because the dictionary separately holds templates for every hierarchy, from the high order hierarchy to the low order hierarchy.
On the other hand, as a learning method for a dictionary configuration in which the templates of the high order hierarchy and the templates of the low order hierarchy are not separated, the following method is easily conceived. That is, templates corresponding to the lowest order hierarchy are prepared by learning while disregarding the hierarchical structure. In this method, conventional template preparing procedures which do not take hierarchical categories as targets, such as LVQ (Learning Vector Quantization), are used. However, there is another problem that the dictionary of templates prepared by this method can identify the categories of the lowest order hierarchy but cannot necessarily identify the categories of the high order hierarchy correctly. The reason will be described below.
FIG. 3A and FIG. 3B are schematic diagrams showing a conventional learning method for the dictionary configuration in which the templates of the high order hierarchy and the templates of the low order hierarchy are not separated. Examples in which LVQ2 is used for updating the templates are shown. The learning rule of LVQ2 is an improvement on that of LVQ1. A circle symbol represents a template (first order hierarchy category number 1, second order hierarchy category number 1). A rectangle symbol represents a template (first order hierarchy category number 1, second order hierarchy category number 2). A triangle symbol represents a template (first order hierarchy category number 2, second order hierarchy category number 3). A diamond symbol represents a learning sample pattern (first order hierarchy category number 1, second order hierarchy category number 1). A vector (arrow) represents the direction and magnitude of a template update. A line represents a boundary between categories. FIG. 3A shows the case in which the templates are updated based on the categories of the lowest order hierarchy. FIG. 3B shows the case in which the templates are updated based on categories other than the categories of the lowest order hierarchy.
The reason why the categories of the high order hierarchy cannot necessarily be identified correctly, although the categories of the lowest order hierarchy can be identified, is that learning based on the categories of the lowest order hierarchy may not be appropriate for classification based on the high order hierarchy in some cases. As shown in FIG. 3A and FIG. 3B, a template close to the learning sample pattern is selected from each of the answer category and the rival category, and each of the selected templates is updated. A template close to the learning sample pattern has high similarity to the learning sample pattern. A sample pattern means a pattern extracted from a parent population for some purpose. In a general learning method, a large number of patterns are prepared for the learning. Such a pattern used for learning is referred to as a learning sample pattern or training pattern. The category to which the learning sample pattern belongs is referred to as the answer category or teacher data. The category to which a template belongs is not referred to as an answer category. The category number assigned to the answer category is referred to as the answer category number. In an actual program, one number is assigned to one category. FIG. 3A shows an example of an update of templates that is appropriate for the categories of the lowest order hierarchy but is not appropriate for the categories of the high order hierarchy. On the other hand, FIG. 3B shows an example in which an appropriate update of templates is executed for the categories of the high order hierarchy.
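The difference between the two cases can be sketched as follows. This is one possible reading of FIG. 3A and FIG. 3B, not a procedure stated in the text: the answer template and the rival template are selected by comparing category numbers only down to a given hierarchy level. The coordinates, the function names and the `level` parameter are assumptions made for illustration. The category number pairs are those given for the circle, rectangle, triangle and diamond symbols.

```python
import math

# Labels are (first order category number, second order category number).
templates = [(1.0, 0.0), (0.0, 1.5), (3.0, 0.0)]  # circle, rectangle, triangle
labels = [(1, 1), (1, 2), (2, 3)]
sample = (0.0, 0.0)                               # diamond, category (1, 1)
sample_label = (1, 1)


def nearest(x, keep):
    """Index of the template closest to x among those whose label passes keep()."""
    candidates = [i for i, lab in enumerate(labels) if keep(lab)]
    return min(candidates, key=lambda i: math.dist(x, templates[i]))


def select_answer_and_rival(x, x_label, level):
    """Select templates by comparing labels down to `level` hierarchies.

    level=2 compares lowest order categories, level=1 only the first order.
    """
    def same(lab):
        return lab[:level] == x_label[:level]

    answer = nearest(x, same)
    rival = nearest(x, lambda lab: not same(lab))
    return answer, rival
```

With these assumed coordinates, `level=2` selects the rectangle as the rival even though it shares the first order category with the sample, whereas `level=1` selects the triangle. An update that pushes the rectangle away is therefore meaningful only for the lowest order hierarchy.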
As related art, Japanese Laid Open Patent Application (JP-P 2002-334303A) discloses a character identifying apparatus. The character identifying apparatus is equipped with a standard pattern dictionary which includes standard patterns, each of which represents a feature of a character. The character identifying apparatus identifies a character pattern as a target of identification based on a comparison between the character pattern and the standard patterns. Moreover, character statistic information of record is prepared, the character is weighted based on the character statistic information, and the result of the character identification is outputted. At this time, the result of identification is obtained based on the result of the comparison between the feature of the character pattern and the standard patterns and on the weight used for the weighting.
As another related art, Japanese Laid Open Patent Application (JP-P 2001-184509A) discloses a pattern identifying apparatus. Based on a current feature selection dictionary, the pattern identifying apparatus extracts m dimension reference patterns from n (m<n) dimension reference patterns in a dictionary for detailed identification, and extracts an m dimension feature from an n dimension feature extracted from a learning pattern. After that, the reference pattern closest to the m dimension feature is extracted as a reference pattern A from the m dimension reference patterns belonging to the same category as the learning pattern. The reference pattern closest to the m dimension feature is extracted as a reference pattern B from the m dimension reference patterns belonging to a category different from the category to which the learning pattern belongs. The feature selection dictionary is updated such that the distance between the m dimension feature and the reference pattern A becomes shorter and the distance between the m dimension feature and the reference pattern B becomes longer.
Moreover, as another related art, Japanese Patent Publication No. 2779119 discloses a learning identifying apparatus. The learning identifying apparatus is provided with a means for calculating for a plurality of category groups. The means for calculating calculates a group membership degree as degree of belonging of an input pattern signal to the category group of which a major classification section is composed of a set of similar patterns. An in-group similarity calculating section calculates an in-group similarity as a degree of belonging of the input pattern signal to a category in each category group. A plurality of multipliers calculate the products of the in-group similarities and the group membership degrees. An identification signal weighting section weights the in-group similarities. A category identifying section compares the weighted in-group similarity signals. Moreover, a teacher signal necessary for learning is generated, and an amount of weight change of the in-group similarity calculation section is controlled based on the teacher signal, an output from the identification signal weighting section and an output from the category identifying section. The plurality of multipliers calculate the product of a learning control signal outputted from the learning controller section and the group membership degree outputted from the major classification section. The learning control signal is weighted, and a weight coefficient of the in-group similarity calculating section is updated based on an output from the identification signal weighting section and an output from the in-group similarity calculating section.
As another related art, Japanese Laid Open Patent Application (JP-P 2003-123023A) discloses a character identifying method. This character identifying method includes: a step for obtaining an optimum binarization threshold for an input gray scale image by inputting a gray scale distribution of the input gray scale image, referring to examples of gray scale distributions in an estimation table provided in advance, selecting the gray scale distribution which is most similar to the gray scale distribution of the input gray scale image, obtaining an initial threshold for the input gray scale image, and adding a threshold difference value accompanying the selected gray scale distribution to the initial threshold; a step for binarizing the input gray scale image based on the optimum binarization threshold; a step for cutting out a character region from the binarized image; a step for calculating a similarity between the character region and each template of a template dictionary provided in advance; and a step for determining the character of the category of the template of the highest similarity among the similarities as an identification result.
Learning methods such as the above conventional art can be applied to an identifying apparatus for identifying a data pattern such as a character or an image. The identifying apparatus means an apparatus which outputs a category as an identification result based on an input pattern. The identifying apparatus is often implemented as a program or as a function in a program. FIG. 4 is a block diagram of a typical identifying apparatus, namely, a system for identifying a pattern. In FIG. 4, a preprocessing and feature extracting section 10 preprocesses an input pattern, extracts a feature thereof and converts it into a feature pattern. An identification section 20 identifies the category to which the feature pattern belongs, obtains a category label and outputs it from the system as an identification result. As a method of feature extraction, for example, in the case of an image pattern, the conventional feature extracting method disclosed in Japanese Laid Open Patent Application (JP-A-Heisei, 01-321589) can be used.
The category label is a value defined in the program and is assigned to each category. For example, when the images of a human and a monkey are given as the input patterns, “0” is assigned to the human image and “1” is assigned to the monkey image. The category label is not required to be a numeral. For example, a character string of “human” may be returned for the human image and a character string of “monkey” may be returned for the monkey image. The category labels are only required to distinguish the categories.
Conventionally, an identifying method based on a pattern representing a category is known as a method of identifying pattern corresponding to the identification section 20 in FIG. 4. Such a pattern is referred to as a template, a representative vector, a reference vector or a prototype. Each category may have one template or a plurality of templates. Generally, the template is a pattern having the same data format as the input pattern after the feature extraction.
When the feature pattern in FIG. 4 is identified, the similarities between the feature pattern and the templates are evaluated by using a certain scale, and the feature pattern is identified as belonging to the category of the template of the highest similarity. The similarity means a degree indicating the resemblance between one pattern and another pattern. As the simplest example, a correlation value or a Euclidean distance is used to calculate it. The scale of similarity is only required to evaluate the resemblance. For example, there are: a method in which the feature pattern and the template are regarded as vectors, the Euclidean distance between the two vectors is calculated and the similarity is determined to be higher as the distance is shorter; and a method in which the inner product of the two vectors is calculated and the similarity is determined to be higher as the angle between the two vectors is smaller.
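The two similarity scales described above can be sketched as follows. This is a minimal illustration, assuming that the feature pattern and the templates are numeric vectors of equal length. The function names are hypothetical, and the angle based scale is written here as the cosine of the angle, which becomes larger as the angle becomes smaller.

```python
import math


def euclidean_distance(x, t):
    """Distance between feature pattern x and template t (shorter = more similar)."""
    return math.dist(x, t)


def cosine_similarity(x, t):
    """Inner product normalized by the vector lengths (larger = smaller angle)."""
    dot = sum(a * b for a, b in zip(x, t))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_t = math.sqrt(sum(b * b for b in t))
    return dot / (norm_x * norm_t)


def classify_nearest(x, templates, labels, use_cosine=False):
    """Return the label of the template with the highest similarity to x."""
    indices = range(len(templates))
    if use_cosine:
        best = max(indices, key=lambda i: cosine_similarity(x, templates[i]))
    else:
        best = min(indices, key=lambda i: euclidean_distance(x, templates[i]))
    return labels[best]
```

For example, with the templates `[(1.0, 0.0), (0.0, 1.0)]` labeled “face” and “not-face”, the feature pattern `(0.9, 0.2)` is classified as “face” under either scale.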
As a method of determining the category based on the similarities between the feature pattern and each of a plurality of templates, there are methods other than the method in which the feature pattern is classified into the category of the template of the highest similarity. For example, there is a method in which the k templates of highest similarity are obtained, a majority operation is carried out on the category labels of the k templates, and the feature pattern is classified into the category having the largest number of templates among them.
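The majority operation over the k most similar templates can be sketched as below. The sketch assumes the Euclidean distance as the similarity scale. The function name `classify_knn` is hypothetical, and the handling of ties is a choice made here, not something specified in the text.

```python
import math
from collections import Counter


def classify_knn(x, templates, labels, k):
    """Classify x by majority vote over the labels of the k nearest templates."""
    # Sort template indices from the most similar (shortest distance) onward.
    order = sorted(range(len(templates)), key=lambda i: math.dist(x, templates[i]))
    votes = Counter(labels[i] for i in order[:k])
    # On a tie, most_common keeps first-encountered order, so the label of
    # the closer template wins.
    return votes.most_common(1)[0][0]
```

With `k=1`, this reduces to the method in which the category of the single most similar template is selected.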
As template preparing methods for the above-mentioned identifying method, in which the identification is carried out based on the similarities between the feature pattern and the templates, namely, as conventional methods known as learning methods of templates, there are: the “method in which an average of learning sample patterns is simply obtained”, the “K-means method”, “LVQ2.1 (learning vector quantization 2.1)” (T. Kohonen, Self-Organization and Associative Memory, Springer-Verlag, 1989), and “Generalized LVQ (generalized learning vector quantization)” (SATO Atsushi, Character Recognition using Generalized Learning Vector Quantization, IEICE (The Institute of Electronics, Information and Communication Engineers) technical report). The “method in which an average of learning sample patterns is simply obtained” is a method in which the average pattern of the learning sample patterns belonging to one category is used as a template, so that one template is prepared for one category. The “K-means method” is a method in which the learning sample patterns belonging to one category are separated into a plurality of sets and the average pattern of the learning sample patterns in each set is used as a template. In the “K-means method”, based on the similarities between the templates and the sample patterns at one step, the way of separating the sample patterns is changed in the next step, and the templates are thus updated two or more times. One or more templates are prepared for each category by applying the K-means method to the learning sample patterns belonging to each category.
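The first two template preparing methods can be sketched as follows, applied to the learning sample patterns of a single category. This is a simplified illustration. Initializing the templates from the first k sample patterns, the fixed number of iterations and the function names are assumptions, not details given in the text.

```python
import math


def mean_pattern(patterns):
    """Average pattern of a set of sample patterns (one template per category)."""
    n = len(patterns)
    return tuple(sum(p[d] for p in patterns) / n for d in range(len(patterns[0])))


def kmeans_templates(patterns, k, iterations=10):
    """Prepare k templates for one category by repeated regrouping and averaging."""
    # Assumed initialization: use the first k sample patterns as templates.
    templates = [tuple(p) for p in patterns[:k]]
    for _ in range(iterations):
        # Separate the sample patterns into sets by their nearest template.
        groups = [[] for _ in range(k)]
        for p in patterns:
            j = min(range(k), key=lambda i: math.dist(p, templates[i]))
            groups[j].append(p)
        # Replace each template with the average of its set (keep it if empty).
        templates = [mean_pattern(g) if g else templates[j]
                     for j, g in enumerate(groups)]
    return templates
```

Calling `kmeans_templates` once per category, with the learning sample patterns of that category, yields one or more templates for each category.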
In “LVQ2.1”, the Euclidean distance between a learning sample pattern and each of the templates, which have been prepared in advance by some method, is obtained. The templates are then updated as follows. Among the templates belonging to the same category as the answer category of the learning sample pattern, the template of the shortest distance is moved toward the learning sample pattern. Among the templates belonging to a category different from the answer category of the learning sample pattern, the template of the shortest distance is moved away from the learning sample pattern. “Generalized LVQ” is a method including the procedures of “LVQ2.1”. In “Generalized LVQ”, the convergence of the update of the templates is assured.
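One update step following the description above can be sketched as below. The sketch implements only what is described here, namely attracting the nearest template of the answer category and repelling the nearest template of another category. It omits the window condition of Kohonen's LVQ2.1, and the learning rate `alpha` and the function name are assumptions.

```python
import math


def lvq_update(sample, answer_label, templates, labels, alpha=0.1):
    """Return the templates after one update for a single learning sample pattern."""
    def nearest(keep):
        idx = [i for i, lab in enumerate(labels) if keep(lab)]
        return min(idx, key=lambda i: math.dist(sample, templates[i]))

    a = nearest(lambda lab: lab == answer_label)  # nearest answer category template
    r = nearest(lambda lab: lab != answer_label)  # nearest rival category template

    updated = [list(t) for t in templates]
    for d, x in enumerate(sample):
        updated[a][d] += alpha * (x - templates[a][d])  # move toward the sample
        updated[r][d] -= alpha * (x - templates[r][d])  # move away from the sample
    return [tuple(t) for t in updated]
```

Repeating this step over all learning sample patterns, typically with a small `alpha`, gradually shifts the category boundaries formed by the templates.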
In the conventional art, there is a first problem that the size of the dictionary of templates for identifying hierarchical categories is large. The reason is that templates are held for each category of each hierarchy, from the high order hierarchy to the low order hierarchy. When the templates are prepared by using the conventional learning method based on the categories of the lowest order hierarchy, there is a second problem that good performance is not assured for the categories of the high order hierarchy. The reason is that learning based on the categories of the lowest order hierarchy is not appropriate for classification based on the high order hierarchy in some cases. The conventional art has one of the two above-mentioned problems.