1. Field of the Invention
The present invention relates to a pattern recognition system for calculating a similarity (or distance) between an input pattern and a reference pattern in speech recognition or character recognition, and converting the similarity into a posterior probability to improve recognition precision.
2. Description of the Related Art
A pattern recognition method based on the multiple similarity method is known for recognizing character or speech input. However, a multiple similarity is a measure of similarity expressed by the angle between the subspace defined by a standard pattern and the space defined by the feature vector of an input pattern, and the scale of this measure differs considerably from one standard-pattern category to another. Therefore, another method has been proposed to improve the recognition ratio. In this method, the multiple similarity is converted into a posterior probability, and the posterior probability is used as an evaluation value of the similarity (Japanese Patent Disclosure (Kokai) No. 59-219799 published on Dec. 11, 1984).
Let Si denote a calculated multiple similarity. The posterior probability to be calculated is the conditional probability P(Ci|Si) that an input yielding similarity Si belongs to a certain category Ci. Obtaining this probability distribution directly would require the probability that every similarity value in the interval [0,1] belongs to each category, which is hard to realize in a practical application. Thus, the conditional probability is expanded using Bayes' theorem as follows:

$$P(C_i \mid S_i) = \frac{P(S_i \mid C_i)\,P(C_i)}{\sum_j P(S_i \mid C_j)\,P(C_j)} \qquad (1)$$

where the ratio P(Ci)/P(Cj) is a constant determined by the number of categories, because P(Ci) and P(Cj) are the probabilities that the input category is category Ci or Cj, respectively. Therefore, if P(Si|Ci) and ΣP(Si|Cj) are calculated, posterior probability P(Ci|Si) can be calculated.
P(Si|Ci) is the probability of multiple similarity Si being obtained when data belonging to category Ci is recognized using the dictionary of category Ci.
P(Si|Cj) is the probability of multiple similarity Si being obtained when data belonging to category Cj is recognized using the dictionary of category Ci.
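As a minimal numerical sketch of the Bayes' theorem expansion above, the posterior probability can be computed once the likelihood P(Si|Cj) is known for every category. The function name `posterior_from_likelihoods`, the equal-prior default, and the sample values are assumptions made for illustration only; they are not taken from the cited disclosure.

```python
def posterior_from_likelihoods(likelihoods, i, priors=None):
    """Return P(Ci | Si), given likelihoods[j] = P(Si | Cj) for every category j.

    priors[j] = P(Cj). If omitted, equal priors are assumed (an assumption of
    this sketch), in which case the priors cancel out of Bayes' theorem.
    """
    n = len(likelihoods)
    if priors is None:
        priors = [1.0 / n] * n
    # Denominator of Bayes' theorem: sum over all categories j.
    evidence = sum(l * p for l, p in zip(likelihoods, priors))
    if evidence == 0.0:
        return 0.0  # similarity value never observed for any category
    return likelihoods[i] * priors[i] / evidence


# Hypothetical likelihoods of one similarity value under three categories.
likelihoods = [0.8, 0.1, 0.1]
p = posterior_from_likelihoods(likelihoods, 0)
```

With equal priors, the result depends only on the ratio of P(Si|Ci) to the sum of P(Si|Cj) over all categories, which is why those two quantities suffice.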
Of these probability distributions, as shown for example in FIGS. 1A to 1D, the distribution of similarities S1 obtained when an input category (C1) is recognized using the dictionary of the same category (C1) is concentrated near 1.0. In contrast, similarities S1 obtained when the input category (C2) differs from the category of the dictionary (C1) are distributed over a wider range of smaller values.
Therefore, when such probability distributions are stored as tables, posterior probability P(Ci|Si) shown in FIG. 1D can be calculated by equation (1).
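The table-based conversion described above might be sketched as follows. This is an illustrative assumption rather than the implementation of the cited disclosure: the similarity range [0,1] is divided into a hypothetical number of equal bins, each table is a normalized histogram of training similarities, equal priors are assumed for all categories, and the names `build_table`, `posterior_table`, and `lookup` as well as the sample data are invented for the example.

```python
def build_table(similarities, n_bins=10):
    """Normalized histogram of similarity values in [0, 1]."""
    counts = [0] * n_bins
    for s in similarities:
        # A similarity of exactly 1.0 is placed in the last bin.
        idx = min(int(s * n_bins), n_bins - 1)
        counts[idx] += 1
    total = len(similarities)
    return [c / total for c in counts]


def posterior_table(own_table, other_tables):
    """Per-bin P(Ci | Si), assuming equal priors for all categories."""
    result = []
    for b, own in enumerate(own_table):
        denom = own + sum(t[b] for t in other_tables)
        # Bins with no training data are given posterior 0.0 in this sketch.
        result.append(own / denom if denom > 0 else 0.0)
    return result


def lookup(table, s):
    """Convert a similarity s in [0, 1] into a posterior via the table."""
    n_bins = len(table)
    return table[min(int(s * n_bins), n_bins - 1)]


# Hypothetical similarities, all measured with the dictionary of category C1.
own = [0.95, 0.97, 0.92, 0.99, 0.85]    # inputs belonging to C1
other = [0.35, 0.55, 0.62, 0.48, 0.85]  # inputs belonging to C2
table = posterior_table(build_table(own), [build_table(other)])
```

Here `lookup(table, 0.97)` falls in a bin populated only by C1 inputs, while `lookup(table, 0.85)` falls in a bin shared by both categories and therefore yields a lower posterior.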
However, designing such a table poses the following problem.
That is, in order to convert a multiple similarity into a posterior probability, the shapes of the two probability distributions mentioned above are important. In this case, the shapes of the posterior probability distributions differ depending on the individual categories (e.g., some categories are easily recognized and others are not; some categories are similar to each other and others are not; and so on). However, calculating P(Si|Ci) and ΣP(Si|Cj) for each category requires a large volume of data, so that it becomes impossible to form a table for each category.
Therefore, in a conventional system, a single table common to all the categories is formed, and conversion to a posterior probability is performed using it. This sacrifices accuracy, and the recognition ratio is inevitably decreased.
In summary, in a conventional pattern recognition system which converts a multiple similarity into a posterior probability to evaluate a similarity, it is difficult to form a conversion table for obtaining a posterior probability for each category. Therefore, a simple table must be used instead, and hence the recognition ratio is decreased.