The present invention relates to a pattern recognition system for recognizing an input pattern to output a category string as a recognition result and, more particularly, to an improvement in the pattern recognition system for selecting a valid category string as the recognition result from a category candidate graph which is obtained from the input pattern.
An input pattern can be recognized if the categories contained in the input pattern are recognized individually and linked together to form a category string. The thus obtained category string can be used as the recognition result of the input pattern.
It is, however, generally difficult to recognize the respective categories in the input pattern individually. Where the boundaries between the categories in the input pattern are ambiguous, as in the case of continuously pronounced speech that is to be recognized as a syllable sequence, it is difficult to determine the segments in which the individual categories of the input pattern are positioned.
Therefore, the following method has been used in the prior art, as is disclosed in Japanese Patent Laid-Open No. 58-55995 entitled "Speech Recognition System".
At the stage where the respective categories (e.g., syllables in the above-specified Japanese Patent Application) in the input pattern are to be recognized individually, several candidate segments for the respective categories and several category candidates are obtained. The respective candidates are assigned measures indicating their recognition accuracy. (These measures will be called "costs" in the following, with a lower cost representing a better recognition accuracy. Similarity data may be used to establish recognition accuracy.)
As a result, a directed graph can be obtained for the input pattern in which the boundaries of the category segments are used as nodes, a plurality of branches are provided for the plural category candidates corresponding to the category segments, and the category names and the costs are assigned to the respective branches. This directed graph will be called a "candidate graph" hereinafter. The start node and the terminal node of the candidate graph correspond to the start point and the terminal point of the input pattern, respectively. A plurality of paths exist from the start node to the terminal node of the candidate graph, and each path is a candidate for the category string corresponding to the input pattern. The sum of the costs of all the branches on a path is employed as the cost of that path.
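The candidate graph described above can be illustrated with a minimal Python sketch. The names `Branch`, `enumerate_paths` and `BRANCHES`, the syllable labels and the cost values are hypothetical and are chosen only for illustration. The sketch also assumes that every branch runs forward in the input pattern, so that the graph contains no cycles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    src: int        # node at the boundary where the category segment starts
    dst: int        # node at the boundary where the category segment ends
    category: str   # category name assigned to the branch
    cost: float     # lower cost means better recognition accuracy


def enumerate_paths(branches, start, terminal):
    """Yield (category_string, path_cost) for every path from start to terminal.

    Assumes an acyclic graph (every branch points forward in the input).
    """
    outgoing = {}
    for b in branches:
        outgoing.setdefault(b.src, []).append(b)

    def walk(node, categories, cost):
        if node == terminal:
            yield tuple(categories), cost
            return
        for b in outgoing.get(node, []):
            # The path cost is the sum of the costs of the branches on the path.
            yield from walk(b.dst, categories + [b.category], cost + b.cost)

    yield from walk(start, [], 0.0)


# Hypothetical candidate graph with nodes 0 (start), 1 and 2 (terminal).
BRANCHES = [
    Branch(0, 1, "ka", 2.0),
    Branch(0, 1, "ga", 3.0),
    Branch(1, 2, "mi", 1.0),
    Branch(1, 2, "ni", 1.5),
    Branch(0, 2, "kan", 5.0),  # one longer segment spanning the whole input
]

for category_string, path_cost in enumerate_paths(BRANCHES, 0, 2):
    print(category_string, path_cost)
```

In this small example the graph has five paths, each a candidate category string. The number of paths grows rapidly with the number of segments and candidates per segment.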
The candidates of the plural category strings obtained from the candidate graph contain a correct recognition result, which is generally a significant category string having some validity. Therefore, the recognition performance can be improved if the candidate that is valid and has the lowest path cost is selected from among the candidate category strings and is used as the recognition result.
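The selection criterion stated above can be written down directly. The following Python sketch assumes that the candidate category strings and their path costs are already available as a list, and that validity is judged by a simple membership test against a set of valid strings. The function `select_result`, the dictionary contents and the costs are hypothetical.

```python
def select_result(candidates, is_valid):
    """Return (category_string, cost) of the valid candidate with the lowest cost.

    candidates: iterable of (category_string, path_cost) pairs.
    is_valid:   function judging whether a category string is valid.
    Returns None when no candidate is valid.
    """
    valid = [(cost, cats) for cats, cost in candidates if is_valid(cats)]
    if not valid:
        return None
    cost, cats = min(valid)
    return cats, cost


# Hypothetical candidates and dictionary of valid category strings.
CANDIDATES = [
    (("ka", "mi"), 3.0),
    (("ga", "mi"), 4.0),
    (("kan",), 5.0),
]
VALID_STRINGS = {("ga", "mi"), ("kan",)}

print(select_result(CANDIDATES, lambda cats: cats in VALID_STRINGS))
```

Here the lowest-cost candidate is rejected as invalid, and the valid candidate with the next lowest cost is returned as the recognition result. Note that this direct form judges the validity of every candidate.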
In order to judge the validity of a category string, however, it is necessary to look up words in a dictionary in which valid category strings are stored, or to judge whether the categories in the string can be connected to one another, so that a number of calculations are generally required. Therefore, judging the significance of the category strings for all the aforementioned paths is not practical because of the very large number of calculations required.
Thus, a method has been adopted by which the significance of the category strings is judged consecutively, starting from the category string having the lowest path cost. Even in this case, however, many calculations and much memory capacity are required, because the category strings can be extracted in ascending order of path cost only after all of them have been determined.
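The drawback of this method can be seen in a minimal Python sketch. The functions `all_paths` and `first_valid_in_cost_order`, the graph, the dictionary and the costs are hypothetical, and the candidate graph is assumed to be acyclic. The sketch is only one possible reading of the method described above, not the implementation of the cited application.

```python
def all_paths(graph, start, terminal):
    """Return a list of (path_cost, category_string) for every path."""
    results = []
    stack = [(start, (), 0.0)]
    while stack:
        node, cats, cost = stack.pop()
        if node == terminal:
            results.append((cost, cats))
            continue
        for nxt, category, branch_cost in graph.get(node, []):
            stack.append((nxt, cats + (category,), cost + branch_cost))
    return results


def first_valid_in_cost_order(graph, start, terminal, is_valid):
    """Judge validity in ascending order of path cost, stopping at the first hit.

    Returns (category_string, cost, validity_checks, paths_stored).
    """
    paths = all_paths(graph, start, terminal)  # every path is determined and stored first
    paths.sort()                               # then ordered by ascending path cost
    checks = 0
    for cost, cats in paths:
        checks += 1
        if is_valid(cats):
            return cats, cost, checks, len(paths)
    return None, None, checks, len(paths)


# Hypothetical candidate graph: node -> list of (next_node, category, cost).
GRAPH = {
    0: [(1, "ka", 2.0), (1, "ga", 3.0), (2, "kan", 5.0)],
    1: [(2, "mi", 1.0), (2, "ni", 1.5)],
}
DICTIONARY = {("ga", "mi"), ("kan",)}

print(first_valid_in_cost_order(GRAPH, 0, 2, lambda cats: cats in DICTIONARY))
# (('ga', 'mi'), 4.0, 3, 5)
```

Only three validity judgements are needed in this example, but all five paths must still be determined, stored and sorted before the first judgement can be made, which is the source of the calculation and memory requirements noted above.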