This invention relates to the automatic determination of a rule-based language model for speech recognition and, more specifically, to the generation of a decision tree for predicting the next word uttered based upon the previously spoken words.
There are two main approaches to pattern recognition: statistical and syntactic. In the statistical approach, patterns are classified and recognized on the basis of a statistical model, using classical hypothesis testing techniques. In the syntactic approach, rules are defined, often unrelated to statistical theory, in order to recognize a pattern.
In the present invention, rules are determined entirely automatically and are optimal in a well-defined and well-understood sense. The process, while resembling an expert system, in fact has no expert.
The pattern recognition problem to be solved by the invention is predicting what word a speaker will say next, based upon the words already spoken. A procedure is disclosed for the automatic determination of rules which enable the speaker's next word to be predicted. These rules serve the same function as the rules of grammar, semantics, and the like that a human listener invokes subconsciously, but they are expressed in a different form.
The article entitled "An Information Theoretic Approach to the Automatic Determination of Phonemic Baseforms" by J. M. Lucassen et al., Proc. Int. Conf. on Acoustics, Speech and Signal Processing, 1984, pp. 42.5.1-42.5.4, describes automatically determined binary decision trees for determining spelling-to-sound rules for English language words. This technique relies upon the fact that there is a small number of letters (26) in the alphabet. The technique cannot be extended to more complex problems such as language modelling, where there is a very large number of possible words. The present invention differs from the prior approach in the manner of determining the questions, in the manner of terminating the trees, and in the method of computing the probability distribution at the leaves of the tree.
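As a purely illustrative aid, the general shape of a binary decision tree used as a next-word predictor can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure disclosed herein: the tree is written by hand rather than determined automatically, and the names Leaf, Question, predict, most_likely and TOY_TREE, the word sets, and the probability values are all hypothetical. Each internal node asks whether a previously spoken word belongs to a set of words, and each leaf holds a probability distribution over the next word.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Sequence, Union


@dataclass
class Leaf:
    # Probability distribution over candidate next words (made-up values).
    distribution: Dict[str, float]


@dataclass
class Question:
    # Asks: "is the word `offset` positions back a member of `word_set`?"
    offset: int                  # 1 = most recent word, 2 = the one before it
    word_set: FrozenSet[str]
    yes: "Node"                  # subtree followed when the answer is yes
    no: "Node"                   # subtree followed when the answer is no


Node = Union[Leaf, Question]


def predict(tree: Node, history: Sequence[str]) -> Dict[str, float]:
    """Walk from the root to a leaf and return that leaf's distribution."""
    node = tree
    while isinstance(node, Question):
        # A history that is too short simply answers "no" to the question.
        word = history[-node.offset] if len(history) >= node.offset else None
        node = node.yes if word in node.word_set else node.no
    return node.distribution


def most_likely(tree: Node, history: Sequence[str]) -> str:
    """Return the single most probable next word at the reached leaf."""
    dist = predict(tree, history)
    return max(dist, key=dist.get)


# Hand-built toy tree (hypothetical); a real tree would be grown from data.
TOY_TREE = Question(
    offset=1,
    word_set=frozenset({"the", "a", "an"}),
    yes=Question(
        offset=2,
        word_set=frozenset({"open", "close", "read"}),
        yes=Leaf({"door": 0.5, "file": 0.3, "book": 0.2}),
        no=Leaf({"dog": 0.4, "cat": 0.4, "door": 0.2}),
    ),
    no=Leaf({"the": 0.6, "a": 0.3, "is": 0.1}),
)

if __name__ == "__main__":
    print(most_likely(TOY_TREE, ["please", "open", "the"]))
```

In this sketch, every question has only two outcomes regardless of vocabulary size, which suggests why set-membership questions remain usable when the number of possible words is very large.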
While the present description will most often refer to speech recognition and specifically to next word prediction, the described invention is equally applicable to any pattern recognition system in which a next event or next data predictor is based upon a past event or given set of data. An example is determining the best course of treatment or the best diagnosis, given a list of a patient's medical symptoms.