The present invention relates to decision trees. In particular, the present invention relates to automatically generating questions found in decision trees that are used in speech processing.
A decision tree is a connected set of nodes that begins at a root node and ends at one or more leaf nodes. With the exception of the leaf nodes, each node in the tree has an associated question and a set of child nodes that extend below the node. The decision tree is traversed by answering the question at a node and selecting one of the child nodes based on the answer. This question answering continues until the tree has been traversed from the root node to one of the leaf nodes.
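By way of illustration only, the traversal described above can be sketched as follows. The names `Node` and `traverse`, the use of answer strings to select a child node, and the toy questions are all assumptions made for this example and are not taken from any particular system.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Node:
    """One node of a decision tree (hypothetical structure for illustration)."""
    # Non-leaf nodes carry a question that maps an input to an answer string.
    question: Optional[Callable[[dict], str]] = None
    # Child nodes, keyed by the answer that selects them.
    children: Dict[str, "Node"] = field(default_factory=dict)
    # Leaf nodes carry a label instead of a question.
    label: Optional[str] = None

    def is_leaf(self) -> bool:
        return not self.children


def traverse(root: Node, item: dict) -> Node:
    """Answer the question at each node until a leaf node is reached."""
    node = root
    while not node.is_leaf():
        answer = node.question(item)
        node = node.children[answer]
    return node


# Toy tree with two questions and three leaf nodes (made-up content).
tree = Node(
    question=lambda x: "yes" if x["size"] > 10 else "no",
    children={
        "yes": Node(label="large"),
        "no": Node(
            question=lambda x: "yes" if x["color"] == "red" else "no",
            children={
                "yes": Node(label="small red"),
                "no": Node(label="small other"),
            },
        ),
    },
)

print(traverse(tree, {"size": 3, "color": "red"}).label)
```

Keying the child nodes by answer lets a node have more than two children, which matches the general description above; a tree restricted to yes/no questions is simply the case in which every non-leaf node has exactly two children.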
In speech recognition, such decision trees have been used to reduce the number of acoustic models that are needed to decode speech. In particular, decision trees have been used to group triphone states together in the leaf nodes of the trees. A single acoustic model can then be provided for all of the triphone states in a leaf node instead of providing a separate model for each triphone state.
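The grouping of triphone states can be sketched, under stated assumptions, as follows. The phone classes `NASALS` and `STOPS`, the two questions, the tied-model names, and the list of contexts are hypothetical and chosen only to show how several triphone states come to share one leaf node.

```python
from collections import defaultdict

# Hypothetical phone classes; a real system would define these per language.
NASALS = {"m", "n", "ng"}
STOPS = {"p", "t", "k", "b", "d", "g"}

# Tree for one state of the center phone "ae". A non-leaf node is a tuple
# (question, yes_subtree, no_subtree); a leaf node is a string that names
# the model shared by every triphone state reaching that leaf.
TREE = (
    lambda left, right: left in NASALS,
    "ae_tied_1",
    (lambda left, right: right in STOPS, "ae_tied_2", "ae_tied_3"),
)


def find_leaf(tree, left, right):
    """Walk from the root to a leaf by answering each yes/no question."""
    while not isinstance(tree, str):
        question, yes_branch, no_branch = tree
        tree = yes_branch if question(left, right) else no_branch
    return tree


# Triphone contexts (left phone, right phone) for the center phone "ae".
contexts = [("m", "t"), ("n", "d"), ("k", "t"), ("b", "d"), ("s", "l"), ("ng", "k")]

# Collect the triphone states that end up in each leaf node.
clusters = defaultdict(list)
for left, right in contexts:
    clusters[find_leaf(TREE, left, right)].append(f"{left}-ae+{right}")

for model, members in clusters.items():
    print(model, members)
print(len(contexts), "triphone states share", len(clusters), "models")
```

In this toy case six triphone states are covered by three models, one per leaf node, rather than by six separate models.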
Decision trees have also been used to identify pronunciations for words. In such decision trees, the leaf nodes contain alternative pronunciations for a letter in a given context and the questions in the tree determine which leaf node should be accessed for a given combination of input letters.
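A pronunciation tree of this kind might look like the following minimal sketch for the single letter "c". The questions, the phone symbols, the `#` word-boundary marker, and the function name `pronunciations` are assumptions introduced for illustration; they are not drawn from an actual letter-to-sound system.

```python
# Hypothetical tree for the letter "c". A non-leaf node is a tuple
# (question, yes_subtree, no_subtree); a leaf node is a list of
# alternative pronunciations for the letter in that context.
SOFTENING_LETTERS = {"e", "i", "y"}

C_TREE = (
    lambda prev, nxt: nxt == "h",
    ["ch", "k"],  # for example "chip" and "chord"
    (lambda prev, nxt: nxt in SOFTENING_LETTERS, ["s"], ["k"]),
)


def pronunciations(tree, word, index):
    """Return the alternative pronunciations for word[index]."""
    # The context consists of the neighboring letters; "#" marks a word boundary.
    prev = word[index - 1] if index > 0 else "#"
    nxt = word[index + 1] if index + 1 < len(word) else "#"
    while not isinstance(tree, list):
        question, yes_branch, no_branch = tree
        tree = yes_branch if question(prev, nxt) else no_branch
    return tree


for word, index in [("cat", 0), ("cent", 0), ("chord", 0), ("music", 4)]:
    print(word, index, pronunciations(C_TREE, word, index))
```

A complete system would hold one such tree per letter and would typically consider a wider window of surrounding letters than the single neighbor on each side used here.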
In the past, developing the questions used in a speech processing decision tree required detailed linguistic knowledge. For some languages, this knowledge is available from linguistic experts who craft the questions based on phonetic characteristics learned from a study of the language. However, such expert knowledge is not available for all languages and would be expensive to develop. As a result, the production of the decision tree questions represents a barrier to developing decision trees for many languages.