1. Field of the Invention
This invention relates generally to computer processors, and more particularly to an efficient pruning algorithm which reduces computer processing unit loading during speech recognition.
2. Description of the Related Art
Previous Dynamic Time Warping (DTW) based speech recognizers have employed a traditional bottom-up approach in which word-level or phonetic-level hypotheses were generated by an autonomous word hypothesizer. These hypotheses were then post-processed by a sentence hypothesizer that used application-specific knowledge (grammar) to choose the best sentence hypothesis from all grammatical candidates.
Recently, in "System and Method for Parsing Natural Language" (U.S. Pat. application Ser. No. 919,156) and "A Chart Parser for Stochastic Unification Grammar" (U.S. Pat. application Ser. No. 312,835), both assigned to the same assignee as the present application, a top-down approach to speech recognition was disclosed. Briefly, the word hypothesizer is no longer autonomous but is guided by the sentence hypothesizer. As a frame is processed, each active sentence hypothesis requests data as needed. The sequence of data requests typically begins with a sentence hypothesis requesting word hypotheses (i.e., a candidate word and the likelihood of its occurrence given the current history). Each request for a word hypothesis in turn requests a phone hypothesis, and so forth. The process terminates with a request for a frame of speech data. At this point, the incoming frame of speech data is scored in the context predicted by this sentence hypothesis. Each level applies the constraints of grammar-like structures, or Hidden Markov Models (HMMs), to the next lower level of data representation.
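The chain of requests described above can be illustrated with a minimal sketch. It is not the implementation disclosed in the referenced applications: the toy grammar, the word-to-phone table, the one-dimensional Gaussian phone models, and the function names `request_words`, `request_phone`, and `score_frame` are all assumptions made for illustration only.

```python
import math

# Hypothetical toy data (assumed for illustration, not from the referenced applications).
# Each phone is modeled by a 1-D Gaussian (mean, variance) over a scalar frame feature.
PHONE_MODELS = {"y": (1.0, 0.5), "eh": (2.0, 0.5), "s": (3.0, 0.5),
                "n": (0.0, 0.5), "ow": (4.0, 0.5)}
# Each word expands into a sequence of phones.
WORD_PHONES = {"yes": ["y", "eh", "s"], "no": ["n", "ow"]}
# Sentence-level grammar: probability of each word given the current history.
GRAMMAR = {"<start>": {"yes": 0.5, "no": 0.5}}


def score_frame(phone, frame):
    """Lowest level: log-likelihood of one frame under the phone's Gaussian."""
    mean, var = PHONE_MODELS[phone]
    return -0.5 * (math.log(2 * math.pi * var) + (frame - mean) ** 2 / var)


def request_phone(word, position, frame):
    """Word level: ask for the phone predicted at this position and score the frame."""
    phone = WORD_PHONES[word][position]
    return phone, score_frame(phone, frame)


def request_words(history, frame):
    """Sentence level: ask for every word the grammar allows after `history`.

    Returns (word, phone, log score) tuples, best first. The score combines
    the grammar log-probability with the acoustic log-likelihood of the frame.
    """
    hypotheses = []
    for word, prob in GRAMMAR[history].items():
        phone, acoustic = request_phone(word, 0, frame)
        hypotheses.append((word, phone, math.log(prob) + acoustic))
    return sorted(hypotheses, key=lambda h: h[2], reverse=True)
```

In this sketch, calling `request_words("<start>", 1.1)` drives the requests downward from the sentence level to the frame, so the frame is scored only against the phones that the grammar currently predicts.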
FIG. 1 shows a block diagram of such a layered grammar, or model-driven, approach to speech recognition. It has two principal features: a hierarchical structure that allows any number of levels of data representation to coexist, and a continuous density HMM computational framework that governs the flow of information at all levels. The details of a system like that shown in FIG. 1 have been fully explained in "A Chart Parser for Stochastic Unification Grammar" (U.S. Pat. application Ser. No. 312,835), assigned to the assignee of the present invention. It has been shown empirically that top-down hypothesizing provides a significant improvement in performance over previous bottom-up systems.
Unfortunately, the top-down, model-driven approach used in this speech recognition scheme is computationally demanding because it must operate in real time. Additionally, a current speech recognition system needs a scoring buffer of several hundred kilobytes of data memory, which is generally maintained in expensive fast random access memory (RAM). Therefore, it is very desirable to reduce the amount of fast RAM used by a CPU, and thereby the system expense, when executing a speech recognition algorithm.