The present invention pertains to the use of sequential computers, parallel computers, and systems of computers to understand noisy textual data. The output of automatic speech recognition (ASR) software may contain numerous types of errors, including but not limited to extraneous words, missing words, misrecognized words, and words out of sequence, as well as any combination of these and other errors. In the past, as many as 15% of the words in ASR output were misrecognized, and between 30% and 80% of sentences contained errors. Even today, modern voice recognition systems typically misrecognize up to 10% of words.
Automatic speech recognition (ASR) systems use a variety of techniques to produce the best possible output. Predominantly, grammars, language models, and statistical and probabilistic methods are used to improve recognition rates. In addition, ASR post-processing algorithms use lexical statistical methods, voting, and minimum-edit-distance corrections based on domain knowledge, morphological information, and query template information.
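By way of illustration only, a minimum-edit-distance correction of the kind mentioned above may be sketched as follows. The sketch assumes the standard Levenshtein distance with unit costs and a small domain vocabulary; the function names `edit_distance` and `correct_word`, the `max_distance` threshold, and the sample vocabulary are hypothetical and are not taken from any particular prior-art system.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions cost 1 each."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def correct_word(word: str, vocabulary: Sequence[str], max_distance: int = 2) -> str:
    """Return the closest vocabulary entry, or the word itself if none is close.

    max_distance is an assumed threshold; ties go to the earliest entry.
    """
    if not vocabulary:
        return word
    best = min(vocabulary, key=lambda v: edit_distance(word, v))
    return best if edit_distance(word, best) <= max_distance else word


if __name__ == "__main__":
    domain_vocabulary = ["flight", "ticket", "boston"]  # hypothetical domain terms
    print(correct_word("bostan", domain_vocabulary))
    print(correct_word("xyzzyq", domain_vocabulary))
```

A word is replaced only when a vocabulary entry lies within the threshold; otherwise it is left unchanged, which illustrates why such corrections depend on domain knowledge and cannot by themselves recover missing or out-of-sequence words.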
Traditional grammars are syntax oriented and context free, and their implementations are of limited usefulness in the presence of noisy, uncertain, missing, and/or redundant information. In recent years, a concept-based approach has been introduced. It has been applied to “error-tolerant language understanding” (U.S. Pat. No. 7,333,928) using a predefined concept/phrase grammar. However, grammars in general have several weaknesses. They are domain specific and difficult to modify and maintain, and their creation is time consuming and requires considerable expertise.
Thus, there is a need for a semantics-based system that relies on thematic patterns that are easily augmented and modified. The domain vocabulary of the system needs to be dynamic and noise tolerant. The system needs to be suitable for both sequential and parallel computers. The output of the system needs to be in a semantic form that can be easily understood by the user and converted into forms utilized by internal and external digital devices.