The present invention relates to systems and methods for transcribing words from a form convenient for input by a human user, e.g., spoken or handwritten words, into a form easily understood by an applications program executed by a computer, e.g., text. In particular, it relates to transcription systems and methods appropriate for use in conjunction with computerized information-retrieval (IR) systems and methods, and more particularly to speech-recognition systems and methods appropriate for use in conjunction with computerized information-retrieval systems and methods used with textual databases.
In prior-art IR systems, the user typically enters input (either natural-language questions or search terms connected by specialized database commands) by typing at a keyboard. Few IR systems permit the user to use speech input, that is, to speak questions or search strings into a microphone or other audio transducer. Systems that do accept speech input do not directly use the information in a database of free-text natural-language documents to facilitate recognition of the user's input speech.
The general problem of disambiguating the words contained in an error-prone transcription of user input arises in a number of contexts beyond speech recognition, including but not limited to handwriting recognition in pen-based computers and personal digital assistants (e.g., the Apple Newton) and optical character recognition. Transcription of user input from a form convenient to the user into a form convenient for use by the computer has any number of applications, including but not limited to word processing programs, document analysis programs, and, as already stated, information retrieval programs. Unfortunately, computerized transcription tends to be error-prone.