This invention relates to word recognition arrangements and, more particularly, to directory assistance retrieval systems incorporating spelled word recognition.
In communication, data processing, and control systems, it is often advantageous to use written or spoken words as direct input for inquiries, data or other information. Such word input arrangements may be utilized to record information, to control machine tools or other apparatus, or to access information from processing equipment. A directory assistance system, for example, may conveniently utilize spoken word inquiries to automatically access stored subscriber information. In one such system disclosed in U.S. Pat. No. 3,928,724 issued Dec. 23, 1975, the spoken spelled last name of a subscriber is used as input to retrieve subscriber information. The spoken characters of the spelled subscriber name are recognized to provide digital data to a computer wherein digital signals representing subscriber names are stored. The input digital data is matched to the stored subscriber digital signals and a message including the desired subscriber information is returned to the inquirer.
As is well known in the art, the variability of a speech signal from speaker to speaker, or even for a particular speaker, limits the accuracy of speech recognition. As a result, the usefulness of a spoken word retrieval system is highly dependent on the accuracy of character and word recognition. Similarly, the usefulness of a written or printed word retrieval system is dependent on the accuracy of character recognition. One type of recognition arrangement, disclosed in U.S. Pat. No. 3,259,883, issued July 5, 1966, uses a dictionary look-up technique to improve recognition accuracy. In the event one or more characters of an input word character sequence are unrecognized, but other characters are accurately recognized, trial characters are substituted for the unrecognized characters. The resulting word is compared to the stored words of the dictionary to derive matching dictionary words. Alternatively, the unrecognized character positions are ignored, and the dictionary words which match the recognized characters are obtained. This dictionary look-up arrangement improves recognition accuracy but requires a time consuming matching process and generally yields more than one choice for the inaccurately recognized word.
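The second look-up variant described above, in which unrecognized character positions are ignored, may be illustrated by the following minimal sketch. It is not taken from the cited patent; the function name `match_with_unknowns`, the `?` marker for an unrecognized position, and the sample names are assumptions chosen for illustration only.

```python
def match_with_unknowns(recognized, dictionary, unknown="?"):
    """Return the dictionary words that agree with every recognized character.

    Positions holding the `unknown` marker are ignored during comparison.
    The marker and the list-based dictionary are illustrative assumptions.
    """
    return [
        word
        for word in dictionary
        if len(word) == len(recognized)
        and all(r == unknown or r == w for r, w in zip(recognized, word))
    ]


# Hypothetical stored names and an input whose third character was not recognized.
names = ["SMITH", "SMYTH", "SNITH", "JONES"]
candidates = match_with_unknowns("SM?TH", names)
print(candidates)
```

Every stored word is scanned for every inquiry, and the sample input matches both SMITH and SMYTH, which reflects the two drawbacks noted above: a time consuming matching process and more than one resulting choice.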
Another word recognition system, disclosed in U.S. Pat. No. 4,010,445, issued Mar. 1, 1977, stores reference words in block units arranged so that characters placed in one or more given positions of an input character string may be used as a retrieval key. Comparison of the input character string, which may include unrecognized character positions, with the keyed blocks allows rapid detection of a stored dictionary word similar to the input character string. While the keyed block arrangement reduces the time required for retrieval, it requires that most characters of the input character string be accurately recognized.
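A keyed block store of this general kind may be sketched as follows. The sketch is a simplified illustration under stated assumptions rather than the arrangement of the cited patent: the key is assumed to be the first two character positions (`KEY_POSITIONS`), and the names `build_blocks` and `lookup` are hypothetical.

```python
from collections import defaultdict

# Assumption: the retrieval key is formed from the first two character positions.
KEY_POSITIONS = (0, 1)


def build_blocks(dictionary, key_positions=KEY_POSITIONS):
    """Group reference words into blocks indexed by their key characters."""
    blocks = defaultdict(list)
    for word in dictionary:
        key = tuple(word[p] for p in key_positions)
        blocks[key].append(word)
    return blocks


def lookup(recognized, blocks, key_positions=KEY_POSITIONS, unknown="?"):
    """Compare the input only with the words of the block selected by its key."""
    key = tuple(recognized[p] for p in key_positions)
    if unknown in key:
        # The key positions must be recognized, otherwise no block can be selected.
        raise ValueError("a key position was not recognized")
    return [
        word
        for word in blocks.get(key, [])
        if len(word) == len(recognized)
        and all(r == unknown or r == w for r, w in zip(recognized, word))
    ]


blocks = build_blocks(["SMITH", "SMYTH", "SNITH", "JONES"])
print(lookup("SMI?H", blocks))
```

Only the block selected by the key is searched, which shortens retrieval. In this sketch an unrecognized character in a key position prevents retrieval altogether, which is one way the dependence on accurately recognized characters can arise.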
As is well known in the art, input signal characteristics may be similar to prescribed characteristics of several stored characters whereby the recognition of the input signal is in doubt. Consequently, the resulting recognized character string includes several possibilities rather than a single recognized character for each input signal. Responsive to the array of possible recognized words, the dictionary look-up techniques previously described provide a large number of equally possible words from which one word must be chosen. In the system disclosed in U.S. Pat. No. 3,533,069, issued Oct. 6, 1970, a best guess word is selected from all possible trial words formed from the plurality of possible trial characters derived from input signals. Each trial character is assigned an associated priority, i.e., first, second or third choice. Each possible trial word is compared with reference words stored in a context limited dictionary. A best guess dictionary word from the plurality of matching dictionary words is chosen on the basis of the product of the trial word character priorities and the context dependent probability of occurrence of the matching dictionary words. In this manner, the accuracy of character recognition is improved for context limited dictionaries.
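The best guess selection described above may be illustrated by the following minimal sketch. The cited patent does not state here how a first, second or third choice priority is expressed numerically, so the sketch assumes each priority is a weight between 0 and 1; the function name `best_guess`, the weights, and the word probabilities are all hypothetical.

```python
from itertools import product


def best_guess(candidates, word_probability):
    """Pick the dictionary word with the largest combined score.

    candidates: one list per character position of (character, weight) pairs,
        where the weight stands in for the priority of that trial character.
    word_probability: context dependent probability of each dictionary word.
    """
    best_word, best_score = None, 0.0
    for combo in product(*candidates):  # every possible trial word
        word = "".join(ch for ch, _ in combo)
        if word not in word_probability:
            continue  # trial word does not match any dictionary word
        score = word_probability[word]
        for _, weight in combo:
            score *= weight  # product of the character priorities
        if score > best_score:
            best_word, best_score = word, score
    return best_word, best_score


# Hypothetical trial characters for a three character input.
candidates = [
    [("C", 0.6), ("G", 0.3)],
    [("A", 0.7), ("O", 0.2)],
    [("T", 0.5), ("D", 0.4)],
]
word_probability = {"CAT": 0.2, "GOD": 0.5, "COD": 0.3}
word, score = best_guess(candidates, word_probability)
print(word)
```

With these assumed values CAT is selected even though GOD has the higher probability of occurrence, because the product also reflects how strongly each trial character was preferred.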
In general, dictionary type stores contain large numbers of words which are unrelated, and the probability of occurrence of any particular word is the same as that of any other word. The use of trial word character priorities is not applicable to such context independent dictionary stores since, in the absence of probability factors for dictionary words, only the highest priority trial word is utilized. Directory assistance systems typically include a large number of subscriber names that are not context related and are equally probable. The aforementioned systems, therefore, do not provide a basis for detecting a word or subscriber name corresponding to an input character string that is inaccurately recognized.