The invention relates to computer-implemented speech recognition.
A typical speech recognition system includes a recognizer and a stored vocabulary of words which the recognizer is capable of recognizing. The recognizer receives information about utterances by a speaker and delivers a corresponding recognized word or string of recognized words drawn from the vocabulary. The stored vocabulary often includes additional information for each of the vocabulary words, such as the word's part of speech (e.g., noun, verb, adverb).
In German, consecutive words in a sentence are frequently concatenated to form compound words. For example, referring to FIG. 1a, in the string of spoken words “er hört daß der President Wahl Kampf Geschichten geschrieben hat” 8 (which, translated into English, is “he hears that the president has written election campaign stories”), the words “Wahl,” “Kampf,” and “Geschichten” would be combined to form the compound word “Wahlkampfgeschichten.”
Some German speech recognition systems place frequently used compound words in the stored vocabulary to enable them to recognize those words using standard recognition techniques. Other German speech recognition systems are trained with text containing compound words. During training, such systems identify compound words in the text and also identify the constituent words which make up the compound words. During recognition of German speech, such systems form compound words by concatenating words which were previously identified as making up compound words in the training text.
In one aspect, a computer is used to improve recognition of a text string including words in a language (e.g., German) having associated parts of speech. The text string is analyzed with respect to information about expected patterns of the parts of speech in the language and modified based on the analysis. The information may include rules descriptive of combinations of parts of speech in the language corresponding to compound words in the language. The combinations of parts of speech may be sequences of parts of speech.
Analyzing may include comparing the combinations of parts of speech to parts of speech associated with the words in the text string and indicating that a compound word should be formed from the words associated with the matched parts of speech if at least one of the combinations of parts of speech matches parts of speech associated with the words. Modifying the text string may include forming a compound word from words in the text string. The compound word may be added to a vocabulary.
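The matching step described above can be illustrated with a minimal sketch. It assumes that each word already carries a single part-of-speech tag; the tag names, the `COMPOUND_RULES` list, and the functions `find_compound_spans` and `form_compounds` are hypothetical and are not taken from this description. Joining the constituents by plain concatenation is also a simplification.

```python
from typing import List, Sequence, Tuple

# Hypothetical compounding rules: each rule is a sequence of part-of-speech
# tags that is assumed to correspond to a compound word.
COMPOUND_RULES: List[Tuple[str, ...]] = [
    ("NOUN", "NOUN", "NOUN"),
    ("NOUN", "NOUN"),
]


def find_compound_spans(
    tags: Sequence[str],
    rules: Sequence[Tuple[str, ...]] = COMPOUND_RULES,
) -> List[Tuple[int, int]]:
    """Return non-overlapping (start, end) spans whose tags match a rule.

    At each position the longest rule is tried first.
    """
    ordered = sorted(rules, key=len, reverse=True)
    spans = []
    i = 0
    while i < len(tags):
        for rule in ordered:
            if tuple(tags[i:i + len(rule)]) == rule:
                spans.append((i, i + len(rule)))
                i += len(rule)
                break
        else:
            # No rule matched at this position; move on by one word.
            i += 1
    return spans


def form_compounds(
    words: Sequence[str],
    tags: Sequence[str],
    rules: Sequence[Tuple[str, ...]] = COMPOUND_RULES,
) -> List[str]:
    """Replace each matched span of words with a single concatenated word."""
    result: List[str] = []
    pos = 0
    for start, end in find_compound_spans(tags, rules):
        result.extend(words[pos:start])
        parts = words[start:end]
        # Keep the first constituent as written; lower-case the rest.
        result.append(parts[0] + "".join(p.lower() for p in parts[1:]))
        pos = end
    result.extend(words[pos:])
    return result
```

For example, `form_compounds(["er", "schreibt", "Wahl", "Kampf", "Geschichten"], ["PRON", "VERB", "NOUN", "NOUN", "NOUN"])` returns `["er", "schreibt", "Wahlkampfgeschichten"]`.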
Modifying the text string may include replacing words in the text string with the compound word. The modified text string may be added to a list of candidate text strings. The text string may be analyzed with respect to rules descriptive of other, unpreferred combinations of parts of speech in the language corresponding to combinations of words which do not typically form compound words in the language, and it may be indicated that a compound word should not be formed from the words associated with the matched parts of speech if at least one of the unpreferred combinations of parts of speech matches parts of speech associated with the words. The unpreferred combinations of parts of speech may correspond to combinations of groups (e.g., pairs) of parts of speech, with the groups corresponding to phrases.
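One possible reading of the unpreferred combinations is sketched below. It assumes that an unpreferred rule is a sequence of groups of part-of-speech tags, each group standing for a phrase, and that a candidate compound is rejected when it would span the boundary between two such groups. The rule shown, the tag names, and the function `is_vetoed` are hypothetical illustrations only.

```python
from typing import List, Sequence, Tuple

Group = Tuple[str, ...]

# Hypothetical unpreferred rule made of two groups (phrases):
# a determiner + noun followed by a noun + verb, as in
# "daß der Mann Bücher liest", where "Mann" and "Bücher" belong to
# different phrases.
UNPREFERRED_RULES: List[Tuple[Group, ...]] = [
    (("DET", "NOUN"), ("NOUN", "VERB")),
]


def is_vetoed(
    tags: Sequence[str],
    start: int,
    end: int,
    rules: Sequence[Tuple[Group, ...]] = UNPREFERRED_RULES,
) -> bool:
    """Return True if the candidate span tags[start:end] crosses a group
    boundary of an unpreferred rule that matches somewhere in tags."""
    for rule in rules:
        flat = [tag for group in rule for tag in group]
        # Offsets inside the flattened rule at which a new group begins.
        boundaries = []
        offset = 0
        for group in rule[:-1]:
            offset += len(group)
            boundaries.append(offset)
        for i in range(len(tags) - len(flat) + 1):
            if list(tags[i:i + len(flat)]) != flat:
                continue
            # The rule matches at position i; check each phrase boundary.
            if any(start < i + b < end for b in boundaries):
                return True
    return False
```

With the tags `["CONJ", "DET", "NOUN", "NOUN", "VERB"]`, the candidate span `(2, 4)` covering the two adjacent nouns is rejected in this sketch, because the rule places a phrase boundary between them.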
The compound word may be added to a compound word cache. Adding the compound word may include increasing the frequency count of the compound word in the compound word cache. The compound word also may be added to a vocabulary.
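A compound word cache with frequency counts could be sketched as follows. The class name `CompoundWordCache` and its methods are hypothetical; the description does not specify the data structure used.

```python
from collections import Counter


class CompoundWordCache:
    """Illustrative cache mapping each compound word to a frequency count."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def add(self, compound: str) -> int:
        """Add the compound word, or increase its count if already cached.

        Returns the updated frequency count.
        """
        self._counts[compound] += 1
        return self._counts[compound]

    def frequency(self, compound: str) -> int:
        """Return the frequency count (0 if the word is not cached)."""
        return self._counts[compound]

    def __contains__(self, compound: str) -> bool:
        return compound in self._counts
```

Adding the same compound word twice leaves a single entry in the cache with a frequency count of 2.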
The text string may be analyzed with respect to agreement rules descriptive of patterns of agreement of case, number, and gender of words corresponding to combinations of words which do not typically form compound words in the language, and it may be indicated that a compound word should not be formed from the matching words if at least one of the agreement rules matches words in the text string.
The agreement rules may include a rule indicating that if a noun in a subordinate clause matches the case, number, and gender of a preceding determiner, a compound word should not be formed from the noun and subsequent words in the subordinate clause. The agreement rules may include a rule indicating that if a noun in a non-subordinate clause matches the case, number, and gender of a preceding determiner, a compound word should not be formed from words in the noun phrase containing the noun and words subsequent to the noun phrase.
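The determiner agreement check can be sketched as below. The sketch assumes that each word has already been annotated with a single reading for case, number, and gender, and that the determiner immediately precedes the noun; in practice a German determiner such as "der" is ambiguous between several readings. The `Token` class, the feature labels, and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Token:
    text: str
    pos: str
    case: Optional[str] = None
    number: Optional[str] = None
    gender: Optional[str] = None


def agrees(det: Token, noun: Token) -> bool:
    """True if both tokens carry identical, fully specified features."""
    det_features = (det.case, det.number, det.gender)
    noun_features = (noun.case, noun.number, noun.gender)
    if None in det_features or None in noun_features:
        return False
    return det_features == noun_features


def blocked_by_agreement(tokens: Sequence[Token], index: int) -> bool:
    """True if tokens[index] is a noun that agrees with the determiner
    directly before it, so that it should not be compounded with the
    words that follow it."""
    if index <= 0 or index >= len(tokens):
        return False
    noun, previous = tokens[index], tokens[index - 1]
    return (
        noun.pos == "NOUN"
        and previous.pos == "DET"
        and agrees(previous, noun)
    )
```

In the example string, if "der" and "President" are both annotated as nominative, singular, and masculine, this check blocks "President" from being joined with "Wahl", while "Wahl", which is not preceded by a determiner, remains available for compounding.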
The compound word may be identified as an incorrect compound word, and the compound word may be added to a compound word error cache. Adding the compound word to the compound word error cache may include increasing a frequency count of the compound word in the compound word error cache. If the compound word has been identified as an incorrect compound word, it may be indicated that the compound word should not be formed from the words associated with the matched parts of speech. The compound word may be identified as an incorrect compound word, in response to an action of a user, by adding the compound word to the compound word error cache. It may be indicated that the compound word should not be formed from the words associated with the matched parts of speech if the compound word has been identified as an incorrect compound word more frequently than it has not been so identified.
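The frequency comparison described above can be sketched as follows. The sketch assumes two independent counts per compound word, one for occurrences left uncorrected and one for occurrences the user marked as incorrect; how the two counts are actually maintained is not specified here. The class `CompoundFeedback` and its method names are hypothetical.

```python
from collections import Counter


class CompoundFeedback:
    """Illustrative pairing of a compound word cache and an error cache."""

    def __init__(self) -> None:
        self.accepted: Counter = Counter()  # formed and left uncorrected
        self.errors: Counter = Counter()    # marked incorrect by the user

    def record_accepted(self, compound: str) -> None:
        self.accepted[compound] += 1

    def record_error(self, compound: str) -> None:
        self.errors[compound] += 1

    def should_suppress(self, compound: str) -> bool:
        """True if the word was marked incorrect more often than not."""
        return self.errors[compound] > self.accepted[compound]
```

Under this assumption, a compound word marked incorrect twice and accepted once is suppressed, while equal counts leave it eligible to be formed.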
Among the advantages of the invention are one or more of the following.
Use of language-specific compounding rules to recognize compound words allows recognition of compound words which are not in the stored vocabulary. A speech recognition system that is capable of recognizing compound words may, therefore, use a stored vocabulary which contains only ordinary (non-compound) words, or which contains only a small number of frequently-used compound words. Reducing the number of compound words that are stored in the stored vocabulary reduces the amount of time and effort needed to generate the vocabulary and reduces the total size of the vocabulary. The ability to recognize compound words not stored in the vocabulary also potentially increases the total number of recognizable compound words. Reduction in vocabulary size may also result in increased recognition speed. Furthermore, the space that is saved may be used for other purposes, such as storing domain-specific vocabularies.
Use of compounding rules to recognize compound words also facilitates modification of the speech recognition system's compound word recognition capabilities. The set of compound words recognized by the speech recognition system may be changed by adding, deleting, or modifying the compounding rules, rather than by modifying the stored vocabulary. This feature also facilitates addition of compound word recognition capabilities to existing speech recognition systems.
The techniques may be implemented in computer hardware or software, or a combination of the two. However, the techniques are not limited to any particular hardware or software configuration; they may find applicability in any computing or processing environment that may be used for improvement of speech recognition. Preferably, the techniques are implemented in computer programs executing on programmable computers that each include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and one or more output devices. Program code is applied to data entered using the input device to perform the functions described and to generate output information. The output information is applied to the one or more output devices.
Each program is preferably implemented in a high level procedural or object-oriented programming language to communicate with a computer system. However, the programs can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language.
Each such computer program is preferably stored on a storage medium or device (e.g., CD-ROM, hard disk or magnetic diskette) that is readable by a general or special purpose programmable computer for configuring and operating the computer when the storage medium or device is read by the computer to perform the procedures described in this document. The system may also be considered to be implemented as a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner.
Other features and advantages of the invention will become apparent from the following description, including the drawings, and from the claims.