1. Field of the Invention
The present invention relates to speech recognition and, more particularly, to a method and system for enabling and disabling subvocabularies for recognizing speech.
2. Description of the Related Art
Conventional speech recognition systems have had difficulty with outputting words that are deemed undesirable, such as obscenities or ethnic slurs. (It is important to note that there is no universally accepted set of undesirable words.) Conventional systems typically attack these problems in one of two ways. In a first technique, the undesirable words are left in the vocabulary in the hope that the words will appear only if and when an end user has spoken them. Although this technique is adequate most of the time, the results may be undesirable. For example, in a speech recognition system with word completion capabilities, such as systems that include a keyboard-correction capability for the end user, all candidate words that begin with a given letter sequence are displayed, so undesirable words are displayed as well.
In a second technique, undesirable words are deleted from the vocabulary altogether. One problem with this technique is that every end user who wants some or all of these words to be recognized must resupply the words individually. This is typically performed with an add-word utility in a speech recognition software package. Further, a system employing the second technique needs to generate (possibly incorrect) baseforms for each of the words on the fly for future recognition, making the second technique cumbersome, time-consuming and error-prone.
Therefore, a need exists for a system and method for enabling and disabling subvocabularies to provide a more versatile method for outputting or withholding from output words which are considered undesirable.
A method for designating a subvocabulary for speech recognition systems includes the steps of providing a vocabulary of words each having a flag with a first value, selecting words to be eliminated from the vocabulary, setting the flags of the selected words to a second value and processing speech based on words having the flag set to the first value.
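By way of illustration only, the flag-based method described above may be sketched as follows. This is a minimal sketch rather than the implementation of the invention: the names ENABLED, DISABLED, Word, disable_words and active_words are hypothetical, and the choice of 0 as the first value and 1 as the second value is an assumption.

```python
from dataclasses import dataclass

ENABLED = 0   # assumed "first value": the word takes part in recognition
DISABLED = 1  # assumed "second value": the word is withheld


@dataclass
class Word:
    text: str
    flag: int = ENABLED  # every word starts with the first value


def disable_words(vocabulary, selected):
    """Set the flag of each selected word to the second value."""
    for text in selected:
        if text in vocabulary:
            vocabulary[text].flag = DISABLED


def active_words(vocabulary):
    """Return only the words whose flag still holds the first value."""
    return [w.text for w in vocabulary.values() if w.flag == ENABLED]


# Hypothetical three-word vocabulary; "darn" stands in for an undesirable word.
vocabulary = {t: Word(t) for t in ["darn", "data", "date"]}
disable_words(vocabulary, ["darn"])
candidates = active_words(vocabulary)
```

In this sketch the disabled word remains stored in the vocabulary, so any data associated with it is retained and the word can later be restored by resetting its flag instead of being resupplied by the end user.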
A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for disabling and enabling of subvocabularies in speech recognition systems, the method steps including providing a vocabulary of words each having a flag with a first value, selecting words to be eliminated from the vocabulary, setting the flags of the selected words to a second value and processing speech based on words having the flag set to the first value.
In alternate methods which may be executable by the program storage device, the step of selecting words may include the step of selecting a group of words grouped together in at least one subvocabulary. The at least one subvocabulary may include a subvocabulary having words directed to a particular subject (including objectionable or undesirable subject matter such as profanity, slang, racial slurs, sexual or violent content, etc.) and/or a particular reading level. The flag is preferably a single bit and the first and second values may be zero or one or vice versa. The step of permitting access to selecting words by providing a password may be included. The step of reselecting words having the flag with the second value to return the flag to the first value thereby including the words in the vocabulary may also be included. The step of eliminating words having the flag with the second value from speech recognition processes may be included. The program storage device may further include the steps of selecting word combinations to be eliminated from the vocabulary and checking speech recognized word combinations to eliminate the selected word combinations from being processed.
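The subvocabulary selection, password-protected access and reselection steps described above may be sketched, by way of example only, as follows. The class FlaggedVocabulary, its method names and the sample subvocabularies are hypothetical. The sketch assumes a single-bit flag with 0 meaning enabled, assumes each word belongs to one subvocabulary, and uses an unsalted SHA-256 digest purely for brevity; it is not a recommended way to store passwords.

```python
import hashlib
import hmac


class FlaggedVocabulary:
    """Vocabulary whose words carry a one-bit flag (0 = enabled, 1 = disabled)."""

    def __init__(self, words_by_subvocabulary, password):
        self.flags = {}    # word -> flag bit
        self.members = {}  # subvocabulary name -> list of words
        for name, words in words_by_subvocabulary.items():
            self.members[name] = list(words)
            for word in words:
                self.flags[word] = 0  # first value: enabled
        self._digest = self._hash(password)

    @staticmethod
    def _hash(password):
        return hashlib.sha256(password.encode("utf-8")).digest()

    def _authorize(self, password):
        # Constant-time comparison of the supplied and stored digests.
        if not hmac.compare_digest(self._hash(password), self._digest):
            raise PermissionError("password required to change subvocabularies")

    def set_subvocabulary(self, name, enabled, password):
        """Enable or disable every word of one subvocabulary."""
        self._authorize(password)
        bit = 0 if enabled else 1
        for word in self.members[name]:
            self.flags[word] = bit

    def active(self):
        """Words currently available to the recognizer, in sorted order."""
        return sorted(w for w, bit in self.flags.items() if bit == 0)


vocab = FlaggedVocabulary(
    {"general": ["hello", "world"], "slang": ["gonna", "wanna"]},
    password="secret",
)
vocab.set_subvocabulary("slang", enabled=False, password="secret")
```

Calling set_subvocabulary again with enabled=True corresponds to the reselection step: the flags return to the first value and the words are again included in the vocabulary.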
Another method for removing words from speech recognition systems includes the steps of providing a vocabulary of words each having a flag value, grouping the flag values to form at least one subvocabulary of words, selecting subvocabularies to be eliminated from the vocabulary, setting the flags of the selected subvocabularies to be different from the flag value of the words in the vocabulary and processing speech based on words having the flag set to the value of the vocabulary.
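One possible reading of the grouping of flag values described above may be sketched as follows. The encoding is an assumption made for illustration and is not taken from the description: the low seven bits of each flag are treated as a subvocabulary number, and a hypothetical high bit (DISABLED_BIT) marks a flag as differing from the value used by the active vocabulary. All function names and sample words are likewise hypothetical.

```python
from collections import defaultdict

DISABLED_BIT = 0x80  # assumed marker bit: set when the word is withheld
GROUP_MASK = 0x7F    # assumed low bits: subvocabulary number


def group_by_flag(flags):
    """Group words into subvocabularies by the number stored in their flag."""
    groups = defaultdict(list)
    for word, flag in flags.items():
        groups[flag & GROUP_MASK].append(word)
    return dict(groups)


def disable_groups(flags, group_ids):
    """Mark every word of the selected subvocabularies as disabled."""
    for word, flag in flags.items():
        if (flag & GROUP_MASK) in group_ids:
            flags[word] = flag | DISABLED_BIT


def enable_groups(flags, group_ids):
    """Clear the marker bit so the selected subvocabularies are active again."""
    for word, flag in flags.items():
        if (flag & GROUP_MASK) in group_ids:
            flags[word] = flag & GROUP_MASK


def active_words(flags):
    """Words whose flag does not carry the disabled marker."""
    return [w for w, f in flags.items() if not (f & DISABLED_BIT)]


# Hypothetical flags: 0 = general, 1 = basic reading level, 2 = biology terms.
flags = {"hello": 0, "aardvark": 1, "zygote": 2, "mitosis": 2}
groups = group_by_flag(flags)
disable_groups(flags, {2})
remaining = active_words(flags)
```

Because the subvocabulary number survives in the low bits, a disabled group can be re-enabled without separately recording which words belonged to it.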
In alternate methods, the step of selecting subvocabularies may include the step of selecting additional words to be eliminated from the vocabulary. At least one subvocabulary may include words directed to a particular subject and/or a particular reading level. The flag may include a binary representation of a number. The method may further include the step of permitting access to selecting subvocabularies by providing a password. The method may further include the step of reselecting subvocabularies and words to change the flag value. The method may further include the step of eliminating words having the flag with the second value from speech recognition processes. The steps of selecting word combinations to be eliminated from the vocabulary and checking speech recognized word combinations to eliminate the selected word combinations from being processed may also be included.
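The checking of recognized word combinations may be illustrated with the following sketch. It assumes, for the purpose of example only, that the recognizer yields a list of candidate word sequences and that a selected combination is a contiguous run of words compared case-insensitively; the function names and sample phrases are hypothetical.

```python
def contains_banned(words, banned):
    """True if any selected combination occurs as a contiguous run in words."""
    lowered = [w.lower() for w in words]
    for combo in banned:
        n = len(combo)
        for i in range(len(lowered) - n + 1):
            if tuple(lowered[i:i + n]) == combo:
                return True
    return False


def filter_combinations(hypotheses, banned_combinations):
    """Drop candidate word sequences that contain a selected combination."""
    # Normalize to lowercase tuples and ignore empty combinations.
    banned = {tuple(w.lower() for w in c) for c in banned_combinations if c}
    return [words for words in hypotheses if not contains_banned(words, banned)]


# Hypothetical candidates; each word is acceptable individually.
hypotheses = [["kick", "the", "bucket"], ["pick", "the", "bucket"]]
kept = filter_combinations(hypotheses, [("kick", "the", "bucket")])
```

The check operates on sequences rather than single words, so a combination can be withheld even when each of its words individually remains enabled in the vocabulary.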
These and other objects, features and advantages of the present invention will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.