(Not Applicable)
(Not Applicable)
The field of the invention is speech dictation. More particularly, the invention relates to software-implemented speech dictation using general libraries and auxiliary topic libraries.
Speech dictation methods implemented through software typically search for matches to spoken words in a “general library” database associated with the software. The general library includes words that are commonly used in the language of interest, but may not include words that are germane to specialized topics. For example, a general library may not contain a full range of words relating to topics such as specialized technical fields, medical fields, activities or hobbies having distinctive vocabularies, or ethnic jargon. In the area of cooking, for example, a general library may not include words or phrases such as “au poivre” or “al dente.” Because the time a dictation system needs to match a spoken word grows with the size of the word database that must be searched, it is impractical for a general library to include every word that may be spoken.
Because general libraries do not contain all specialized words, they may not recognize a specialized word, or may identify the word incorrectly. Prior art systems have overcome some of the misrecognition or non-recognition problems associated with the limitations of a general library by enabling the dictation system user to activate “auxiliary topic libraries.” As the name implies, these libraries consist of separate databases that are searched independently from the general library. In addition, each library includes words commonly associated with a particular topic. For example, some topics might be: electrical engineering, astronomy, cooking, art, or internal medicine.
As stated previously, the time required by the dictation process grows with the number of words the system must search in order to match a spoken word with a word in the databases being searched. Because topic libraries often include words that are not commonly used, it is inefficient for a dictation system always to search all available topic libraries. Therefore, prior art systems have provided users the ability to activate and deactivate available topic libraries. When a user plans to speak on a topic such as cooking, for example, the user would activate the auxiliary topic library relating to cooking. When the user no longer plans to speak on the topic, the user could deactivate the library in order to speed the dictation process.
Prior art systems require the user to take action to activate or deactivate an auxiliary topic library. If the user forgets to activate a particular topic library, the system may have low recognition accuracy. If the user forgets to deactivate a particular topic library, the dictation system may be inefficient, as it must search through more words than necessary in order to determine a match.
What is needed is a method and apparatus for automatically activating and deactivating auxiliary topic libraries. What is further needed is a method for activating and deactivating auxiliary topic libraries that is user-friendly, and results in efficient and accurate speech recognition and dictation.
A method for dictating speech compares a spoken word of input speech with words in one or more active libraries. If the spoken word is recognized as being a word from an active library, the spoken word is dictated to be that word, and the method processes another word. If the spoken word is not recognized as being a word from an active library, the method compares the spoken word to words within one or more inactive libraries. If the spoken word is recognized as being a word from an inactive library, the method automatically activates the corresponding inactive library, and dictates the spoken word to be the word from the previously-inactive library. The method also automatically deactivates an active library if a sufficient number of spoken words have occurred that have not been recognized in the active library.
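The control flow of the method summarized above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the claimed implementation: recognition is modeled as exact string lookup rather than acoustic matching, the general library is assumed to stay active at all times, a "sufficient number" of unrecognized words is modeled as a count of consecutive words not found in a given topic library, and the names `Dictator`, `dictate`, and `miss_limit` are hypothetical.

```python
from typing import Dict, List, Optional, Set


class Dictator:
    """Toy model of automatic topic-library activation and deactivation.

    Assumption: 'recognition' is exact string membership, standing in for
    the acoustic matching a real dictation system would perform.
    """

    def __init__(self, general: Set[str], topics: Dict[str, Set[str]],
                 miss_limit: int = 3) -> None:
        self.general = general            # assumed to be always active
        self.topics = topics              # all auxiliary topic libraries
        self.miss_limit = miss_limit      # hypothetical deactivation threshold
        # Active topic name -> consecutive words not found in that library.
        self.misses: Dict[str, int] = {}

    def active_topics(self) -> List[str]:
        return sorted(self.misses)

    def dictate(self, word: str) -> Optional[str]:
        """Return the name of the library that matched, or None."""
        hit: Optional[str] = None

        # Step 1: compare the word with the active libraries.
        if word in self.general:
            hit = "general"
        else:
            for name in self.misses:
                if word in self.topics[name]:
                    hit = name
                    break

        # Step 2: only on a miss, compare with the inactive libraries and
        # automatically activate the first one that contains the word.
        if hit is None:
            for name, words in self.topics.items():
                if name not in self.misses and word in words:
                    self.misses[name] = 0
                    hit = name
                    break

        # Step 3: update per-library miss counters and deactivate any
        # active topic library that has gone unused for too long.
        for name in list(self.misses):
            if word in self.topics[name]:
                self.misses[name] = 0
            else:
                self.misses[name] += 1
                if self.misses[name] >= self.miss_limit:
                    del self.misses[name]

        return hit


if __name__ == "__main__":
    d = Dictator(
        general={"cook", "the", "pasta", "until", "is"},
        topics={"cooking": {"al dente", "au poivre"}, "astronomy": {"parsec"}},
    )
    for w in ["cook", "the", "pasta", "al dente", "the", "pasta", "is"]:
        print(w, "->", d.dictate(w), "| active topics:", d.active_topics())
```

In this sketch the inactive libraries are searched only after the active ones fail, so the common case pays only for the general library plus whatever topic libraries are currently active. The threshold of three words is arbitrary and chosen only to keep the demonstration short.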
The method can be executed by a machine that executes a plurality of code sections of a computer program that is stored on a machine readable storage.
An apparatus for dictating speech includes a microphone, an analog to digital converter, a processor, and a memory device. The microphone receives the input speech, and the analog to digital converter converts the input speech to digital speech samples. The processor receives a block of the digital speech samples that represents a spoken word, and compares the spoken word to words within the active and, if necessary, the inactive libraries. Based on that comparison, the processor may automatically activate an inactive library. The processor also dictates the spoken word. The memory device stores the active and inactive libraries.