Keyboard activities such as “touch typing” have been available to create a text record of a user's thoughts. That is, a user manually types the user's thoughts on a keyboard or similar device to memorialize those thoughts in an electronic file, a hard paper file, or the like. For those who cannot type as quickly and as accurately as they think, as well as for those users unable to use a keyboard for any reason, such keyboard inputting, or “touch typing”, may not be desirable.
Devices have been available which purportedly allow users to input thoughts into an electronic file or hard paper file without use of a keyboard or similar physical inputting interface. That is, devices and/or processes have been devised which will convert the spoken word into text without use by the user of a keyboard. Some companies which purportedly offer such devices and/or processes include IBM with “Via Voice,” and Lernout and Hauspie with “Dragon Dictate”. Such voice recognition software programs may be installed on a personal computer. In some cases, a remote computer may be used, and the input sounds are purportedly coupled to that remote computer by telephone lines and/or radio frequency reception (such as via cellular telephones or other wireless communications devices). To furnish input to the voice recognition programs, an available practice is to use a microphone mounted on one side of the head, usually coupled to an earphone by a “boom” arm. The headphone and frame are positioned upon the head using a headband under tension, either top mount or rear mount. The microphone is positioned by adjusting the boom so that the “mike” is close to the user's mouth.
Some software programs provide feedback to the user indicating whether the microphone is positioned such that “good quality” sound is being picked up by the microphone. Usually, this means that the observed signal level meets or exceeds a manufacturer-chosen threshold level that might provide the voice recognition circuitry with enough signal to properly make algorithmic decisions as to what words are being spoken.
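By way of illustration only, a signal-level check of the kind described above might be sketched as follows. The sketch assumes audio samples normalized to the range -1.0 to 1.0, measures the level as an RMS value in decibels relative to full scale, and uses a hypothetical threshold of -30 dB; the function names, the measurement method, and the threshold value are assumptions for this example and are not taken from any particular manufacturer's program.

```python
import math


def rms_dbfs(samples):
    """Return the RMS level of the samples in dB relative to full scale.

    Assumes each sample is a float normalized to the range [-1.0, 1.0].
    Returns negative infinity for empty or all-zero input.
    """
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    # 10*log10 of the mean square equals 20*log10 of the RMS amplitude.
    return 10.0 * math.log10(mean_square)


def level_is_adequate(samples, threshold_dbfs=-30.0):
    """Report whether the observed level meets or exceeds the threshold.

    The default threshold is a hypothetical value chosen for illustration.
    """
    return rms_dbfs(samples) >= threshold_dbfs
```

Such a check reports only that the signal is strong enough; it does not by itself indicate that the microphone is in the same position as during an earlier session.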
Oftentimes, the software manufacturer provides a series of written training pages of text which are to be read aloud by the prospective user into the voice recognition system to “train” that system, helping it to correctly pair observed sound patterns with the furnished text samples. The success of this training depends upon the user's patience during the training process and upon the adequacy of the training, that is, how many paragraphs of text are read into the system.
The microphone position will most often vary somewhat between the initial training session and the actual use sessions, either because the headphone (with the microphone boom attached) has been removed and replaced, as for eating or drinking, or because the microphone boom alone has been pushed out of position in relation to the earphone during these activities. Because of these difficulties, the accuracy of transcription of a voice recognition system seldom exceeds 90%; that is, there are errors in transcription by the voice recognition system in a nominal 10% of the spoken input. This means that the user must correct this nominal 10% of the output text derived from the voice recognition sessions. The correction can purportedly be made by speaking individual letters to spell out the text that should have been generated, or the correction can be made by typing, if a keyboard is available to the user. If the error fraction rises much above 10%, the “after speaking” correction of the voice recognition system output can become burdensomely time-consuming for the user, and the system may be considered a nuisance, that is, not useful in a practical sense.
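The correction burden implied by a given accuracy can be estimated with simple arithmetic, sketched below. The sketch assumes that accuracy is counted per word and that each erroneous word requires one correction; both are simplifying assumptions for illustration, and the function name is hypothetical.

```python
def words_to_correct(total_words, accuracy):
    """Estimate how many dictated words will need correction.

    accuracy is the fraction of words transcribed correctly (0.0 to 1.0).
    Assumes per-word accuracy, which is a simplification for illustration.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0.0 and 1.0")
    return round(total_words * (1.0 - accuracy))
```

Under these assumptions, a 1,000-word dictation transcribed at the nominal 90% accuracy would leave about 100 words for the user to correct.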