Recently, there has been increased interest in software products that expedite commonplace workplace tasks and make workers more efficient in doing their jobs. One such area of office-productivity software is voice-recognition software. Voice-recognition software attempts to streamline word processing by converting spoken words to a text file without requiring a user or assistant to manually type the words into a document. Voice-recognition software is also known as speech recognition software, voice transcription software, or dictation software.
One example of voice-recognition software is ViaVoice, available from International Business Machines Corporation (IBM) of Armonk, N.Y. Another commercially available product is Dragon NaturallySpeaking, available from Dragon Systems, Inc. of Newton, Mass.
Initially, voice-recognition software generated text files that often had many mistakes. Some users found that it took longer to correct the errors in the dictated text than to type the document, or have someone else type it, from scratch. However, significant strides have been made in the past several years in improving the accuracy of voice-recognition software through the use of training files, more sophisticated speech recognition algorithms, and more powerful computer systems.
With the increasing use of mark-up languages, such as the Extensible Markup Language (XML), it is desirable to have a mechanism for adding mark-up tags to a document in an efficient, effective, and user-friendly manner.
Unfortunately, marking up dictated text with voice-recognition software leaves much to be desired. Currently, dictated text can be marked up manually, which is simply generic word processing, or by verbal commands spoken by the user. For example, a user might speak “New Paragraph” to start a new paragraph and “New Line” to start a new line. Similarly, formatting commands are employed to apply a specified format, such as Bold, Italics, or Underline, to dictated text during dictation or during review of the dictated text.
One disadvantage of such systems is that commands to process the transcribed text are often misunderstood by the system, thereby injecting mistakes into the dictation process and causing user frustration. For example, the dictation system may mistakenly interpret a command as a word to be inserted into the document or may mistakenly interpret a word to be inserted into the document as a command. Furthermore, the dictation system may confuse two commands that sound alike, such as “Italics” and “Initial Cap”.
Moreover, such prior art systems do not allow a user to vary commands based on predefined contexts. In fact, such systems are not context sensitive (i.e., there is no mechanism to vary the meaning of a command when a context changes). For example, when a user speaks a command, such as “center”, the command will always mean “center the current word” regardless of the type of document. Unfortunately, because documents often vary widely in terms of intended audience, content of the message, tone of the message, etc. (hereinafter referred to as “context”), users may need a single command to mean different things depending on the specific context of the document. Consequently, it would be desirable for a dictation system to have a reliable and efficient mechanism that applies one or more changes to the document based on a common command and the context of the document.
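As an illustration only, the idea of a single spoken command taking on different meanings in different contexts can be sketched as a lookup keyed by both the context and the command. The context names, tag names, and the function `apply_command` below are hypothetical and are not drawn from any existing dictation product; the sketch assumes the command has already been recognized as text.

```python
from xml.sax.saxutils import escape

# Hypothetical table: (context, spoken command) -> XML tag name.
# The contexts and tags are assumptions made for illustration.
COMMAND_TABLE = {
    ("business_letter", "center"): "heading",
    ("poem", "center"): "verse",
}

# Context-insensitive fallback, mirroring the fixed behavior described above.
DEFAULT_TAGS = {"center": "center"}


def apply_command(context, command, text):
    """Wrap text in the mark-up tag that the command maps to in this context."""
    key = command.strip().lower()
    tag = COMMAND_TABLE.get((context, key), DEFAULT_TAGS.get(key))
    if tag is None:
        # Unknown command: return the text as plain, escaped content.
        return escape(text)
    return f"<{tag}>{escape(text)}</{tag}>"


if __name__ == "__main__":
    print(apply_command("business_letter", "center", "Quarterly Report"))
    print(apply_command("poem", "center", "Ode to Autumn"))
    print(apply_command("memo", "center", "Agenda"))
```

In this sketch the same command “center” yields a different tag for each listed context and falls back to a fixed meaning elsewhere, which is the context-insensitive behavior attributed above to prior systems.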
Based on the foregoing, there remains a need for a tone-based mark-up dictation method and system that overcomes the disadvantages set forth previously.