1. Technical Field
This invention relates to the field of speech recognition software and more particularly to replacing text within a body of text with alternate text selections.
2. Description of the Related Art
Speech recognition is the process by which an acoustic signal received by a microphone is converted to a set of text words by a computer. These recognized words may then be used in a variety of computer software applications for purposes such as document preparation, data entry, and command and control.
One important area where speech recognition technology has proved to be very useful is the conversion of spoken utterances into text for inclusion in a word processing document. Word processing applications may have incorporated therein a speech recognition function. Alternatively, a variety of speech recognition programs are commercially available which provide the speech recognition function to existing non-speech enabled word processing applications. In any case, the ability of these speech enabled word processing applications to convert human speech into text has improved dramatically in recent years. Due to a variety of factors, however, errors may yet occur in the speech recognition function performed by such applications. Accordingly, it is often necessary for a user to review a document which has been dictated to a word processing application by means of such speech recognition software.
One notable aspect of text errors existing in a speech enabled word processing application is that they tend not to be misspelled, but are instead words which have been "misrecognized". Such text errors are typically a close acoustic match to the correct text, but often involve spelling variations or in some instances may be an entirely unrelated but similar sounding word or words.
Another source of error within text, completely unrelated to speech recognition systems, is user error. User errors may include, but are not limited to, misspelling of text within a body of text, grammatical errors within a body of text, or incorrect keystrokes from a user keyboard entry.
A spell check function within a word processing program is an example of a conventional system of correcting errors within a body of text. In the case of a spell check function, the system is initiated by a first user input selecting for replacement either a potentially misspelled word or a grammatically incorrect portion of text. This step is commonly performed with a series of keyboard entries or a pointing device such as a mouse. For example, a keyboard or a mouse is manipulated by the user to highlight text within a body of text. Next, in response to a second user input, a list of alternate text selections is displayed to the user. The user then selects an appropriate alternate text selection from the displayed list of alternate text selections. In response to the user's selection, the system inserts the user selected alternate text selection in place of the selected text.
Other conventional spell check functions allow the user to select a correction option to initiate spell checking throughout an entire word processing document. In this case the user does not select text for replacement; instead, the system searches the body of text for potential spelling and grammatical errors made by the user. Once the system identifies a potential spelling or grammatical error, the system displays a list of potential alternate text selections to the user. Similar to the previously described spell check function, the user then selects an appropriate alternate text selection from the displayed list of alternate text selections. The user selected alternate text selection is then inserted into the body of text in place of the identified text.
Although conventional systems of correcting errors within a body of text have functioned reasonably well in the past, there are a number of disadvantages inherent to such systems. One such disadvantage is that conventional systems utilize a visual interface. By using a visual interface, conventional systems must employ a large display device such as a computer monitor. The need for a large display device severely limits the ways in which a conventional system of correcting errors within a body of text can be incorporated into other systems and existing technologies. A large display device further limits the environments in which conventional systems can be used.
Another disadvantage of conventional systems for replacing text within a body of text is that such systems must display the list of alternate text selections to the user. For example, a list of alternate text selections is usually displayed to the user on a computer monitor via a window or other pop-up style computer dialog box. This can result in an excessive number of open windows on the computer monitor leading to window "clutter" and obstruction of the main window containing the text being edited.
Moreover, the need to have several open windows at once within a conventional system for replacing text within a body of text demands a sizable display device. The display device must be large enough for the user to comfortably view several alternate text selections simultaneously with the text being edited. However, size limitations of common display devices force information to be presented to the user in a crowded and cluttered fashion. Consequently, as mentioned before, the alternate text selections usually obscure the user's view of the text being edited. As a result, there has arisen a need for a more efficient way to replace text within a body of text with alternate text selections.
The invention concerns a method and system for automatically correcting portions of text. The method of the invention involves a plurality of steps including: receiving text derived from a first user input for inclusion in a body of text; concurrently upon receipt of the first user input, and based upon the first user input, identifying a list of alternate text selections potentially intended by the user; storing each of the alternate text selections in a memory location associated with the text; and in response to a second user input, and without displaying the list of alternate text selections to the user, automatically retrieving a first one of the alternate text selections from the memory location and inserting the first one of the alternate text selections in place of the text in the body of text. In response to a third user input, the method can further include the additional step of automatically replacing the first alternate text selection in the body of text with a second one of the alternate text selections.
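The steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the claimed implementation: the names `TextSpan`, `Document`, `receive`, and `replace_next` are hypothetical, and the alternate text selections are supplied by the caller instead of being produced by a real recognition engine.

```python
from dataclasses import dataclass, field


@dataclass
class TextSpan:
    """One piece of text plus the alternates stored with it (hypothetical structure)."""
    original: str
    alternates: list[str]
    shown: int = -1  # -1: original text is in the body; otherwise an index into alternates

    @property
    def current(self) -> str:
        return self.original if self.shown < 0 else self.alternates[self.shown]


@dataclass
class Document:
    spans: list[TextSpan] = field(default_factory=list)

    def receive(self, text: str, alternates: list[str]) -> int:
        """First user input: store the text together with its alternates."""
        self.spans.append(TextSpan(text, list(alternates)))
        return len(self.spans) - 1

    def replace_next(self, index: int) -> str:
        """Second or third user input: swap in the next stored alternate.

        No list is presented; the replacement happens directly in the body.
        When no further alternate exists, the current text is left in place.
        """
        span = self.spans[index]
        if span.shown + 1 < len(span.alternates):
            span.shown += 1
        return span.current

    def body(self) -> str:
        return " ".join(span.current for span in self.spans)
```

In this sketch, receiving "here" with the stored alternates "hear" and "hair" and then calling `replace_next` twice on that span puts "hear" and then "hair" into the body, without any list of alternates being shown at either step.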
Additionally, in response to a fourth user input, the invention can include the step of replacing the second alternate text selection in the body of text with at least one of the text and the alternate text selections, which has previously been included in the body of text. For example, in response to a user input, the invention would replace an alternate text selection with a previously used alternate text selection, or alternatively, replace an alternate text selection with the original text.
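One way to picture the fourth user input is as a step back through the selections that have already appeared in the body of text. The following Python sketch assumes a simple history list; the class `ReplaceableText` and its method names are hypothetical and are not taken from the invention itself.

```python
class ReplaceableText:
    """Hypothetical holder for one piece of text, its alternates, and what has been shown."""

    def __init__(self, original, alternates):
        # Index 0 is the original text; the stored alternates follow it.
        self.candidates = [original, *alternates]
        # Indices of candidates that have appeared in the body, oldest first.
        self.history = [0]

    @property
    def current(self):
        return self.candidates[self.history[-1]]

    def next_alternate(self):
        """Second or third user input: advance to the next stored alternate, if any."""
        nxt = self.history[-1] + 1
        if nxt < len(self.candidates):
            self.history.append(nxt)
        return self.current

    def previous(self):
        """Fourth user input: return to the selection that was in the body before."""
        if len(self.history) > 1:
            self.history.pop()
        return self.current

    def restore_original(self):
        """Fourth user input, alternative form: go straight back to the original text."""
        self.history = [0]
        return self.current
```

Keeping the history separate from the list of alternates is a design choice made for this sketch: it lets the same structure serve both the "previously used alternate" case and the "original text" case described above.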
Although the invention can accept a variety of suitable user inputs, one advantageous embodiment can accept user input in the form of a spoken utterance. In this embodiment, the text and the alternate text selections can be derived from the spoken utterance by a speech recognition engine. Another embodiment of the invention can derive the text from a user keyboard entry. In yet another embodiment of the invention, the second user input can include selecting the text to be replaced, and articulating a spoken command for requesting replacement of the text with one of the alternate text selections.
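As an illustration of how the text and its alternates might both be derived from a single spoken utterance, the sketch below assumes that a recognition engine returns several scored hypotheses for the utterance. That assumption, the function name `split_hypotheses`, and the score values are all hypothetical; no particular engine or API is implied.

```python
def split_hypotheses(hypotheses):
    """Split scored hypotheses into (text, alternates).

    `hypotheses` is an iterable of (candidate_text, score) pairs, assumed to come
    from a recognition engine. The best-scoring candidate becomes the text; the
    remaining distinct candidates, in descending score order, become the alternates.
    """
    ranked = sorted(hypotheses, key=lambda pair: pair[1], reverse=True)
    # dict.fromkeys keeps first-seen order, so duplicates keep their best rank.
    distinct = list(dict.fromkeys(text for text, _ in ranked))
    if not distinct:
        raise ValueError("at least one hypothesis is required")
    return distinct[0], distinct[1:]
```

For text derived from a keyboard entry there are no such hypotheses, so a different source of alternates would be needed; the sketch covers the spoken-utterance case only.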
According to a second aspect, the invention can be a system for automatically correcting portions of text in a computer speech recognition system. In that case, the system includes: programming for receiving text derived from a first user input for inclusion in a body of text; programming for identifying a list of alternate text selections potentially intended by the user concurrently upon receipt of the first user input, and based upon the first user input; programming for storing each of the alternate text selections in a memory location associated with the text; and in response to a second user input, programming for automatically retrieving a first one of the alternate text selections from the memory location, and inserting the first one of the alternate text selections in place of the text in the body of text without displaying the list of alternate text selections to a user.
Additionally, in response to a third user input, the system can include programming for automatically replacing the first alternate text selection in the body of text with a second one of the alternate text selections. Further, in response to a fourth user input, the system preferably includes programming for replacing the second alternate text selection in the body of text with at least one of the text and the alternate text selections, which has previously been included in the body of text.
Similar to the previously described method, the system can include programming to accept a variety of suitable user inputs wherein each of the user inputs may include a spoken utterance or a user keyboard entry. In the case of a spoken utterance user input, the text and the alternate text selection can be derived from the spoken utterance by a speech recognition engine. In another embodiment where the system automatically retrieves the first one of the alternate text selections and inserts the first one of the alternate text selections in place of the text in the body of text, the system can include programming which allows the user to select the text in the body of text, and articulate a spoken command for requesting replacement of the text with one of the alternate text selections.
Finally, the invention may take the form of a machine-readable storage having stored thereon a computer program having a plurality of code sections executable by a machine for causing the machine to perform a series of steps. These steps can include: receiving text derived from a first user input for inclusion in a body of text; concurrently upon receipt of the first user input, and based upon the first user input, identifying a list of alternate text selections potentially intended by a user; storing the alternate text selections in a memory location associated with the text; and in response to a second user input, and without displaying the list of alternate text selections to a user, retrieving a first one of the alternate text selections from the memory location and replacing the text with the first one of the alternate text selections.
The machine-readable storage, in response to a third user input, may also cause the machine to perform the further step of replacing the first alternate text selection in the body of text with a second one of the alternate text selections. In response to a fourth user input, the machine-readable storage can also be programmed for causing the machine to perform the additional step of replacing the second alternate text selection in the body of text with at least one of the text and the alternate text selections which has previously been included in the body of text. The machine-readable storage can further be programmed for causing the machine to perform the additional step of deriving the text and each of the alternate text selections from a spoken utterance by a computer speech recognition engine, or, alternatively, deriving the text from a user keyboard entry.