1. Technical Field
This invention relates to the field of computer speech dictation and more particularly to a method and system for proofreading and correcting dictated text in an electronic document.
2. Description of the Related Art
Speech technologies are continually making the interface between humans and multimedia computers more natural and efficient. Until recently, most dictation products relied on discrete speech. Discrete speech systems restricted dictation to single, discretely spoken words with a pause between each word. The latest products, however, understand continuous speech, so that the user can speak at a more or less normal rate. Continuous speech products, as would be expected, require more computing power than discrete speech recognition products do. There are two categories of PC-based continuous speech recognition software: dictation and command recognition. Speech dictation is the more compelling of the two.
An effective speech dictation program possesses the potential for making obsolete the traditional word processor. In contrast to the traditional word processor, in a speech dictation system, the user merely speaks into a microphone or other suitable voice gathering device, and watches the computer magically transform the spoken words into text on screen. When using speech dictation, a user can produce a document essentially without a keyboard using computer based voice recognition. Typically, the user can dictate the bulk of the text directly into the speech dictation system. Thereafter, the user can copy and paste the dictated text directly into a word processor. A few subsequent edits can produce a finished document.
All dictation programs include a dictionary, although the user must add to the dictionary words unknown to the speech dictation program, such as technical terms or proper names. In addition, the speech dictation program can require the user to dictate all punctuation marks, capitalization, and new paragraph breaks. Moreover, the user of a speech dictation system must adopt a dictation style that distinguishes between text and formatting instructions. Some speech dictation systems require the user to dictate text into a proprietary word processor before cutting and pasting the results into the regular word processing or other application. Other speech dictation systems provide for direct dictation into particular word processing programs.
There are three major components to the complete speech dictation process: text input, proofreading, and correction. The shift from discrete to continuous dictation has resulted in significant improvement to the speed of text input, from about 70 to 110 words per minute for reading text for transcription. Still, in composing a document using speech dictation, the user must first form the base idea for the document; the user must elaborate or refine that idea; the idea must be described and connected in a coherent form; vocabulary must be carefully chosen; and the grammar, syntax, and the very appearance of words on the page must be carefully prepared. Thus, attempting to publish a document, even if using a speech dictation tool, can prove to involve a great deal of intellectual and manual labor. Additionally, if the manuscript requires revision, the labor involved in proofreading and correction can become repetitive. In consequence, many still produce documents directly, manually performing thousands of keystrokes.
Thus, it is apparent that current speech dictation systems do not effectively address the proofreading and correction components of the speech dictation process. Focus on the proofreading and correction process could otherwise result in a significant reduction in the time required per correction. Hence, an effective proofreading and correction system would significantly improve dictation throughput in terms of correct words per minute. Proofreading, however, is a process that is wholly lacking in present computerized speech dictation systems.
The invention concerns a method and system for proofreading and correcting dictated text. The invention as taught herein has advantages over all known methods now used to proofread and correct dictated text, and provides a novel and non-obvious system, including apparatus and method, for proofreading and correcting dictated text. A method of proofreading and correcting dictated text contained in an electronic document comprises the steps of: selecting proofreading criteria for identifying textual errors contained in the electronic document; playing back each word contained in the electronic document; and, marking as a textual error each played back word in nonconformity with at least one of the proofreading criteria.
The selecting step can include specifying a low confidence word threshold below which any word will be identified as a textual error; enabling homonym and confusable word criteria whereby any homonym and confusable word will be identified as a textual error; and, specifying a word grade level above which any word will be identified as a textual error. The selecting step can also include generating a grammar rules check list for reference by a grammar checker; and, enabling grammar checking whereby any word or phrase inconsistent with the grammar rules will be identified as a textual error.
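The selecting step described above can be illustrated with a minimal Python sketch. All names here (ProofreadingCriteria, Word, HOMONYMS, violations), the default values, the 0-to-1 confidence scale, and the small homonym table are assumptions made for illustration; the source does not specify any of them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProofreadingCriteria:
    """Hypothetical container for the criteria chosen in the selecting step."""
    low_confidence_threshold: float = 0.5  # words scoring below this are flagged
    check_homonyms: bool = True            # flag homonyms and confusable words
    max_grade_level: int = 8               # words graded above this are flagged


@dataclass
class Word:
    text: str
    confidence: float  # recognizer confidence, assumed to lie in [0, 1]
    grade_level: int   # reading grade level, assumed to be known per word


# Tiny illustrative table; a real system would need a far larger one.
HOMONYMS = {"their", "there", "they're", "to", "too", "two"}


def violations(word: Word, criteria: ProofreadingCriteria) -> List[str]:
    """Return the name of every enabled criterion that the word fails."""
    failed = []
    if word.confidence < criteria.low_confidence_threshold:
        failed.append("low confidence")
    if criteria.check_homonyms and word.text.lower() in HOMONYMS:
        failed.append("homonym or confusable word")
    if word.grade_level > criteria.max_grade_level:
        failed.append("grade level")
    return failed
```

A word for which violations returns a non-empty list would be treated as a textual error; the returned names could also serve as the explanation shown to the user later.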
The playing back step can include highlighting each word contained in the electronic document; and, visually displaying each highlighted word in a user interface. In addition, the displaying step can include visually displaying immediately before the visually displayed highlighted word at least one word preceding the highlighted word in the electronic document; and, visually displaying immediately after the visually displayed highlighted word at least one word succeeding the highlighted word in the electronic document. Moreover, the playing back step can further include providing user voice audio playback using user voice data corresponding to each highlighted word in the electronic document in coordination with the visually displaying step; generating text-to-speech audio playback for each highlighted word in the electronic document not having corresponding user voice data; and, providing the text-to-speech audio playback in coordination with the visually displaying step.
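As a rough sketch of the playing back step, the following Python fragment renders each highlighted word together with its neighbouring words and chooses an audio source for it. The function names, the bracket notation used for highlighting, and the representation of user voice data as a dictionary keyed by word position are all illustrative assumptions, and no actual audio is produced.

```python
from typing import Dict, List, Tuple


def context_window(words: List[str], index: int, before: int = 1, after: int = 1) -> str:
    """Render the highlighted word with its neighbouring words on each side."""
    start = max(0, index - before)
    end = min(len(words), index + after + 1)
    parts = []
    for i in range(start, end):
        # Brackets stand in for visual highlighting in this sketch.
        parts.append(f"[{words[i]}]" if i == index else words[i])
    return " ".join(parts)


def audio_source(index: int, user_audio: Dict[int, bytes]) -> str:
    """Prefer the user's recorded voice; otherwise fall back to text-to-speech."""
    return "user voice" if index in user_audio else "text-to-speech"


def play_back(words: List[str], user_audio: Dict[int, bytes]) -> List[Tuple[str, str]]:
    """Pair each word's display string with the audio source chosen for it."""
    return [
        (context_window(words, i), audio_source(i, user_audio))
        for i in range(len(words))
    ]
```

Words at the start or end of the document simply receive a shorter context window, since there is no preceding or succeeding word to display.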
The marking step can comprise manually marking as a textual error each replayed word suspected of violating at least one of the proofreading criteria. In addition, the marking step can include automatically marking as a textual error each replayed word inconsistent with the proofreading criteria. The marking step can further include manually marking as a textual error each replayed word suspected of violating at least one of the proofreading criteria, the manual marking step occurring simultaneously with the automatic marking step.
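One way to picture automatic and manual marking operating together is sketched below in Python. The function mark_words, the auto_check callback, and the use of word positions to represent manual marks are hypothetical; true concurrency is not modelled, and the sketch only shows how marks from both sources might be merged as each word is replayed.

```python
from typing import Callable, Dict, Iterable, List, Set


def mark_words(
    words: List[str],
    auto_check: Callable[[str], bool],
    manual_marks: Iterable[int] = (),
) -> Dict[int, Set[str]]:
    """Map each marked word position to the source or sources that marked it."""
    marks: Dict[int, Set[str]] = {}
    manual = set(manual_marks)
    for i, word in enumerate(words):
        # Automatic marking: the word fails the (assumed) criteria check.
        if auto_check(word):
            marks.setdefault(i, set()).add("automatic")
        # Manual marking: the user flagged this position during playback.
        if i in manual:
            marks.setdefault(i, set()).add("manual")
    return marks
```

Recording the source of each mark keeps a word that was flagged both ways as a single entry rather than a duplicate.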
The method as described herein can further comprise the step of editing each marked textual error identified in the marking step. In particular, the editing step can include reviewing each marked textual error identified in the marking step; accepting user specified changes to each marked textual error reviewed in the reviewing step; and, unmarking each marked textual error corrected by the user in the accepting step. Also, the reviewing step can include highlighting each word in the electronic document corresponding to the marked textual error marked in the marking step; and, displaying an explanation for each marked textual error in a user interface. Moreover, the reviewing step can further include suggesting a recommended change to the marked textual error; displaying the recommended change in the user interface; and, accepting a user specified preference to substitute the recommended change for the marked textual error. The editing step can further include the step of unmarking each marked textual error corresponding to a user command to unmark the marked textual error.
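The editing step can be sketched as a small data structure that tracks marks and clears them as the user works through the document. The class and method names below (Mark, MarkedDocument, accept_change, unmark) are assumptions for illustration only and are not taken from the source.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Mark:
    reason: str                       # explanation displayed to the user
    suggestion: Optional[str] = None  # recommended change, if one exists


@dataclass
class MarkedDocument:
    words: List[str]
    marks: Dict[int, Mark] = field(default_factory=dict)

    def accept_change(self, index: int, new_text: Optional[str] = None) -> None:
        """Apply the user's text (or the suggestion) and clear the mark."""
        mark = self.marks[index]
        replacement = new_text if new_text is not None else mark.suggestion
        if replacement is None:
            raise ValueError("no replacement text and no suggestion available")
        self.words[index] = replacement
        del self.marks[index]

    def unmark(self, index: int) -> None:
        """Dismiss a mark on user command without changing the word."""
        self.marks.pop(index, None)
```

Calling accept_change with no text substitutes the recommended change, calling it with text applies a user specified change, and unmark covers the case where the user decides the marked word was correct after all.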
An electronic system for proofreading and correcting dictated text in an electronic document can comprise: a proofreading tool for identifying and correcting textual errors in the electronic document; a proofreading options interface for storing proofreading criteria for use with the proofreading tool; and, a control panel for interacting with the proofreading tool. The electronic system can further comprise a voice command processor for controlling the user interface.
The proofreading tool can include a playback system for playing back the dictated text; a marking tool for identifying and marking textual errors contained in the dictated text; and, a mark processor for editing the marked textual errors identified by the marking tool. Specifically, the playback system can include a highlighter for sequentially distinguishing each word contained in the dictated text; means for providing user voice audio playback for the distinguished words having corresponding user voice data; and, a text-to-speech generator for producing audio playback for distinguished words not having corresponding user voice data required by the user voice audio playback means.
The marking tool can include any combination of the following three components. In one embodiment, the marking tool can have an automated marking tool for automatically identifying and marking textual errors exceeding thresholds specified in the proofreading criteria. In another embodiment of the present invention, the marking tool can have a manual marking tool for manually identifying and marking a textual error in response to a user command to mark the textual error. In yet another embodiment, the marking tool can further include the automated marking tool and the manual marking tool whereby the automated marking tool can operate concurrently with the manual marking tool. Moreover, in yet another embodiment, the marking tool can further include a grammar checker for identifying grammatical errors contained in the electronic document.
The mark processor can comprise a highlighter for sequentially distinguishing each word contained in the dictated text identified and marked as a textual error by the marking tool; an explanation engine having explanations for each textual error; messaging means for transmitting the explanations to the control panel; and, means for editing the textual error. The mark processor can further include a suggestion engine having suggested corrections to each textual error; and, messaging means for transmitting the suggested corrections to the control panel.
The proofreading options interface can include a low confidence word control for specifying a low confidence word threshold below which any word will be identified as a textual error; a homonyms and confusable words switch for enabling the marking of homonyms and confusable words; and, a word grade level control for specifying a word grade level above which any word will be identified as a textual error. In another embodiment, the proofreading options interface can include a grammar rules control interface containing grammar rules for reference by a grammar checker; and, a grammar rules switch for enabling marking of words or phrases inconsistent with the grammar rules by the grammar checker.
The control panel can include a mark problems view for controlling the marking tool; and, a work with marks view for controlling the mark processor. The mark problems view can include a playback speed control for prescribing a rate of playback by the playback system; a pause button for accepting a command to pause the playback of the dictated text; a mark button for accepting a user command to manually mark the displayed word as a textual error; message transmitting means for transmitting the mark command to the marking tool; message transmitting means for transmitting the prescribed rate and the pause command to the playback system; message receiving means for receiving each word played back by the playback system; and, a text window for displaying each word received by the message receiving means.
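The message flow between the mark problems view and the playback system might look roughly like the following Python sketch. The dictionary-based message format, the words-per-minute unit for the playback rate, the default rate, and the "resume" message are all assumptions introduced for illustration; the source names only the speed control, the pause button, and the message transmitting and receiving means.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PlaybackSystem:
    words: List[str]
    words_per_minute: float = 120.0  # assumed unit and default for the rate
    paused: bool = False
    position: int = 0

    def handle(self, message: dict) -> None:
        """Apply a message transmitted by the mark problems view."""
        kind = message["type"]
        if kind == "set_rate":
            rate = message["words_per_minute"]
            if rate <= 0:
                raise ValueError("playback rate must be positive")
            self.words_per_minute = rate
        elif kind == "pause":
            self.paused = True
        elif kind == "resume":  # hypothetical counterpart to the pause command
            self.paused = False
        else:
            raise ValueError(f"unknown message type: {kind}")

    def seconds_per_word(self) -> float:
        """Time each word stays highlighted at the prescribed rate."""
        return 60.0 / self.words_per_minute

    def next_word(self) -> Optional[str]:
        """Return the next word for the text window, or None if paused or done."""
        if self.paused or self.position >= len(self.words):
            return None
        word = self.words[self.position]
        self.position += 1
        return word
```

In this arrangement the view never touches the playback state directly; it only sends messages and displays whatever word comes back, which mirrors the separation between the control panel and the proofreading tool.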
The work with marks view can include message receiving means for receiving data from the mark processor; and, a status line for displaying an explanation generated by an explanation engine and received by the message receiving means. The work with marks view can further include a suggestion panel for displaying a suggested correction generated by a suggestion engine and received by the message receiving means; a suggestion button for accepting a user specified preference to substitute the suggested correction for the marked textual error; and, message transmitting means for transmitting the substitution preference to the mark processor.