1. Field of the Invention
The present invention relates to editing documents and, more particularly, to detecting and correcting errors in draft documents produced using an automatic document transcription system or other means.
2. Related Art
It is desirable in many contexts to generate a structured textual document based on human speech. In the legal profession, for example, transcriptionists transcribe testimony given in court proceedings and in depositions to produce a written transcript of the testimony. Similarly, in the medical profession, transcripts are produced of diagnoses, prognoses, prescriptions, and other information dictated by doctors and other medical professionals. Transcripts in these and other fields typically need to be highly accurate (as measured in terms of the degree of correspondence between the semantic content (meaning) of the original speech and the semantic content of the resulting transcript) because of the reliance placed on the resulting transcripts and the harm that could result from an inaccuracy (such as providing an incorrect prescription drug to a patient). It may be difficult to produce an initial transcript that is highly accurate for a variety of reasons, such as variations in: (1) features of the speakers whose speech is transcribed (e.g., accent, volume, dialect, speed); (2) external conditions (e.g., background noise); (3) the transcriptionist or transcription system (e.g., imperfect hearing or audio capture capabilities, imperfect understanding of language); or (4) the recording/transmission medium (e.g., paper, analog audio tape, analog telephone network, compression algorithms applied in digital telephone networks, and noises/artifacts due to cell phone channels).
The first draft of a transcript, whether produced by a human transcriptionist or an automated speech recognition system, may therefore include a variety of errors. Typically it is necessary to proofread and edit such draft documents to correct the errors contained therein. Transcription errors that need correction may include, for example, any of the following: missing words or word sequences; excessive wording; misspelled, mistyped, or misrecognized words; missing or excessive punctuation; and incorrect document structure (such as incorrect, missing, or redundant sections, enumerations, paragraphs, or lists).
Furthermore, formatting requirements may make it necessary to edit even phrases that have been transcribed correctly so that such phrases comply with the formatting requirements. For example, abbreviations and acronyms may need to be fully spelled out. This is one example of a kind of “editing pattern” that may need to be applied even in the absence of a transcription error.
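By way of illustration only, an editing pattern of this kind might be sketched in Python as follows. The abbreviation table, the function name expand_abbreviations, and the sample sentence are hypothetical and are not drawn from any particular transcription system; the sketch merely shows that such an edit can apply to text that was transcribed correctly.

```python
import re

# Hypothetical abbreviation table (an assumption for illustration); in practice
# the entries would come from the applicable formatting requirements.
EXPANSIONS = {
    "BP": "blood pressure",
    "CHF": "congestive heart failure",
}


def expand_abbreviations(text, expansions=EXPANSIONS):
    """Replace each whole-word abbreviation with its fully spelled-out form."""
    if not expansions:
        return text
    # Try longer abbreviations first so a longer entry wins over a shorter prefix.
    keys = sorted(expansions, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(k) for k in keys) + r")\b")
    return pattern.sub(lambda m: expansions[m.group(0)], text)


draft = "Patient has CHF and elevated BP."
print(expand_abbreviations(draft))
```

The word-boundary anchors keep the pattern from firing inside longer tokens, so a string such as "BPM" would be left unchanged under this sketch.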
Such error correction is typically performed by human proofreaders and can be tedious, time-consuming, costly, and itself error-prone. Furthermore, many error patterns occur frequently across documents, and the need to correct them repeatedly may create a significant level of discontent among proofreaders. What is needed, therefore, are improved techniques for correcting errors in draft documents.