The conversion of verbal dictation into a concise written format is an integral part of business in many sectors of society. For instance, because of the growing volume of audio medical records, medical transcription is currently estimated to be a multibillion-dollar industry. With the steady increase in the size and complexity of healthcare and the desire to minimize the costs associated with routine practices, there is a strong push to automate such practices, for example by transcribing dictation with automatic speech recognition (ASR).
The final documents generated by transcription services differ greatly from the initial ASR output for a number of reasons. In addition to problems with the doctor's speech and common ASR errors (e.g., disfluencies, omission of function words, and wrong word guesses), the final document follows conventions that are generally not dictated (e.g., section headings, preamble, enumerated lists, medical terminology, and various pieces of additional structure). Traditional ASR has not focused on some of these issues, which are extremely important in fields such as medical transcription that require a specific format and a high degree of specialization.
Additionally, common phrases are identified to improve transcription. Common phrases are sequences of words that appear together very frequently in certain sets of documents. When speech recognition is performed, parts of the output often come very close to these common phrases but contain a few incorrect words. Rather than allowing these errors to propagate, the common phrases aspect of the disclosure corrects such near matches to the wording of the common phrase.
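The common-phrase correction described above can be illustrated with a minimal sketch. It is not the method of the disclosure, only one simple way such a correction might work under stated assumptions: the ASR output and the common phrases are already split into words, a near match may differ from the phrase only by substituted words (not inserted or dropped ones), and the function name correct_common_phrases and the max_errors threshold are hypothetical names chosen for the example.

```python
def correct_common_phrases(tokens, common_phrases, max_errors=1):
    """Replace near matches of known phrases in a list of ASR output words.

    Illustrative assumption: a near match has the same length as the phrase
    and differs from it in at most `max_errors` word positions.
    """
    corrected = []
    i = 0
    while i < len(tokens):
        replacement = None
        for phrase in common_phrases:
            # Skip phrases too short for the threshold to be meaningful;
            # otherwise almost any window would count as a near match.
            if len(phrase) <= 2 * max_errors:
                continue
            window = tokens[i:i + len(phrase)]
            if len(window) < len(phrase):
                continue
            # Count positions where the recognized word differs from the phrase.
            errors = sum(1 for heard, expected in zip(window, phrase)
                         if heard != expected)
            if errors <= max_errors:
                replacement = phrase
                break
        if replacement is not None:
            # Emit the phrase wording and skip past the corrected window.
            corrected.extend(replacement)
            i += len(replacement)
        else:
            corrected.append(tokens[i])
            i += 1
    return corrected


if __name__ == "__main__":
    # Hypothetical common phrase and ASR output, made up for illustration.
    phrases = [["the", "patient", "denies", "chest", "pain"]]
    asr_output = "today the patient denies chest pane or fever".split()
    print(" ".join(correct_common_phrases(asr_output, phrases)))
    # prints: today the patient denies chest pain or fever
```

A practical system would likely also need to handle inserted or omitted words, for instance with an edit-distance comparison, and would derive the phrase list and threshold from the document set in question rather than fixing them by hand.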