The transfer of verbal dictation to a concise written format is an integral part of business in many parts of society. For instance, due to the growing volume of dictated medical records, medical transcription is currently estimated to be a multibillion-dollar industry. With the steady increase in the size and complexity of healthcare and the desire to minimize the costs associated with routine practices, there is a strong push to automate such practices, for example by applying automatic speech recognition (ASR) to dictation.
The final documents generated by transcription services differ greatly from the initial ASR output due to a number of inherent problems. Briefly, in addition to problems with the doctor's speech and common ASR problems (e.g., disfluencies, omission of function words, and wrong word guesses), the final document follows conventions that are generally not dictated (e.g., section headings, preamble, enumerated lists, medical terminology, and various pieces of additional structure). Traditional ASR has not focused on some of these issues, which are extremely important in fields such as medical transcription that require a specific format and a high degree of specialization.
Additionally, there are a number of reasons why it is important to know the document type of a transcription job. It helps to determine which template should be used for transcribing the document, which workflow rules should be applied, and whether the transcriptionist is qualified to transcribe the content. When performing speech recognition, knowing the job's document type can help narrow the language model, assist in heading detection, and assist in punctuation insertion. For these reasons, it is beneficial to know the document type before any speech recognition or transcription is performed on a job.
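
The uses of the document type described above can be illustrated with a minimal sketch. It assumes the document type is already known and simply looks up the template, workflow, language model, and required transcriptionist skill for that type. All names and values here (`JobConfig`, `CONFIG_BY_DOCUMENT_TYPE`, the document types, and the file and model names) are hypothetical and chosen only for illustration; they do not describe any particular system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobConfig:
    """Settings that depend on a job's document type (hypothetical fields)."""
    template: str          # template used for the final document
    workflow: str          # workflow rule set applied to the job
    language_model: str    # narrowed language model used for recognition
    required_skill: str    # skill a transcriptionist needs for this content


# Hypothetical mapping from document type to job settings.
CONFIG_BY_DOCUMENT_TYPE = {
    "discharge_summary": JobConfig(
        "discharge_summary.tpl", "standard", "discharge_lm", "general"
    ),
    "operative_report": JobConfig(
        "operative_report.tpl", "priority", "surgical_lm", "surgery"
    ),
}

# Fallback used when the document type is unknown.
DEFAULT_CONFIG = JobConfig("generic.tpl", "standard", "general_medical_lm", "general")


def configure_job(document_type: str) -> JobConfig:
    """Return the settings for a document type, or the default if unknown."""
    return CONFIG_BY_DOCUMENT_TYPE.get(document_type, DEFAULT_CONFIG)


def is_qualified(transcriptionist_skills: set, config: JobConfig) -> bool:
    """Check whether a transcriptionist has the skill the job requires."""
    return config.required_skill in transcriptionist_skills
```

In this sketch, a job identified as an operative report would be given the surgical language model and routed only to transcriptionists whose skills include `"surgery"`, while a job of unknown type falls back to the general settings.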