Transcription in the linguistic sense is a systematic representation of language in written form. The source of a transcription can either be utterances (e.g., speech or sign language) or preexisting text in another writing system.
In the academic discipline of linguistics, transcription is an essential part of the methodologies of phonetics, conversation analysis, dialectology and sociolinguistics. It also plays an important role in several subfields of speech technology. Common examples of transcription outside of academia include the proceedings of a court hearing, such as a criminal trial (by a court reporter), a physician's recorded voice notes (medical transcription), aids for persons with hearing impairments, and the like.
Recently, transcription services have become commonly available to interested users through various online web sources. Examples of such web sources include rev.com, transcribeMe®, and similar services, where audio files are uploaded and distributed via a marketplace to a plurality of individuals, who are either freelancers or employed by the web source operator, to transcribe the audio file. However, it can be difficult to properly analyze an audio file in an automated fashion. These audio files are heterogeneous by nature with regard to speaker type, accent, background noise within the file, context, and subject matter of the audio. Thus, assessing the file contents and determining the cost and effort required for proper transcription often requires human involvement, which can be time consuming, inefficient and costly. Therefore, a more efficient way of assessing an audio file for transcription purposes is desired.