Speech-to-text translation systems typically include a variety of audio sources that produce a multitude of individual speech records. These speech records are translated to text using any of a wide variety of methods. Sometimes an utterance may be translated into two different words, and the speech-to-text translation system must decide which of the translations is correct.
Often, a probability of correct translation is determined for each word or utterance. Words with a low probability of correct translation may be re-processed using a different speech-to-text translation method or may be flagged for later processing. Metadata may accompany the speech records, and a review of the metadata may be useful in determining the correct translation. For example, the metadata may include the identities of the speakers, which would allow inferences to be made about the type of speech within the record. Those inferences may then be used to adjust the probabilities of correct translation.
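
By way of illustration only, the following minimal sketch shows one way the steps described above might fit together: competing translations of an utterance carry probabilities, speaker metadata is used to adjust those probabilities, and a low-probability result is flagged for later processing. The names Candidate, ROLE_VOCABULARY, adjust_with_metadata, and choose, the "speaker_role" metadata key, the boost factor, and the 0.6 threshold are all hypothetical and are not taken from any particular system.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    word: str
    probability: float  # estimated probability that this translation is correct


# Hypothetical lookup: words assumed to be more likely for a given speaker role.
ROLE_VOCABULARY = {
    "physician": {"ileum", "suture"},
}


def adjust_with_metadata(candidates, metadata, boost=2.0):
    """Reweight candidates that fit the speaker's role, then renormalize.

    The multiplicative boost is an illustrative assumption, not a prescribed method.
    """
    vocab = ROLE_VOCABULARY.get(metadata.get("speaker_role"), set())
    weights = [c.probability * (boost if c.word in vocab else 1.0) for c in candidates]
    total = sum(weights)
    if total == 0:
        return list(candidates)
    return [Candidate(c.word, w / total) for c, w in zip(candidates, weights)]


def choose(candidates, threshold=0.6):
    """Return the most probable candidate and whether it should be flagged."""
    best = max(candidates, key=lambda c: c.probability)
    return best, best.probability < threshold


# Two competing translations of the same utterance.
candidates = [Candidate("ilium", 0.55), Candidate("ileum", 0.45)]

# Without metadata the best candidate falls below the threshold and is flagged.
best_raw, flagged_raw = choose(adjust_with_metadata(candidates, {}))

# With speaker metadata the probabilities shift toward the role's vocabulary.
best_adj, flagged_adj = choose(
    adjust_with_metadata(candidates, {"speaker_role": "physician"})
)
```

In this sketch the adjusted probabilities are renormalized so that they still sum to one across the competing translations. A flagged result could instead be routed to a different speech-to-text translation method rather than held for later processing.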