The present invention relates generally to the field of speech recognition, and more particularly to speech recognition training, including acoustic model training.
Speech recognizers convert speech (i.e., spoken language) to written language and typically use an acoustic model to represent the relationship between an audio signal and the phonemes or other linguistic units that make up speech. Acoustic models are generally created from training data that includes a set of audio recordings and their corresponding transcripts.