The present disclosure relates to the field of audio data transcription. More specifically, the present disclosure relates to speaker separation in diarizing audio data for transcription.
Audio data is often received as a single mono audio file even though the file contains speech from multiple speakers. If the audio file can be segmented into audio data attributed to each individual speaker, then further analysis techniques can be employed that are specifically directed to transcribing the speech of a particular speaker. This more focused approach can result in more accurate transcription. Therefore, there is a need for improved speaker separation within audio data.