The present invention relates to techniques for identifying audio content in a data stream received by a computer system.
Voice-recognition techniques, such as those used by voice-recognition software in call centers, are becoming increasingly popular. These techniques facilitate a variety of applications by enabling users to provide verbal information to computer systems. For example, automated transcription software allows users, such as healthcare providers, to dictate voice messages that are subsequently converted into text.
However, the performance of existing voice-recognition applications is often highly sensitive to audio quality. Consequently, these applications are often optimized for use with high-quality audio data. Unfortunately, the quality of the audio data used by many applications, such as audio data received by handheld devices that communicate wirelessly, varies considerably. For example, the audio quality associated with cellular telephones can vary considerably from one phone call to another, or even as a function of time within the same phone call. This variability often limits the usefulness of existing voice-recognition techniques with such handheld devices.