As teleconferencing and mobile communication technologies have become more widespread, the delivery of optimal sound quality in such systems has been the subject of much research. In these applications, secondary noise sources and effects of the acoustic environment can degrade the quality of the transmitted speech. In general, the problem of speech enhancement for communication can be divided into at least three distinct, but related, tasks: noise suppression, echo cancellation, and dereverberation. In each case, the goal may be to enhance speech intelligibility. While attempts have been made to develop algorithms for the task of dereverberation, a different approach may be required.
One approach to dereverberation can be to filter a signal with some approximation of the inverse filter of the acoustic space in which the audio was recorded. This can work nearly exactly if the impulse response of the acoustic space is both known in advance and minimum phase. In practice, however, it is unlikely that these conditions will be met. One rarely has access to an impulse response measurement, and even when an impulse response is available, it is seldom minimum phase. Some form of approximation may therefore be required in estimating the inverse filter.
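As an illustrative sketch only, the minimum-phase condition and the corresponding inverse filter may be expressed as follows. The function names `is_minimum_phase` and `inverse_filter` are hypothetical, the impulse response `h` is assumed to be a known, finite-length sequence with a nonzero first tap, and only NumPy is used. The recursion realizes 1/H(z) causally, which is stable only when every zero of H(z) lies inside the unit circle.

```python
import numpy as np

def is_minimum_phase(h):
    """Return True if every zero of the FIR response h lies strictly
    inside the unit circle (assumes h[0] != 0)."""
    zeros = np.roots(h)
    return bool(np.all(np.abs(zeros) < 1.0))

def inverse_filter(y, h):
    """Recover x from y = h * x by running the causal recursion for 1/H(z).

    x[n] = (y[n] - sum_{k>=1} h[k] * x[n - k]) / h[0]
    Hypothetical helper: stable only when h is minimum phase.
    """
    x = np.zeros(len(y))
    for n in range(len(y)):
        acc = y[n]
        for k in range(1, min(n, len(h) - 1) + 1):
            acc -= h[k] * x[n - k]
        x[n] = acc / h[0]
    return x
```

For a response such as [1.0, 0.5, 0.06], whose zeros lie at -0.2 and -0.3, the recursion recovers the source from the convolved signal. For a response such as [0.5, 1.0], whose zero lies at -2, the check fails and the same recursion would amplify rounding error geometrically with signal length.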
There is a class of methods that make use of multiple microphones to accomplish dereverberation. When recordings from multiple microphones distributed in an acoustic space are available, a degree of reverberation removal can be accomplished by identifying the direction of arrival of the direct signal and attenuating interfering signals coming from other directions. However, multiple-microphone recordings may rarely be available.
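One simple multi-microphone technique of this kind is delay-and-sum beamforming, sketched below under stated assumptions; it is not necessarily the technique used by the methods referred to above. The function name `delay_and_sum` is hypothetical, and the per-microphone delays of the direct signal are assumed to be already known (for example, from a direction-of-arrival estimate) and to be non-negative whole numbers of samples.

```python
import numpy as np

def delay_and_sum(mics, delays):
    """Align each channel on the direct signal and average.

    mics:   array of shape (num_mics, num_samples)
    delays: assumed-known arrival delay of the direct signal at each
            microphone, in whole samples (non-negative integers)
    """
    mics = np.asarray(mics, dtype=float)
    n = mics.shape[1]
    out = np.zeros(n)
    for channel, d in zip(mics, delays):
        aligned = np.zeros(n)
        aligned[: n - d] = channel[d:]  # advance the channel by d samples
        out += aligned
    return out / len(mics)
```

After alignment the direct signal adds coherently, while sound arriving from other directions is misaligned across channels and is partially averaged out.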
The reduction of outside noise sources may often be achieved by spectral subtraction methods. These methods may rely on an estimate of the background noise spectrum gathered during a pause in audio activity and a subtraction of that noise spectrum from the spectrum of the reverberant recording. This may work well for stationary broadband noise. In contrast, interference from reverberation can be both time varying and dependent upon the preceding source audio signal, possibly making such noise reduction algorithms less suited to the task of dereverberation.
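A minimal sketch of magnitude spectral subtraction on a single frame may look as follows. The function names `estimate_noise_mag` and `spectral_subtract` are hypothetical, windowing and overlap-add are omitted for brevity, and flooring the subtracted magnitude at zero is an assumption of this sketch rather than a detail taken from the description above.

```python
import numpy as np

def estimate_noise_mag(pause_frames):
    """Average magnitude spectrum over frames captured during a pause.

    pause_frames: array of shape (num_frames, frame_length)
    """
    return np.mean(np.abs(np.fft.rfft(pause_frames, axis=1)), axis=0)

def spectral_subtract(frame, noise_mag):
    """Subtract the noise magnitude from one frame's magnitude spectrum,
    floor the result at zero, and resynthesize with the original phase."""
    spec = np.fft.rfft(frame)
    mag = np.maximum(np.abs(spec) - noise_mag, 0.0)
    phase = np.angle(spec)
    return np.fft.irfft(mag * np.exp(1j * phase), n=len(frame))
```

Because the noise estimate is a fixed spectrum, the sketch implicitly assumes the interference is stationary, which is the limitation noted above for reverberation.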
Existing methods may employ a statistical model for reverberation, allowing the reverberant spectrum to be estimated based on the past audio data. This statistical model corresponds most accurately to the dense reflections encountered in the late reverberant field, as opposed to the sparse, more prominent echoes of early reverberation. Additionally, since spectral subtraction methods may modify only the amplitude spectrum of the signal and may need to use the original phases for reconstruction, they may not be capable of perfect signal reconstruction.
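The description above does not specify which statistical model is used. As one commonly assumed form, offered here only as an illustration, the late reverberant power in each frequency bin may be modeled as an exponentially decayed copy of the observed power a short time earlier, with the decay rate set by the reverberation time. The function name `late_reverb_power` and the parameters `rt60`, `hop_seconds`, and `delay_seconds` are hypothetical.

```python
import numpy as np

def late_reverb_power(power_spec, rt60, hop_seconds, delay_seconds=0.05):
    """Estimate late reverberant power from past frames.

    power_spec:    array of shape (num_frames, num_bins), observed power
    rt60:          assumed reverberation time in seconds (60 dB decay)
    hop_seconds:   time between successive frames
    delay_seconds: assumed boundary between early and late reverberation
    """
    power_spec = np.asarray(power_spec, dtype=float)
    num_frames = power_spec.shape[0]
    lag = int(round(delay_seconds / hop_seconds))
    decay_rate = 3.0 * np.log(10.0) / rt60  # amplitude decay constant
    scale = np.exp(-2.0 * decay_rate * lag * hop_seconds)  # power decay over the lag
    late = np.zeros_like(power_spec)
    if lag < num_frames:
        late[lag:] = scale * power_spec[: num_frames - lag]
    return late
```

An estimate of this kind could then take the place of the stationary noise spectrum in a spectral subtraction step. The model suits the dense late field and says little about individual early echoes, consistent with the limitation noted above.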
In recent years, attention has been focused on developing speech dereverberation techniques based on parameters of the expected clean audio. In particular, some approaches use a linear prediction model of speech. By processing the recorded sound to remove the filtering of the vocal tract, the remaining residual signal can be viewed as an approximation of the glottal pulse waveform. In clean speech, this waveform can be impulsive in nature, with short-duration, high-amplitude pulses followed by intervals of low amplitude. However, this amplitude distribution can be smeared in noisy or reverberant speech. Therefore, it is possible to develop a filtering method designed to optimize some statistic of the linear prediction residual. Common examples include maximizing the skewness or kurtosis, or minimizing the associated entropy. While the spectral subtraction methods make assumptions about the nature of the reverberation, the linear predictive modeling methods may instead restrict the model of the source data. As such, these methods depend upon the source being a single speaker, or else they may require that source separation be performed beforehand.
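The statistic being optimized can be illustrated with a short sketch that computes a linear prediction residual and its kurtosis. The function names `lpc`, `lp_residual`, and `kurtosis`, the autocorrelation method of estimating the predictor, and the prediction order of 10 are assumptions chosen for illustration. The adaptive filter that would maximize the statistic is not shown.

```python
import numpy as np

def lpc(x, order):
    """Linear prediction coefficients by the autocorrelation method.

    Returns a such that x[n] is predicted by sum_k a[k] * x[n - 1 - k].
    """
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])

def lp_residual(x, order=10):
    """Prediction error, i.e. x passed through the inverse (whitening) filter."""
    a = lpc(x, order)
    inverse = np.concatenate(([1.0], -a))
    return np.convolve(x, inverse)[: len(x)]

def kurtosis(e):
    """Non-excess sample kurtosis: about 3 for Gaussian data, larger for
    impulsive data."""
    e = e - np.mean(e)
    return np.mean(e ** 4) / np.mean(e ** 2) ** 2
```

A residual made up of sparse, high-amplitude pulses yields a large kurtosis, and smearing those pulses with a long reverberant tail pushes the value down toward that of Gaussian noise, which is why the statistic can serve as an optimization target.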
Therefore, what are needed are devices, systems and methods that overcome challenges in the present art, some of which are described above.