When a sound is emitted in a closed space, it is usually distorted by reverberation. This degradation is detrimental to sound quality and speech intelligibility, and it significantly degrades the performance of Automatic Speech Recognition (ASR) systems. Reverberation is also harmful for most speech-related applications, such as automatic speaker recognition, automatic emotion recognition, speech detection, speech separation, pitch tracking, and speech segregation. In addition, reverberation degrades the quality of music signals and decreases the performance of music-related tasks such as music signal classification, automatic music transcription, analysis and melody detection, and source separation. Therefore, there is a great need for dereverberation methods and systems.
In room acoustics, room reverberation can be considered as the combination of early reverberation (alternatively called early reflections) and late reverberation. The early reflections arrive right after the direct sound and mainly result in a spectral degradation that is perceived as coloration. The early reflections are not considered harmful for speech intelligibility, ASR or any other signal-processing task; however, they can typically alter the signal's timbre. Late reverberation arrives after the early reverberation and produces a noise-like effect, generated by the signal's reverberant tails. Late reverberation is detrimental to the signal's quality and to the intelligibility of speech, and it severely degrades the performance of signal processing algorithms. In addition, late reverberation is responsible for a severe degradation of speech intelligibility in hearing-impaired listeners, even when they use hearing assistive devices such as hearing aids or cochlear implants.
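The early/late decomposition described above can be illustrated with a minimal Python sketch. Everything numeric here is an assumption made for illustration and does not come from the text: the synthetic impulse response (a unit direct sound followed by exponentially decaying noise), the 0.5 s decay time, and in particular the 50 ms early/late boundary. The helper names `synthetic_rir` and `split_rir` are hypothetical. The only point of the sketch is that, because convolution is linear, the reverberant signal is the sum of an early (coloration) component and a late (tail) component.

```python
import numpy as np

def synthetic_rir(fs=16000, rt60=0.5, seed=0):
    """Toy impulse response: unit direct sound plus exponentially decaying noise."""
    rng = np.random.default_rng(seed)
    n = int(fs * rt60)
    t = np.arange(n) / fs
    # Amplitude envelope falls by 60 dB (a factor of 1000) over rt60 seconds.
    envelope = np.exp(-np.log(1000.0) * t / rt60)
    h = 0.1 * rng.standard_normal(n) * envelope
    h[0] = 1.0  # direct sound
    return h

def split_rir(h, fs, boundary_ms=50.0):
    """Split h at an ASSUMED boundary into early and late parts of equal length."""
    k = int(round(fs * boundary_ms / 1000.0))
    early = np.zeros_like(h)
    late = np.zeros_like(h)
    early[:k] = h[:k]  # direct sound and early reflections (kept together here)
    late[k:] = h[k:]   # late reverberant tail
    return early, late

fs = 16000
h = synthetic_rir(fs)
h_early, h_late = split_rir(h, fs)

# White noise stands in for an anechoic signal (0.25 s long).
x = np.random.default_rng(1).standard_normal(fs // 4)

y_early = np.convolve(x, h_early)  # component perceived mainly as coloration
y_late = np.convolve(x, h_late)    # noise-like reverberant tail component
y = np.convolve(x, h)              # full reverberant signal

# By linearity of convolution, the two components add up to the full signal.
print("components sum to y:", np.allclose(y_early + y_late, y))
```

In this sketch the direct sound is grouped with the early part for simplicity; a different boundary or a separate direct-path term would be equally valid choices.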
In signal processing, when a Linear and Time Invariant system is assumed, deconvolution can typically be applied in order to suppress a convolutive distortion. Since reverberation is a convolutive distortion, deconvolution is the ideal way of confronting the reverberation problem. FIG. 1 shows a schematic illustration of ideal dereverberation via deconvolution. The anechoic signal x(n) 102 (where n is the time index) is reproduced in a closed space and is distorted by room reverberation 104. The reverberation distortion can be mathematically expressed as the time-domain convolution of the anechoic signal with the Room Impulse Response (RIR) h(n). Therefore, the reverberant signal y(n) can be obtained as:

y(n) = x(n) * h(n)  (1)

where * denotes time-domain convolution. In theory, the RIR h(n) can be blindly estimated from the reverberant signal or acoustically measured via an appropriate technique 106. This estimate or measurement of the RIR can be used to deconvolve the RIR from the reverberant signal, D(y(n)) 108, and to obtain an estimate of the clean signal x̂(n) 110. When the RIR is exactly known, the estimate x̂(n) is equal to the anechoic signal x(n). So in theory, an ideal inversion (deconvolution) of the RIR will completely remove the effect of both early reflections and late reverberation. However, there are several problems with this ideal approach. First of all, typical RIRs have thousands of coefficients, and an exact blind estimation is practically impossible. Moreover, the RIR is known to have non-minimum phase characteristics, the inverse filters are to a large extent non-causal, and exact measurements of the RIR must be available for the specific source/receiver positions in the room. When the sound source is moving, the RIR constantly changes and accurate measurements are impossible.
Hence, for real-life applications, RIR measurements are not available, and other blind dereverberation options are needed that neither try to accurately estimate the RIR nor use any prior information about the acoustic channel.
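The idealized chain of equation (1) followed by deconvolution, and its sensitivity to an inexact RIR, can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the method of any particular system: the 7-tap impulse response is hypothetical and deliberately well-conditioned (real RIRs have thousands of coefficients and non-minimum phase), and frequency-domain division is only one possible realization of D(y(n)). The function names `reverberate` and `deconvolve` are illustrative.

```python
import numpy as np

def reverberate(x, h):
    """Equation (1): y(n) = x(n) * h(n), as a full linear convolution."""
    return np.convolve(x, h)

def deconvolve(y, h_est, n_out):
    """Deconvolve by spectral division, using an estimate h_est of the RIR.

    The FFT length equals len(y), so the linear convolution in y is
    represented exactly and Y / H recovers X when h_est is exact.
    """
    n_fft = len(y)
    Y = np.fft.rfft(y, n_fft)
    H = np.fft.rfft(h_est, n_fft)
    return np.fft.irfft(Y / H, n_fft)[:n_out]

rng = np.random.default_rng(0)
x = rng.standard_normal(256)  # stand-in for the anechoic signal x(n)

# Hypothetical toy RIR: direct sound plus a few weak reflections.
# Its spectrum has no near-zero bins, so plain division is numerically safe.
h = np.array([1.0, 0.0, 0.3, 0.0, 0.0, 0.2, 0.1])
y = reverberate(x, h)

# Case 1: the RIR is exactly known.
x_hat_exact = deconvolve(y, h, len(x))

# Case 2: the RIR estimate is slightly wrong (one tap off by 0.05).
h_wrong = h.copy()
h_wrong[2] += 0.05
x_hat_wrong = deconvolve(y, h_wrong, len(x))

err_exact = np.max(np.abs(x_hat_exact - x))
err_wrong = np.max(np.abs(x_hat_wrong - x))
print("max error, exact RIR:     ", err_exact)
print("max error, mismatched RIR:", err_wrong)
```

With the exact RIR the estimate matches the anechoic signal up to floating-point rounding, while even a small error in a single tap leaves a clearly visible residual. For a realistic RIR, plain spectral division would additionally run into near-zero spectral bins, which this toy example avoids by construction.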
Blind dereverberation (i.e., dereverberation without any prior knowledge other than the reverberant signal) is a difficult task and it produces signal processing artifacts. Hence, the output signal is often of insufficient quality. Despite engineering efforts, the dereverberated signals often fail to improve signal quality and speech intelligibility. In many cases, blind dereverberation methods produce artifacts that are more harmful than the original reverberation distortion. Accordingly, a need exists to overcome the above-mentioned drawbacks and to provide a method and a system for significant dereverberation of digital signals without producing processing artifacts.
Typical dereverberation methods confront either the early or the late reverberation problem. In order to tackle reverberation as a whole, early and late reverberation suppression methods have been used sequentially. An early reverberation suppression method is typically used as a first step to reduce the early reflections. Usually, in a second step, a late reverberation suppression approach suppresses the signal's reverberant tail. However, early and late reverberation suppression methods have not been used in parallel. The goal of processing early and late reverberation in parallel, or of combining multiple late/early reverberation estimation methods, is to provide new artifact-free clean signal estimations.
In addition, the required amount of dereverberation strongly depends on the room acoustic characteristics and the source-receiver position or positions. Dereverberation algorithms should inherently include an estimation of relevant room acoustic characteristics and also estimate the correct suppression rate (e.g., the amount and steepness of dereverberation), given that for a moving source or receiver the acoustic environment constantly changes. An incorrect reverberation suppression rate causes processing artifacts. Therefore, taking into consideration the acoustic environment (e.g., room characteristics such as dimensions and materials, acoustic interferences, source location, receiver location, etc.), there is a need for a method of controlling the reverberation suppression rate, either by a user or automatically.