One solution for suppressing a noise component included in a captured voice signal is the spectral subtraction method. Also called the frequency subtraction method, it subtracts a noise spectrum from the spectrum of a voice signal containing noise.
However, although spectral subtraction is effective at suppressing a noise component, it may generate an abnormal sound component, i.e. musical noise, a sort of tonal noise.
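As an illustrative sketch only, the basic subtraction can be written as follows, assuming that magnitude spectra have already been computed for each frequency bin. The function name `spectral_subtract`, the `floor` parameter, and the sample values are hypothetical and are not taken from any cited method. Bins where the noise estimate exceeds the observed magnitude are clamped to the floor; isolated peaks that survive this clamping are commonly what is perceived as musical noise.

```python
def spectral_subtract(noisy_mag, noise_mag, floor=0.0):
    """Subtract an estimated noise magnitude spectrum from a noisy one, bin by bin.

    Results below `floor` are clamped to `floor` (hypothetical parameter),
    because a magnitude cannot be negative.
    """
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("spectra must have the same number of bins")
    return [max(x - n, floor) for x, n in zip(noisy_mag, noise_mag)]


# Hypothetical magnitudes for four frequency bins.
noisy = [5.0, 1.2, 0.8, 4.0]   # voice plus noise
noise = [1.0, 1.0, 1.0, 1.0]   # estimated noise spectrum
cleaned = spectral_subtract(noisy, noise)
# Bin 2 would be negative (0.8 - 1.0) and is clamped to 0.0.
```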
Shinya OGATA, et al., “Iterative Spectral Subtraction Method for Reduction of Musical Noise”, Proceedings of the Meeting of the Acoustical Society of Japan, pages 387-388, March 2001, discloses a method in which a signal whose noise component has been suppressed by spectral subtraction is subjected to spectral subtraction again, and this iteration is repeated a certain number of times, e.g. ten, to suppress the generated noise including musical noise.
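The iteration described above might be sketched as follows. This is a simplified illustration, not the procedure of the cited paper: the noise estimator is passed in as a hypothetical callback, and `toy_noise_estimate` is a stand-in that simply takes half of the smallest bin as the noise level. Only the fixed iteration count (ten, as in the example above) reflects the text.

```python
def iterative_spectral_subtract(noisy_mag, estimate_noise, iterations=10):
    """Apply spectral subtraction repeatedly, a fixed number of times.

    `estimate_noise` is a hypothetical callback that returns a noise
    magnitude spectrum for the current (already partly cleaned) spectrum.
    """
    current = list(noisy_mag)
    for _ in range(iterations):
        noise = estimate_noise(current)
        # Subtract and clamp at zero, as in a single subtraction pass.
        current = [max(x - n, 0.0) for x, n in zip(current, noise)]
    return current


def toy_noise_estimate(spectrum):
    """Toy estimator (assumption): half of the smallest bin, for every bin."""
    level = 0.5 * min(spectrum)
    return [level] * len(spectrum)


noisy = [5.0, 1.0, 1.0, 4.0]
result = iterative_spectral_subtract(noisy, toy_noise_estimate, iterations=10)
# With this toy estimator the residual in the noise-only bins halves on
# every pass, so after ten passes it is 0.5 ** 10 of its original level.
```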
According to the conventional iterative spectral subtraction, particularly when directivity is formed to estimate noise, an estimated noise component may be subtracted excessively. If the arrival bearing of the voice of someone other than a target speaker, namely a disturbing sound, coincides with the direction of the formed directivity, the precision of the estimated noise is so high that a single subtraction can produce a significant suppression effect. In such a case, if the number of iterations is fixed, the subtraction may be performed more often than necessary even though fewer iterations would suffice, whereby a target vocal component may also be suppressed, causing sound distortion.
By contrast, if the arrival bearing of a disturbing sound is off the direction of the formed directivity, the precision of the estimated noise component is so low that the suppression effect brought about by a single subtraction is small, and it is therefore preferable to conduct the iteration a larger number of times. However, if the number of iterations is fixed, the actual number of iterations will be smaller than the required number, and as a consequence the capability to suppress the noise component will be insufficient, although the target voice is less affected.
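The two cases can be illustrated with a deliberately simplified single-bin model. Everything in it is an assumption made for illustration: one bin holds a target level plus a noise level, and each pass subtracts `accuracy * noise`, where `accuracy` (a hypothetical parameter between 0 and 1) stands in for how well the formed directivity captures the disturbing sound.

```python
def fixed_iteration_output(target, noise, accuracy, iterations):
    """Level left in one bin after a fixed number of subtraction passes.

    Toy model (assumption): every pass subtracts the same estimate,
    accuracy * noise, and the level is clamped at zero.
    """
    estimate = accuracy * noise
    level = target + noise
    for _ in range(iterations):
        level = max(level - estimate, 0.0)
    return level


target, noise = 4.0, 2.0

# Accurate estimate: one pass already recovers the target level (4.0),
# so a fixed count of two passes cuts into the target itself.
over_subtracted = fixed_iteration_output(target, noise, accuracy=1.0, iterations=2)

# Poor estimate: the same fixed count of two passes leaves noise behind;
# four passes would be needed in this model to reach the target level.
under_subtracted = fixed_iteration_output(target, noise, accuracy=0.25, iterations=2)
```

In this model `over_subtracted` falls below the target level while `under_subtracted` stays above it, which mirrors the distortion and the insufficient suppression described above for a fixed number of iterations.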
In this way, the iterative spectral subtraction method has the drawbacks that the vocal component may become distorted and lose its naturalness each time the iteration is repeated, and that the optimal number of iterations may vary depending on the arrival bearing of the disturbing sound.