This application is based upon and claims priority from Korean Patent Application No. 2001-63404 filed Oct. 15, 2001, the contents of which are incorporated herein by reference.
1. Field of the Invention
The present invention relates to speech signal processing, and more particularly, to an apparatus and a method for computing a Speech Absence Probability (SAP), and to an apparatus and a method for removing noise from speech by using the computation apparatus and method.
2. Description of the Related Art
SAP refers to the probability that speech is absent in a given speech section, and is the basis for determining whether speech is absent in that section. A section deemed to have no speech is considered to contain only noise, and the variance of the noise is updated in such a section. Since the variance of the noise has a great influence on the performance of a noise removal device, more accurate computation of the SAP helps to remove the noise effectively.
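The role of the SAP in the noise variance update can be illustrated with a minimal sketch. The function name, the recursive smoothing form, the smoothing factor, and the decision threshold below are assumptions made only for illustration; they are not taken from any of the methods discussed here.

```python
def update_noise_variance(noise_var, frame_power, sap,
                          smoothing=0.95, sap_threshold=0.5):
    """Update the per-channel noise variance only in sections deemed speech-absent.

    noise_var   : previous noise variance estimate for each frequency channel
    frame_power : power |Y_k|^2 of the current section for each channel
    sap         : speech absence probability of the current section
    smoothing, sap_threshold : illustrative values (assumptions)
    """
    if sap < sap_threshold:
        # Speech is deemed present: keep the previous estimate unchanged.
        return list(noise_var)
    # Speech is deemed absent: the section is treated as noise only,
    # so the observed power is blended into the running estimate.
    return [smoothing * v + (1.0 - smoothing) * p
            for v, p in zip(noise_var, frame_power)]
```

In this sketch an inaccurate SAP has a direct cost: if a section containing speech is wrongly deemed speech-absent, speech power leaks into the noise variance estimate.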
Speech enhancement refers to improving system performance, that is, minimizing the impact of noise that degrades system performance, when an input signal or an output signal of a speech communication system is contaminated by noise. Speech enhancement is necessary for human-to-human or human-to-machine communication when a communication channel is influenced by noise or when a receiving end detects noise. In particular, speech enhancement is required when an input speech signal contaminated by noise is coded, when the performance of a speech recognition system needs to be improved, and when the quality of speech needs to be improved. Generally, speech enhancement refers to estimating a noise-free speech signal in a noisy speech environment where speech absence is uncertain. The concept of using the uncertainty of speech absence in each frequency channel of a noisy speech spectrum has been applied to improve the performance of speech enhancement systems. This concept is disclosed in a paper by Yariv Ephraim and David Malah entitled “Speech Enhancement using a Minimum Mean-Square Error Short-Time Spectral Amplitude Estimator”, published in 1984 in IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-32, No. 6, pages 1109–1121. According to the conventional method for computing the SAP shown in most studies, the SAP of each frequency channel is computed locally, irrespective of the other frequency channels. However, because each local computation uses insufficient data, the conventional method is limited in the statistical reliability it can guarantee when speech enhancement is realized.
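A local, channel-by-channel SAP of the kind described above can be sketched as follows. The sketch assumes the commonly used complex Gaussian model for the spectral components; the function names, the a posteriori SNR gamma, the a priori SNR xi, and the prior ratio q = P(speech present) / P(speech absent) are notation introduced here for illustration only.

```python
import math


def likelihood_ratio(gamma, xi):
    """Ratio p(Y_k | speech present) / p(Y_k | speech absent) for one channel.

    Assumes a complex Gaussian model (an assumption of this sketch):
    gamma = |Y_k|^2 / noise variance   (a posteriori SNR)
    xi    = speech variance / noise variance   (a priori SNR)
    """
    return math.exp(gamma * xi / (1.0 + xi)) / (1.0 + xi)


def local_sap(gammas, xis, q=1.0):
    """SAP of each frequency channel, computed irrespective of the others."""
    return [1.0 / (1.0 + q * likelihood_ratio(g, x))
            for g, x in zip(gammas, xis)]
```

Each value returned by `local_sap` depends on a single spectral component, which is why the statistical reliability of such a local estimate is limited.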
As a solution to the above problem, there is the Global Soft Decision (GSD) method disclosed in a paper by N. Kim and J. Chang entitled “Spectral enhancement based on global soft decision”, published in 2000 in IEEE Signal Processing Letters, Vol. 7, pages 108–110. The conventional GSD proved to be superior to the method used in the IS-127 standard. The GSD uses the data of all the frequency channels to determine globally whether a given time frame is a speech absence frame, and therefore uses a sufficient amount of data. Accordingly, the statistical reliability of the GSD can be higher than that of the conventional local method for computing the SAP. In addition, unlike other conventional methods, the conventional GSD estimates the noise power spectrum from noisy speech not only in speech absence frames but also in speech presence frames, so that the SAP can be computed more accurately and a robust procedure for spectral gain modification and noise spectrum estimation can be provided. One of the conventional GSD methods is disclosed under the title ‘Speech Enhancement Method’ in Korean Patent No. 99-36115. However, the conventional GSD method is based on the inaccurate assumption that the spectrum components of the frequency channels are independent of one another. As a result, the SAP cannot be computed accurately and noise cannot be removed effectively in a noisy environment.
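The effect of the independence assumption can be made concrete with a minimal sketch of a frame-level SAP. It is not the implementation of the cited GSD method; it only assumes a complex Gaussian model per channel and independent channels, so that the frame-level likelihood ratio is the product of the per-channel ratios. The function name, gamma (a posteriori SNR), xi (a priori SNR), and q (prior ratio of speech presence to speech absence) are illustrative assumptions.

```python
import math


def global_sap(gammas, xis, q=1.0):
    """Frame-level SAP under the channel-independence assumption.

    With independent channels, the log-likelihood ratio of the whole frame
    is the sum of the per-channel log-likelihood ratios.
    """
    log_lr = sum(g * x / (1.0 + x) - math.log1p(x)
                 for g, x in zip(gammas, xis))
    z = math.log(q) + log_lr
    # SAP = 1 / (1 + q * exp(log_lr)), evaluated without overflow.
    if z > 0.0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))
```

Because the per-channel terms are simply summed, correlated channels are counted as if each contributed separate evidence, which tends to drive the result toward 0 or 1 more strongly than the data justify.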