This invention relates to a noise canceler to be installed in a telecommunications apparatus that encodes voice signals for transmission, such as a digital cordless telephone set or a digital wired telephone set. The noise canceler is suitable for use with a digital portable telephone system or a PCS (personal communication service) system.
A low bit rate voice coding scheme such as the code excited linear prediction (CELP) scheme is widely used for digital portable telephone sets. With such a coding scheme, voices spoken in an environment with a high background noise level can be clearly heard. The CELP scheme is discussed in detail in M. R. Schroeder and B. S. Atal: "Code-Excited Linear Prediction (CELP): High-Quality Speech at Very Low Bit Rates", in Proc. ICASSP, 1985, pp. 937-939.
However, spoken voices are remarkably blurred in environments with a high background noise level, such as buses and commuter trains. Efforts have been made to develop noise cancelers that eliminate noise and encode only the voice. Known papers discussing noise cancelers include "Suppression of Acoustic Noise in Speech Using Spectral Subtraction" (IEEE Trans., vol. ASSP-27, pp. 113-120, April 1979).
The technology discussed in this paper can be summarized as follows. An observed signal is first divided into frames of 256 samples, and an orthogonal transform operation such as a fast Fourier transform (FFT) is conducted on each of the frames to analyze the frequencies of the signal. Meanwhile, the magnitudes u(i) of the Fourier transform coefficients of the noise components are observed in advance, so that the transform coefficients s(i) of each frame of the observed signal can be suppressed by means of the formula below, where i is the coefficient index.

\[ S(i) = \max\bigl(0,\ \lVert s(i) \rVert - u(i)\bigr) \cdot \operatorname{sign}\bigl(s(i)\bigr) \]
Then, the suppressed transform coefficients S(i) are subjected to an inverse fast Fourier transform (IFFT) to recover the signal, which is subsequently sent to a voice coding section. In this way, the power corresponding to the noise is subtracted from the transform coefficients so that it is theoretically possible to eliminate the noise component from the observed signal and recover the voice.
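The procedure summarized above (transform a frame, subtract the noise magnitude from each coefficient, transform back) can be sketched as follows. This is a minimal illustration under stated assumptions and not the implementation of the cited paper: the function names dft, idft, suppress and cancel_noise are hypothetical, a naive discrete Fourier transform stands in for the FFT so that the sketch needs only the standard library, and sign(s(i)) of a complex coefficient is assumed to mean its unit phasor s(i)/|s(i)|, so that only the magnitude is reduced and the phase is kept.

```python
import cmath
import math

FRAME_LEN = 256  # samples per frame, as in the description above


def dft(x):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(coeffs):
    """Naive inverse DFT; returns the real part of each recovered sample."""
    n = len(coeffs)
    return [
        (sum(coeffs[k] * cmath.exp(2j * math.pi * k * t / n)
             for k in range(n)) / n).real
        for t in range(n)
    ]


def suppress(coeffs, noise_mag):
    """Apply S(i) = max(0, |s(i)| - u(i)) * sign(s(i)) to each coefficient."""
    out = []
    for s, u in zip(coeffs, noise_mag):
        mag = abs(s)
        reduced = max(0.0, mag - u)
        # Assumption: sign(s) of a complex coefficient is its unit phasor s/|s|.
        out.append(reduced * (s / mag) if mag > 0.0 else 0j)
    return out


def cancel_noise(frame, noise_mag):
    """Transform one frame, subtract the noise magnitudes, transform back."""
    return idft(suppress(dft(frame), noise_mag))
```

Here noise_mag plays the role of the magnitudes u(i) observed in advance; how those magnitudes are measured is outside the scope of this sketch. In a complete canceler, cancel_noise would be called once per frame and its output passed to the voice coding section.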
A frequency analysis using an orthogonal transform such as an FFT assumes that the signal is constant (stationary) within the transform span. However, the voice is not constant within a frame, nor is the noise when viewed as individual transform coefficients. Both voice and noise can fluctuate with time. Thus, with a known noise canceler that assumes constancy within the transform span, part of the noise component may remain and/or part of the voice frequencies may be lost or damaged in the noise eliminating operation. These problems then appear as a noise having a specific frequency, which can be more annoying than the original sound prior to the noise eliminating operation and thus defeats the purpose of noise suppression.
Generally, an FFT with 256 dimensions is used for a noise canceler. However, an FFT with 256 dimensions involves a large volume of arithmetic operations and hence is not practical for small telecommunications apparatus such as portable telephone sets. An FFT with as few as 128, 64 or 32 dimensions may therefore have to be used for noise cancelers in such apparatus. However, an FFT with reduced dimensions is accompanied by the drawback of a long frame length that is longer than the pitch cycles of voice. If a noise eliminating operation is conducted with such a long frame length, assuming constancy within the transform span as described above, the pitch cycles of voice can be distorted by the suppressed transform coefficients in a lower frequency band, so that unnatural-sounding speech is reproduced if the noise is suppressed effectively.