1. Field of the Invention
The present invention relates to a noise canceling method and a noise canceling unit, and more particularly to a noise canceling method and a noise canceling unit which use an adaptive filter to cancel background noises introduced into sound signals entered from a microphone or a handset.
2. Description of the Related Art
Background noise signals, introduced into sound signals entered from a microphone or a handset, create a serious problem in a highly-compressed narrow band audio coding unit or a speech recognition unit. As a noise canceling unit which cancels such acoustically-superimposed noise components, a two-input noise canceling unit using an adaptive filter is described in "Adaptive Noise Canceling: Principles and Applications" by B. Widrow et al., Proceedings of IEEE, Vol. 63, No. 12, 1975, pp. 1692-1716 (hereinafter called Reference 1).
This two-input noise canceling unit uses an adaptive filter which closely approximates the impulse response of the noise path, from the reference input terminal to the speech input terminal, through which noise signals entered from the reference input terminal travel. This adaptive filter generates pseudo noise signals corresponding to the noise signal components mixed into the speech input terminal and then subtracts the pseudo noise signals from the signals received from the speech input terminal (combination of speech signals and noise signals), thus suppressing noise signals.
In this configuration, the coefficients of the adaptive filter are updated on the basis of the correlation between the error signal, produced by subtracting the pseudo noise signal from the received signal (combination of speech signals and noise signals), and the reference signal entered from the reference input terminal. Known adaptive filter coefficient modification methods, or convergence algorithms, include the "LMS Algorithm" described in Reference 1 and the "Learning Identification Method (LIM)" described in "IEEE Transactions on Automatic Control", Vol. 12, Number 3, 1967, pp. 282-287 (hereinafter called Reference 2).
FIG. 3 is a block diagram showing an example of a conventional noise canceling unit. Speech is picked up and converted to an electric signal, for example, by a microphone placed near the speaker. This speech signal, received at a speech input terminal 1, includes background noise. On the other hand, the signal picked up by a microphone located away from the speaker and then converted to an electric signal corresponds to the background noise signal. This noise signal is received at a reference terminal 2.
The signal received at the speech input terminal 1 (hereinafter called the received signal) is composed of the speech signal and the background noise as described above. This signal is then supplied to a delay circuit 3. The delay circuit 3 adds a delay of $\Delta t_1$ (delay time) to the received signal, which is then sent to a subtracter 5. The delay circuit 3, inserted to satisfy the law of causality, normally has a delay amount of approximately half the number of taps of an adaptive filter 4. On the other hand, the noise signal, entered into the reference terminal 2, is supplied to the adaptive filter 4 as the reference noise signal. Upon receiving the reference noise signal, the adaptive filter 4 generates a pseudo noise signal through filtering and then supplies it to the subtracter 5.
The subtracter 5 subtracts the pseudo noise signal generated by the adaptive filter 4 from the received signal delayed by the delay circuit 3 to cancel the background noise signal included in the received signal. The subtracter 5 then outputs the resulting noise-canceled signal to an output terminal 6 and, at the same time, supplies it to the adaptive filter 4 as the error signal.
The adaptive filter 4 successively updates the filter coefficients based on three inputs: the reference noise signal supplied from the reference terminal 2, the error signal supplied from the subtracter 5, and the step size $\alpha$ set for coefficient updating. The "LMS algorithm" described in Reference 1 or the "LIM" described in Reference 2 is used as the filter coefficient update algorithm.
Let the speech signal component of the received signal sent from the speech input terminal 1 be s(k) (where k is an index representing time), let the noise signal component to be canceled be n(k), and let the delay amount $\Delta t_1$ of the delay circuit 3 be zero. Then, the received signal y(k) supplied from the speech input terminal 1 to the subtracter 5 is represented by the following expression:
$$y(k) = s(k) + n(k) \qquad (1)$$
The adaptive filter 4 receives the reference noise signal x(k) from the reference terminal 2 and generates the pseudo noise signal r(k) corresponding to the noise signal component n(k) used in expression (1). The subtracter 5 subtracts the pseudo noise signal r(k) from the received signal y(k) to output the error signal e(k). Assuming that, as compared with the speech signal component s(k), the additive noise component is small enough to be ignored, the error signal is represented by the following expression:
$$e(k) = s(k) + n(k) - r(k) \qquad (2)$$
The following describes how the coefficients of the adaptive filter 4 are updated using the "LMS algorithm" described in Reference 1. Let the j-th coefficient of the adaptive filter 4 at time k be $w_j(k)$. Then, the pseudo noise signal r(k) output by the adaptive filter 4 is represented by expression (3), where N is the number of taps of the adaptive filter 4.
$$r(k) = \sum_{j=0}^{N-1} w_j(k) \cdot x(k-j) \qquad (3)$$
Applying the pseudo noise signal r(k), calculated by expression (3), to expression (2) gives the error signal e(k). With the use of the obtained error signal e(k), the filter coefficient $w_j(k+1)$ at time (k+1) is calculated by the following expression:
$$w_j(k+1) = w_j(k) + \alpha \cdot e(k) \cdot x(k-j) \qquad (4)$$
In expression (4), $\alpha$, a constant called the step size, is a parameter determining the coefficient convergence time and the residual error amount after convergence.
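The LMS update described above can be illustrated by the following minimal sketch. It is not taken from Reference 1; it assumes the pseudo noise signal is the usual N-tap convolution of the coefficients with the most recent reference samples, and the names `lms_step`, `run_lms`, and the two-tap noise path in the usage lines are hypothetical.

```python
import random


def lms_step(w, x_hist, y, alpha):
    """One LMS iteration.

    w      : list of N coefficients w_j(k)
    x_hist : list of N reference samples, x_hist[j] = x(k-j)
    y      : received sample y(k) = s(k) + n(k)
    alpha  : fixed step size
    Returns the error e(k) and the updated coefficients w_j(k+1).
    """
    # Pseudo noise signal r(k): assumed N-tap convolution, expression (3).
    r = sum(wj * xj for wj, xj in zip(w, x_hist))
    # Error signal, expression (2).
    e = y - r
    # Coefficient update, expression (4).
    w_next = [wj + alpha * e * xj for wj, xj in zip(w, x_hist)]
    return e, w_next


def run_lms(x, y, n_taps, alpha):
    """Run the LMS update over whole sequences x (reference) and y (received)."""
    w = [0.0] * n_taps
    x_hist = [0.0] * n_taps
    errors = []
    for xk, yk in zip(x, y):
        x_hist = [xk] + x_hist[:-1]  # shift in the newest reference sample
        e, w = lms_step(w, x_hist, yk, alpha)
        errors.append(e)
    return w, errors


# Usage sketch: identify a hypothetical two-tap noise path with no speech present.
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
path = [0.5, -0.3]  # hypothetical noise path, for illustration only
y = [path[0] * x[k] + (path[1] * x[k - 1] if k > 0 else 0.0) for k in range(len(x))]
w_final, errors = run_lms(x, y, n_taps=2, alpha=0.05)
```

In this speech-free case the coefficients settle close to the assumed path and the error decays toward zero; adding a speech component to `y` would disturb the update, which is the problem discussed in the paragraphs that follow.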
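On the other hand, LIM, the filter coefficient update method described in Reference 2, updates the coefficients according to expression (5).
$$w_j(k+1) = w_j(k) + \frac{\mu \cdot e(k) \cdot x(k-j)}{\sum_{m=0}^{N-1} x^2(k-m)} \qquad (5)$$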
In expression (5), $\mu$ is the step size for LIM. LIM converges more reliably than the LMS algorithm by making the effective step size inversely proportional to the average power of the reference noise signal x(k) entered into the adaptive filter.
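A minimal sketch of a LIM-style update is given below. It assumes that expression (5) has the commonly used normalized form, in which the LMS correction is divided by the sum of squares of the N most recent reference samples; the function name `lim_step` and the small guard value `eps` are assumptions introduced for illustration and do not come from Reference 2.

```python
def lim_step(w, x_hist, y, mu, eps=1e-12):
    """One normalized (LIM-style) iteration.

    w      : list of N coefficients w_j(k)
    x_hist : list of N reference samples, x_hist[j] = x(k-j)
    y      : received sample y(k)
    mu     : LIM step size
    eps    : assumed guard against division by zero (not in the source)
    Returns the error e(k) and the updated coefficients w_j(k+1).
    """
    r = sum(wj * xj for wj, xj in zip(w, x_hist))  # pseudo noise signal
    e = y - r                                      # error signal
    power = sum(xj * xj for xj in x_hist) + eps    # reference power over N taps
    w_next = [wj + mu * e * xj / power for wj, xj in zip(w, x_hist)]
    return e, w_next


# The update is unchanged when reference and received signals are scaled together.
_, w_small = lim_step([0.0, 0.0], [2.0, 0.0], 1.0, mu=0.5)
_, w_large = lim_step([0.0, 0.0], [20.0, 0.0], 10.0, mu=0.5)
```

Because the correction is divided by the reference power, `w_small` and `w_large` agree, which illustrates why the normalized form is less sensitive to the level of the reference noise signal than the plain LMS update.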
When the step size value, that is, $\alpha$ for the LMS algorithm or $\mu$ for LIM, is large, the amount of coefficient modification becomes large and therefore the convergence becomes faster. However, the components interfering with coefficient updating, if present, then have a strong influence, increasing the residual error amount. Conversely, when the step size value is small, the convergence takes longer, but the influence of the interfering signal component and the residual error amount are both smaller. This means that there is a tradeoff between the "convergence time" and the "residual error" in setting up the step size.
The object of the adaptive filter 4 of the noise canceling unit is to generate the pseudo signal component r(k) corresponding to the noise signal n(k). Thus, the difference between n(k) and r(k), that is, the residual error (n(k)-r(k)), is required for use as the error signal for adaptive filter coefficient updating. However, as shown in expression (2), the error signal e(k) includes the speech signal component s(k) and this speech signal component s(k), which acts as the interfering signal component, has strong influence on the coefficient update operation of the adaptive filter 4.
To reduce the influence of the speech signal component s(k) which acts as the interfering signal to the adaptive filter 4, it is necessary to set an extremely small step size value for the coefficient updating of the adaptive filter 4 used in the noise canceling unit. However, the problem is that a small step size value delays the convergence of the adaptive filter 4 as described above.
To solve this problem, the "noise canceling method and noise canceling unit (Japanese Patent Laid-Open Publication No. Hei 10-3298)" is proposed. The method disclosed in the publication uses a second adaptive filter to estimate the signal-to-noise power ratio of the received signal and, based on the estimated ratio value, controls the step size of the first adaptive filter to speed up the convergence and to reduce the residual error.
FIG. 2 is a block diagram showing the conventional method described in Japanese Patent Laid-Open Publication No. Hei 10-3298. As shown in FIG. 2, the conventional method comprises a delay circuit 8, a delay circuit 9, a signal-to-noise power ratio estimation circuit 10, a delay circuit 17, a comparison circuit 18, and a step size output circuit 19 to control the step size of the adaptive filter 4.
The signal-to-noise power ratio estimation circuit 10 comprises a delay circuit 11 receiving the received signal y(k) from the speech input terminal 1, an adaptive filter 12 receiving the reference noise signal x(k) from the reference terminal 2, a subtracter 13 subtracting the pseudo noise signal $r_1(k)$ output by the adaptive filter 12 from the signal delayed by the delay circuit 11, power average circuits 14 and 15 averaging the powers of the signals output by the subtracter 13 and the adaptive filter 12, respectively, and a division circuit 16 dividing the signal output from the power average circuit 14 by the signal output from the power average circuit 15.
First, the operation of the signal-to-noise power ratio estimation circuit 10 is described. The adaptive filter 12 receives the reference noise signal x(k) from the reference terminal 2, receives the error signal output from the subtracter 13, and outputs the pseudo noise signal. The delay circuit 11, which delays the received signal y(k) by the delay amount $\Delta t_1$, is inserted to satisfy the law of causality, as with the delay circuit 3. The subtracter 13 subtracts the pseudo noise signal output by the adaptive filter 12 from the signal delayed by the delay circuit 11 and sends the subtraction result to the adaptive filter 12 as the error signal.
To increase the convergence speed, a relatively large value is assigned to the step size for updating the coefficients of the adaptive filter 12. For example, when LIM described in Reference 2 is used as the coefficient update algorithm, a value ranging from 0.2 to 0.5 is used as the step size $\mu$.
Now, let the received signal be y(k), let the reference noise signal entered into the adaptive filter 12 be x(k), let the pseudo noise signal output from the adaptive filter 12 be $r_1(k)$, and let the delay amount $\Delta t_1$ of the delay circuit 11 be zero as in the conventional method. Then, the error signal $e_1(k)$ output from the subtracter 13 is represented by the following expression:
$$e_1(k) = y(k) - r_1(k) \qquad (6)$$
Because the received signal y(k) is represented by the sum of the speech signal s(k) and the noise signal n(k) as in expression (1), expression (6) is rewritten as follows:
$$e_1(k) = s(k) + n(k) - r_1(k) \qquad (7)$$
The error signal $e_1(k)$ output from the subtracter 13 is supplied to the adaptive filter 12 as the error signal for coefficient updating and, at the same time, to the power average circuit 14. The power average circuit 14 squares the error signal $e_1(k)$ and outputs the time average.
The square $e_1^2(k)$ of the error signal $e_1(k)$ is given by expression (8):
$$e_1^2(k) = \{s(k) + n(k) - r_1(k)\}^2 \qquad (8)$$
The power average circuit 14 time-averages these squared values $e_1^2(k)$. Approximating this time average by an expected value, the expected value $E[e_1^2(k)]$ is represented by the expression given below. The cross term between s(k) and the residual $n(k) - r_1(k)$ drops out because the speech signal s(k) and the reference noise signal x(k) are independent of each other and, therefore, the speech signal s(k) and the noise signal n(k) are independent of each other:
$$E[e_1^2(k)] = E[s^2(k)] + E[\{n(k) - r_1(k)\}^2] \qquad (9)$$
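The step from expression (8) to expression (9) can be written out as follows. This is a sketch under an assumption not stated above, namely that s(k) has zero mean, so that independence of s(k) from the noise-derived signals makes the cross term vanish.

```latex
\begin{align*}
E[e_1^2(k)] &= E\big[\{s(k) + (n(k) - r_1(k))\}^2\big] \\
            &= E[s^2(k)] + 2\,E\big[s(k)\,\{n(k) - r_1(k)\}\big] + E\big[\{n(k) - r_1(k)\}^2\big] \\
            &= E[s^2(k)] + E\big[\{n(k) - r_1(k)\}^2\big]
\end{align*}
```

The middle term is zero only to the extent that $r_1(k)$, like n(k), is determined by the reference noise signal x(k) and is therefore independent of s(k).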
The second term of the right-hand side of expression (9) represents the residual error component. A larger step size, if used to speed up the convergence, rapidly attenuates this residual error component, resulting in the following expression:
$$E[e_1^2(k)] \approx E[s^2(k)] \qquad (10)$$
Therefore, as shown in expression (10), the output from the power average circuit 14 approximates the speech signal power $s^2(k)$.
On the other hand, the power average circuit 15 squares the pseudo noise signal $r_1(k)$ output from the adaptive filter 12 and time-averages the result. A larger step size value, when set in the adaptive filter 12, increases the convergence speed. Therefore, the following expression is obtained:
$$r_1(k) \approx n(k) \qquad (11)$$
Therefore, the expected value $E[r_1^2(k)]$ of the squared value $r_1^2(k)$ of the pseudo noise signal $r_1(k)$ may be approximated by expression (12):
$$E[r_1^2(k)] \approx E[n^2(k)] \qquad (12)$$
Therefore, the signal output from the power average circuit 15 approximates the noise signal power $n^2(k)$. The division circuit 16 divides the signal output from the power average circuit 14 by the signal output from the power average circuit 15 and, as a result, outputs the estimated value SNR1 of the signal-to-noise power ratio.
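The roles of the power average circuits 14 and 15 and the division circuit 16 may be sketched as follows. The sketch assumes moving averages over a fixed window; the names `MovingPower` and `estimate_snr`, the window length, and the `floor` guard against division by zero are hypothetical and are not taken from the cited publication.

```python
from collections import deque


class MovingPower:
    """Moving average of squared samples over the last `length` inputs."""

    def __init__(self, length):
        self.buf = deque(maxlen=length)  # oldest square is dropped automatically

    def update(self, sample):
        self.buf.append(sample * sample)
        return sum(self.buf) / len(self.buf)


def estimate_snr(e1, r1, length, floor=1e-12):
    """Return SNR1 for each time index.

    e1 : error signal samples from the subtracter (approximates speech)
    r1 : pseudo noise samples from the second adaptive filter
    floor : assumed lower bound on the noise power to avoid dividing by zero
    """
    p_err = MovingPower(length)    # corresponds to power average circuit 14
    p_noise = MovingPower(length)  # corresponds to power average circuit 15
    snr1 = []
    for ek, rk in zip(e1, r1):
        # Division circuit 16: error power over pseudo noise power.
        snr1.append(p_err.update(ek) / max(p_noise.update(rk), floor))
    return snr1


# Usage sketch: speech-like energy appears in the error signal at the third sample.
snr_demo = estimate_snr([0.0, 0.0, 2.0, 2.0], [1.0, 1.0, 1.0, 1.0], length=2)
```

With a window of two samples the estimate reaches its full value one sample after the energy appears, which is the kind of averaging delay that the delay circuits described next are intended to compensate.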
If the operation of the power average circuits 14 and 15 is performed, for example, by calculating the moving averages, a delay $\Delta_{AV}$ with respect to the actual power variations is generated. The delay depends on the number of times the average is calculated. Thus, to compensate for this delay of $\Delta_{AV}$ in this embodiment, the delay circuit 9 giving the delay of $\Delta t_2$ to the input reference noise signal of the adaptive filter 4 is provided on the input side of the adaptive filter 4 and, at the same time, the delay circuit 8 giving the delay of $\Delta t_2$ to the received signal is provided on the input side of the delay circuit 3. Note that the delay of $\Delta t_2$ is normally set to a value equal to or larger than the delay amount of $\Delta_{AV}$. The value of $\Delta t_2$, if set to a value larger than $\Delta_{AV}$, would cause a change in SNR1 to be detected earlier than the corresponding change in the SNR of the actual received signal input to the subtracter 5. This means an extension of SNR1 in the negative direction in terms of time. The delay circuit 8 and the delay circuit 3 may be configured as a single delay circuit giving the delay of ($\Delta t_2 + \Delta t_1$).
As described above, the signal-to-noise power ratio estimation circuit 10 receives the received signal from the speech input terminal 1 as well as the reference noise signal from the reference terminal 2 and operates the adaptive filter 12, which outputs the pseudo noise signal. The signal-to-noise power ratio estimation circuit 10 obtains the error signal power and the pseudo noise signal power and, based on these powers, outputs the estimated value SNR1 of the signal-to-noise power ratio.
Next, the operation of the delay circuits 8, 9, and 17 and that of the comparison circuit 18 are described. The delay circuit 17 gives the delay of $\Delta t_3$ to the estimated signal-to-noise power ratio value SNR1 output from the signal-to-noise power ratio estimation circuit 10. The comparison circuit 18 compares the estimated signal-to-noise power ratio value SNR1 entered into the delay circuit 17 with the estimated signal-to-noise power ratio value SNR2 delayed by the delay circuit 17 and outputs the larger of the two as the estimated value SNR3. The estimated signal-to-noise power ratio SNR3 is a value extended in the positive direction by $\Delta t_3$ in terms of time.
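The combined effect of the delay circuit 17 and the comparison circuit 18 can be sketched as follows. The function name `extend_snr` is hypothetical, the delay is expressed as a whole number of samples, and the delayed value is assumed to be zero before the first sample arrives.

```python
def extend_snr(snr1, delay):
    """Return SNR3(k) = max(SNR1(k), SNR2(k)), where SNR2(k) = SNR1(k - delay).

    snr1  : list of estimated SNR1 values
    delay : delay of the delay circuit, in samples (assumed integer)
    """
    snr3 = []
    for k, current in enumerate(snr1):
        # Output of the delay circuit; assumed 0.0 before any input exists.
        delayed = snr1[k - delay] if k >= delay else 0.0
        # Comparison circuit: pass on the larger of the two values.
        snr3.append(max(current, delayed))
    return snr3


# Usage sketch: a burst of high SNR lasting three samples, extended by two samples.
snr3_demo = extend_snr([0.0, 0.0, 5.0, 5.0, 5.0, 0.0, 0.0, 0.0, 0.0], delay=2)
```

In this example the burst of high SNR is held for two additional samples after it ends in SNR1, which corresponds to the extension in the positive time direction described above.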
Next, the operation of the step size output circuit 19 is described. The step size output circuit 19 receives the estimated value SNR3 of the extended signal-to-noise power ratio output from the comparison circuit 18 and outputs a value corresponding to the received value SNR3 to the adaptive filter 4 as its step size. At this time, when the SNR3 value is large, the step size output circuit 19 outputs a small step size; conversely, when the SNR3 value is small, the step size output circuit 19 outputs a large step size. More specifically, let the SNR3 value at time k be SNR3(k) and let the step size at time k be $\mu(k)$. Then, the relation between SNR3(k) and $\mu(k)$ is represented, for example, by expression (13) as follows:
$$\mu(k) = \mathrm{clip}[A \cdot 1/\mathrm{SNR3}(k),\ \mu_{max},\ \mu_{min}] \qquad (13)$$
where A is a constant ranging in value from approximately 0.1 to 0.5, and clip[a, b, c] is a function, defined as follows, that limits a to the maximum b and the minimum c:
$$\mathrm{clip}[a, b, c] = a \quad (c \le a \le b) \qquad (14a)$$
$$\mathrm{clip}[a, b, c] = b \quad (a > b) \qquad (14b)$$
$$\mathrm{clip}[a, b, c] = c \quad (a < c) \qquad (14c)$$
Suppose that A=0.1, $\mu_{max}$=0.5, and $\mu_{min}$=0.01. Then, expression (13) is represented as expression (15) as follows:
$$\mu(k) = \mathrm{clip}[0.1/\mathrm{SNR3}(k),\ 0.5,\ 0.01] \qquad (15)$$
Thus, when the SNR3 value is 0 dB, that is, when SNR3(k)=1, the step size is 0.1 from expression (14a). When the SNR3 value is 10 dB, that is, when SNR3(k)=10, the step size is 0.01 from expression (14a). However, when the SNR3 value is -10 dB, that is, when SNR3(k)=0.1, the step size is limited by the maximum and is set to 0.5 from expression (14b). Similarly, when the SNR3 value is 20 dB, that is, when SNR3(k)=100, the step size is limited by the minimum and is set to 0.01 from expression (14c). Limiting the range of the step size in this way is effective for the reliable operation of the adaptive filter.
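The step size rule of expressions (13) to (15) can be checked numerically with the following minimal sketch. The function names `clip`, `step_size`, and `db_to_ratio` are hypothetical, the default parameters are the example values A=0.1, maximum 0.5, and minimum 0.01 used above, and SNR3 is assumed to be strictly positive.

```python
def clip(a, b, c):
    """clip[a, b, c] of expressions (14a) to (14c): b is the maximum, c the minimum."""
    if a > b:
        return b  # (14b)
    if a < c:
        return c  # (14c)
    return a      # (14a)


def step_size(snr3, a_const=0.1, mu_max=0.5, mu_min=0.01):
    """Step size of expression (13); snr3 is a power ratio and must be > 0."""
    return clip(a_const / snr3, mu_max, mu_min)


def db_to_ratio(db):
    """Convert a power ratio in dB to a linear ratio."""
    return 10.0 ** (db / 10.0)


# Step sizes for the four SNR3 values discussed in the text.
mu_table = {db: step_size(db_to_ratio(db)) for db in (-10, 0, 10, 20)}
```

Evaluating `mu_table` gives step sizes of approximately 0.5, 0.1, 0.01, and 0.01 for -10 dB, 0 dB, 10 dB, and 20 dB, respectively, in agreement with the values worked out above.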
As described above, the step size output circuit 19 controls the step size of the adaptive filter 4 according to the estimated signal-to-noise power ratio value SNR3. This estimated signal-to-noise power ratio value SNR3 is obtained by extending the estimated value SNR1, output from the signal-to-noise power ratio estimation circuit 10, through the delay circuit 17 and comparison circuit 18.
The conventional method described above controls the step size of the adaptive filter 4 with the use of the estimated SNR3 value. This configuration increases the step size in a range where no speech signal is present or where the speech signal, if present, is extremely small as compared with the noise signal, thus speeding up the convergence with no influence of the interfering signal. On the other hand, in a range where the speech signal component is large as compared with the noise signal, this configuration decreases the step size to prevent the residual error from increasing. At the same time, the SNR3 value used for step size control may be extended in the negative direction by the delay circuit 8 and the delay circuit 9, and in the positive direction by the delay circuit 17, in terms of time. This capability makes it possible to decrease the step size sufficiently before the speech signal starts and to increase the step size after the speech signal ends, allowing the coefficients of the adaptive filter 4 to converge reliably.
In the noise canceling unit described above, the step size of the adaptive filter 12 is fixed. When "LIM" described in Reference 2 is used as the coefficient update algorithm, the step size $\mu$ is set to a fixed value ranging, for example, from 0.2 to 0.5. To speed up the convergence of the adaptive filter 12, the step size $\mu$ should be set as large as possible. However, a setting that is too large results in a large residual error, which decreases the precision of the estimated SNR and, as a result, increases the distortion in the output of the noise canceling unit. Because of this, a large setting cannot always be used. When the assumed SNR is confined to a limited range, a large step size may be used to the extent that the SNR precision is not decreased. However, when the assumed SNR range is wide, the SNR value cannot be estimated precisely with a single fixed step size value.