1. Field of the Invention
The present invention relates to echo cancelers and, more particularly, to a multi-channel echo canceler for canceling, from a transmitted signal, multi-channel echo that is generated as a result of the propagation of a plurality of received signals through a spatial acoustic path.
2. Detailed Description of the Invention
In conversation systems involving a plurality of received signals and one or more transmitted signals, several methods and apparatuses have been proposed for multi-channel echo canceling, i.e., canceling of echo which is generated as a result of the propagation of received signals through a spatial acoustic path. A cascade connection type and a linear combination type are proposed in the Technical Report of Institute of Electronics, Information and Communication Engineers of Japan, Vol. 84, No. 330, pp. 7-4, CS-84-178 (hereinafter referred to as Literature No. 1), and a multi-channel echo canceler with a single adaptive filter per channel is proposed in Proceedings of the 1991 Institute of Electronics, Information and Communication Engineers, Spring Conference, Vol. 1, pp. 202, A-202 (hereinafter referred to as Literature No. 2). However, in the Proceedings of the 6-th Digital Signal Processing Symposium, pp. 144-149, A5-3 (hereinafter referred to as Literature No. 3), it is pointed out that the cascade connection type and the linear combination type lead to large hardware size because the hardware size is proportional to the square of the number of channels, that the convergence of the adaptive filter is retarded when there is strong cross-correlation among the received signals, and that the adaptive filter coefficients may fail to converge to the optimum value. Further, in Proceedings of the 1992 Institute of Electronics, Information and Communication Engineers, Spring Conference, Vol. 1, pp. 158, A-158 (hereinafter referred to as Literature No. 4), it is pointed out that in a multi-channel echo canceler with a single adaptive filter per channel, it takes a long time from the instant of movement or change of the talker until the filter coefficients re-converge to the optimum value, and that the echo cancellation performance deteriorates during this time.
To solve this problem, Literature No. 4 proposes a compact multi-channel echo canceler that can quickly track the movement or change of the talker. The compact multi-channel echo canceler proposed in Literature No. 4 will now be described in connection with its application to a television conference system in which both the received and transmitted signals are of two channels.
FIG. 16 is a block diagram showing an audio part of a conventional 2-channel television conference system connecting two television conference rooms 30 and 31. Here, acoustic echo cancellation in the first television conference room 30 will be considered.
It is assumed that a second and a third talker 18 and 19 are present in the second television conference room 31.
Speeches 20 and 22 from the respective second and third talkers 18 and 19 propagate through the spatial acoustic path to a third microphone 24 and are supplied to a second echo canceler unit 130₂. The speeches picked up by the third microphone 24 are transmitted as a first received signal 1 to the first television conference room 30. Likewise, speeches 21 and 23 from the respective second and third talkers 18 and 19 propagate through the spatial acoustic path to a fourth microphone 25 and are supplied to the second echo canceler unit 130₂. The speeches picked up by the fourth microphone 25 are transmitted as a second received signal 2 to the first television conference room 30.
In the first television conference room 30, three components are added together to form a first mixed signal 14: a first echo 5, which is generated as the first received signal 1 is reproduced by a first loudspeaker 3 and propagates through the spatial acoustic path to a first microphone 9; a second echo 6, which is generated as the second received signal 2 is reproduced by a second loudspeaker 4 and propagates through the spatial acoustic path to the first microphone 9; and a first transmitted signal 12, which is the speech of a first talker 11 reaching the first microphone 9. Likewise, three components are added together to form a second mixed signal 15: a third echo 7, which is generated as the first received signal 1 is reproduced by the first loudspeaker 3 and propagates through the spatial acoustic path to a second microphone 10; a fourth echo 8, which is generated as the second received signal 2 is reproduced by the second loudspeaker 4 and propagates through the spatial acoustic path to the second microphone 10; and a second transmitted signal 13, which is the speech of the first talker 11 reaching the second microphone 10. A first echo canceler unit 130₁ is used to cancel the echoes 5 to 8 contained in the first and second mixed signals 14 and 15.
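The formation of the two mixed signals can be summarized as a signal model. The following is only an illustrative sketch: it assumes each spatial acoustic path behaves as a linear, time-invariant system, and the symbols $y_i$, $h_{ji}$ and $s_i$ are introduced here for illustration and do not appear in the literature cited above. With $x_1(n)$ and $x_2(n)$ the first and second received signals 1 and 2, $h_{ji}$ the assumed impulse response from loudspeaker $j$ to microphone $i$, $s_1(n)$ and $s_2(n)$ the transmitted signals 12 and 13, and $*$ denoting convolution:

$$
\begin{aligned}
y_1(n) &= (h_{11} * x_1)(n) + (h_{21} * x_2)(n) + s_1(n) \\
y_2(n) &= (h_{12} * x_1)(n) + (h_{22} * x_2)(n) + s_2(n)
\end{aligned}
$$

Here $y_1(n)$ and $y_2(n)$ correspond to the mixed signals 14 and 15, and the four convolution terms correspond, in order, to the echoes 5, 6, 7 and 8.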
A delay time difference estimation circuit 101 receives the first and second received signals 1 and 2 as input signals and estimates the delay time difference between the two received signals, the result of estimation being supplied to a received signal selection circuit 102 and a filter coefficient set selection circuit 104. The received signal selection circuit 102 detects which of the two received signals 1 and 2 has the shorter delay time according to the result of estimation in the delay time difference estimation circuit 101, the result of detection being supplied to a selector 103. The selector 103 receives the first and second received signals 1 and 2 as input signals and, according to the result of detection in the received signal selection circuit 102, supplies the received signal having the shorter delay time to a first and a second adaptive filter 122 and 123. The filter coefficient set selection circuit 104 selects, according to the result of estimation in the delay time difference estimation circuit 101, a set of filter coefficients from among a plurality of preliminarily prepared sets of filter coefficients used in the first and second adaptive filters 122 and 123, the result of selection being supplied to the first and second adaptive filters 122 and 123.
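The selection logic of circuits 102, 103 and 104 can be sketched as follows. This is a minimal illustration, not the implementation of Literature No. 4. The function names, the sign convention (a positive delay difference means received signal 2 lags received signal 1), the dictionary keyed by delay difference, and the nearest-key lookup are all assumptions made for the example, and a single bank of coefficient sets stands in for the sets of both adaptive filters.

```python
def select_received_signal(delay_difference):
    """Circuit 102 (sketch): return 1 or 2, the received signal with the shorter delay.

    Assumed convention: delay_difference >= 0 means signal 2 lags signal 1.
    """
    return 1 if delay_difference >= 0 else 2


def select_coefficient_set(prepared_sets, delay_difference):
    """Circuit 104 (sketch): pick the prepared set whose key is nearest to the estimate.

    prepared_sets is a hypothetical dict {delay_difference: coefficient list}.
    """
    nearest = min(prepared_sets, key=lambda d: abs(d - delay_difference))
    return nearest, prepared_sets[nearest]


def route(x1_sample, x2_sample, delay_difference, prepared_sets):
    """Selector 103 plus coefficient selection for one sample instant."""
    chosen = select_received_signal(delay_difference)
    reference = x1_sample if chosen == 1 else x2_sample
    key, coefficients = select_coefficient_set(prepared_sets, delay_difference)
    return chosen, reference, key, coefficients


# Hypothetical prepared sets for three delay differences (two taps each).
prepared_sets = {-4: [0.1, 0.0], 0: [0.2, 0.1], 4: [0.3, 0.2]}
print(route(0.5, -0.2, 3, prepared_sets))
```

With an estimated delay difference of 3, the sketch forwards received signal 1 to both adaptive filters and selects the set prepared for a delay difference of 4, the nearest key in the hypothetical bank.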
The first adaptive filter 122 receives the received signal selected by the selector 103 as an input signal and generates an echo replica corresponding to the echo contained in the first mixed signal 14 by using the filter coefficients selected by the filter coefficient set selection circuit 104, the generated echo replica being supplied to a first subtracter 107. The first subtracter 107 subtracts the echo replica output by the first adaptive filter 122 from the first mixed signal 14 to produce a first output signal 16. The first adaptive filter 122 is controlled so as to minimize the first output signal 16.
The second adaptive filter 123 receives the received signal selected by the selector 103 as an input signal and generates an echo replica corresponding to the echo contained in the second mixed signal 15 by using the filter coefficients selected by the filter coefficient set selection circuit 104, the generated echo replica being supplied to a second subtracter 108. The second subtracter 108 subtracts the echo replica output by the second adaptive filter 123 from the second mixed signal 15 to produce a second output signal 17. The second adaptive filter 123 is controlled so as to minimize the second output signal 17.
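The pairing of one adaptive filter with one subtracter can be illustrated with a short sketch. The text above does not name the adaptation algorithm, so the normalized LMS (NLMS) update used here is an assumption chosen only for illustration. The function name, the step size, the three-tap echo path and the absence of a near-end talker are likewise hypothetical.

```python
import random


def nlms_echo_canceler(reference, mixed, num_taps, step=0.5, eps=1e-8):
    """Adaptive filter plus subtracter for one channel (NLMS is an assumed choice).

    reference: the received signal chosen by the selector
    mixed:     the microphone (mixed) signal containing the echo
    Returns (output, coefficients); output[n] is the subtracter output.
    """
    w = [0.0] * num_taps
    history = [0.0] * num_taps           # recent reference samples, newest first
    output = []
    for x, d in zip(reference, mixed):
        history = [x] + history[:-1]
        replica = sum(wi * hi for wi, hi in zip(w, history))   # echo replica
        e = d - replica                                        # subtracter output
        norm = sum(h * h for h in history) + eps
        # Update the coefficients in the direction that reduces the output power.
        w = [wi + step * e * hi / norm for wi, hi in zip(w, history)]
        output.append(e)
    return output, w


# Synthetic check: white-noise reference through a hypothetical 3-tap echo path.
rng = random.Random(1)
path = [0.6, -0.3, 0.1]
reference = [rng.gauss(0.0, 1.0) for _ in range(2000)]
mixed = [sum(path[k] * reference[n - k] for k in range(len(path)) if n - k >= 0)
         for n in range(len(reference))]
output, w = nlms_echo_canceler(reference, mixed, num_taps=len(path))
print([round(c, 3) for c in w])
```

In this noise-free setting the coefficients approach the hypothetical echo path and the subtracter output decays toward zero. In the system described above, the coefficients would additionally be swapped whenever the filter coefficient set selection circuit 104 selects a different set.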
The delay time difference estimation circuit 101 estimates the delay time difference between the first and second received signals 1 and 2 by using a cross-correlation function between the first and second received signals 1 and 2. Denoting the first and second received signals 1 and 2 at instant n by $x_1(n)$ and $x_2(n)$, respectively, the cross-correlation function $R_{12}(n, m)$ at the instant n corresponding to the delay time difference m is defined as:
$$R_{12}(n, m) = E[x_1(n)\,x_2(n+m)] \tag{1}$$
where $E[\cdot]$ denotes the ensemble average. It is difficult, however, to calculate the ensemble average as defined. Usually, therefore, it is approximated by a time average. For example, using first-order recursive integration, it is calculated as:
$$R_{12}(n, m) = (1-\alpha)\,x_1(n)\,x_2(n+m) + \alpha\,R_{12}(n-1, m) \tag{2}$$
where $\alpha$ is a constant satisfying
$$0 < \alpha < 1 \tag{3}$$
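The recursion of equation (2) can be sketched in a few lines. This is an illustrative example under stated assumptions rather than the circuit of Literature No. 4: the function names, the zero initial value of the correlation estimate, the finite set of candidate lags, the use of a buffered block so that the sample at n+m is available, and the rule of taking the lag with the largest estimate as the delay time difference are all choices made for the example.

```python
import random


def update_cross_correlation(r, x1, x2, n, lags, alpha):
    """One step of Eq. (2): blend the product x1(n)*x2(n+m) into R12(n-1, m)."""
    return [(1.0 - alpha) * x1[n] * x2[n + m] + alpha * r_m
            for m, r_m in zip(lags, r)]


def estimate_delay_difference(x1, x2, max_lag, alpha):
    """Run the recursion over a buffered block; return (best_lag, R12 per lag)."""
    lags = list(range(-max_lag, max_lag + 1))
    r = [0.0] * len(lags)                 # R12 initialised to zero (assumption)
    # Restrict n so that n + m stays inside the buffer for every candidate lag.
    for n in range(max_lag, len(x1) - max_lag):
        r = update_cross_correlation(r, x1, x2, n, lags, alpha)
    best = max(range(len(lags)), key=lambda i: r[i])
    return lags[best], r


# Synthetic check: channel 2 is channel 1 delayed by a known number of samples.
rng = random.Random(0)
x1 = [rng.gauss(0.0, 1.0) for _ in range(4000)]
delay = 3
x2 = [0.0] * delay + x1[:-delay]
lag, r12 = estimate_delay_difference(x1, x2, max_lag=8, alpha=0.995)
print(lag)
```

Because the second channel is a delayed copy of the first, the estimate is largest at the lag equal to the inserted delay. A positive lag in this convention indicates that the first received signal has the shorter delay.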
By increasing $\alpha$, the integration period is increased to increase the accuracy of the delay time difference estimation. However, the tracking speed to the movement or change of the talker is reduced. By reducing $\alpha$, on the other hand, the integration period is reduced to increase the tracking speed to the movement or change of the talker. In this case, the accuracy of the delay time difference estimation is reduced.
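The link between the constant and the integration period can be made explicit by unrolling the first-order recursion. The following is a short worked derivation offered as an illustration; the initial value $R_{12}(-1, m)$ and the notion of an effective integration length are introduced here and are not taken from the cited literature.

$$
R_{12}(n, m) = (1-\alpha) \sum_{k=0}^{n} \alpha^{k}\, x_1(n-k)\, x_2(n-k+m) + \alpha^{n+1} R_{12}(-1, m)
$$

Each past product is weighted by $(1-\alpha)\alpha^{k}$, so the weights decay geometrically with age $k$ and fall to $1/e$ of their initial value after about $-1/\ln\alpha \approx 1/(1-\alpha)$ samples when $\alpha$ is close to 1. For example, $\alpha = 0.99$ averages over roughly 100 samples, whereas $\alpha = 0.9$ averages over roughly 10 samples, which is why a larger constant gives a smoother but slower estimate.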
In other words, increasing $\alpha$ to improve the accuracy of the delay time difference estimation delays the detection of the movement or change of the talker. During the period from the movement or change of the talker until its actual detection, an erroneous set of filter coefficients remains selected, which increases the residual echo and therefore the amount of filter coefficient update. Such erroneous filter coefficient updating produces a filter coefficient set having a large coefficient error. If this filter coefficient set is selected again after the movement or change of the talker has been recognized, the performance of echo cancellation deteriorates.
On the other hand, reducing $\alpha$ to increase the tracking speed to the movement or change of the talker reduces the accuracy of the delay time difference estimation. In this case, the estimated delay time difference changes frequently and brings about frequent filter coefficient switching, thus deteriorating the performance of echo cancellation.
As described above, the prior art method and apparatus for multi-channel echo cancellation pose the following problems. Increasing the accuracy of the delay time difference estimation delays the detection of the movement or change of the talker, which increases the filter coefficient error in the adaptive filters. Conversely, increasing the tracking speed to the movement or change of the talker reduces the accuracy of the delay time difference estimation, which brings about frequent filter coefficient switching. In either case, the performance of echo cancellation deteriorates.