The existence of echo is a frequent problem in audio systems. One example of an audio subsystem 10 in which echo arises is shown in FIG. 1. Subsystem 10 might be included, e.g., at one end of a duplex audio (e.g., communication) system. In it, audio signals are both input and output simultaneously. Specifically, a received signal 12, designated as Rx in FIG. 1 (which typically will have been subject to some prior processing, not shown in FIG. 1), is output through a speaker 14. Simultaneously, a microphone 16 inputs a signal 18. A digitized version of that signal, designated as x(n) and also referred to as digital input signal 19, ultimately is, e.g., transmitted to a recipient, recorded, or used in some other manner.
Unfortunately, it frequently is the case that some portion of the audio signal 12 that is played through speaker 14 reaches microphone 16, typically with some modifications, which are represented in FIG. 1 by discrete-time finite impulse response f(n). Contributions to impulse response f(n) might come, e.g., from characteristics of the speaker 14, sound-reflective and/or sound-absorptive surfaces within the same space as speaker 14 and microphone 16, and/or characteristics of the air between speaker 14 and microphone 16.
In order to address this issue, the signal x(n) 19 conventionally is processed by a digital echo canceler 20, which attempts to remove the echo noise. For this purpose, in the current disclosure: r(n) is used to denote the echo reference signal 22 (which typically is a digitized version of the received signal 12 that is provided to the speaker 14), x(n) 19 (as noted above) is a digitized version of the signal 18 received by microphone 16, and y(n) is the echo cancellation (EC) digital output signal 24. Conventionally, all three of such signals are at the same sampling rate R, and the relationship between x(n) and r(n) is:
$$x(n) = r(n) * f(n) + d(n)$$
where * denotes the convolution operation and d(n) is a digitized version of the near-end target signal (i.e., a digitized version of the microphone input signal 18 that would be present in the absence of echo noise). Ideally, echo canceler 20 outputs y(n)=d(n). For this purpose, an estimate of the impulse response f(n), i.e., $\hat{f}(n)$, n=0, . . . , L−1 (where L is the chosen echo reference length), typically is generated. In conventional EC algorithms, Least-Mean-Square (LMS) or Normalized-Least-Mean-Square (NLMS) algorithms are used to continuously update the impulse response estimate, $\hat{f}(n)$, at each of the time samples at the original sampling rate R. Then, in certain conventional subsystems 10, the echo canceler 20 is implemented such that:
$$y(n) = x(n) - r(n) * \hat{f}(n) = x(n) - \sum_{\tau=0}^{L-1} \hat{f}(\tau)\, r(n-\tau) \qquad \text{(Eq. 1)}$$
Such systems can be considered to employ a full-band EC algorithm.
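As an illustration only, the following minimal Python sketch simulates the signal model x(n)=r(n)*f(n)+d(n) and applies Equation 1 with an NLMS update at every sample. The echo path `f_true`, the step size `mu`, the regularization constant `eps`, the signal length, and all function names are hypothetical choices made for this example and are not taken from the disclosure. The near-end signal d(n) is set to zero, so the ideal output is y(n)=0.

```python
import random


def convolve_causal(r, f):
    """Return (r * f)(n) for n = 0..len(r)-1, treating r(n) as 0 for n < 0."""
    out = []
    for n in range(len(r)):
        acc = 0.0
        for tau, coeff in enumerate(f):
            if n - tau >= 0:
                acc += coeff * r[n - tau]
        out.append(acc)
    return out


def nlms_echo_cancel(x, r, L, mu=0.5, eps=1e-8):
    """Full-band echo canceler sketch.

    Output follows Eq. 1: y(n) = x(n) - sum_{tau=0}^{L-1} f_hat(tau) r(n - tau).
    The estimate f_hat is updated at every sample with an NLMS step
    (mu and eps are assumed values, not from the source text).
    """
    f_hat = [0.0] * L
    buf = [0.0] * L  # buf[tau] holds r(n - tau)
    y = []
    for n in range(len(x)):
        buf = [r[n]] + buf[:-1]
        echo_estimate = sum(f_hat[t] * buf[t] for t in range(L))
        e = x[n] - echo_estimate  # this is y(n) in Eq. 1
        y.append(e)
        gain = mu * e / (sum(b * b for b in buf) + eps)
        for t in range(L):
            f_hat[t] += gain * buf[t]
    return y, f_hat


random.seed(0)
N = 4000
f_true = [0.6, -0.3, 0.1, 0.05]             # hypothetical echo path f(n)
r = [random.gauss(0.0, 1.0) for _ in range(N)]  # echo reference r(n)
d = [0.0] * N                                # near-end signal d(n), silent here
echo = convolve_causal(r, f_true)
x = [echo[n] + d[n] for n in range(N)]       # x(n) = r(n) * f(n) + d(n)

y, f_hat = nlms_echo_cancel(x, r, L=len(f_true))
# Mean power of the last 500 output samples; small when the echo is removed.
residual_power = sum(v * v for v in y[-500:]) / 500
```

In this noise-free, single-talk setting the estimate `f_hat` should approach `f_true` and `residual_power` should become very small. With a nonzero d(n), the NLMS update would be disturbed by the near-end signal, which practical systems typically handle with additional control logic not shown here.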
Alternatively, as shown in FIG. 2, a conventional sub-band EC system 20 decomposes 30 the full-band input signals into M equally divided sub-bands. Such sub-band input signals can be denoted as $x_m(n)$ and $r_m(n)$ for m=1, . . . , M. Conventionally, these band-passed sub-band signals have the same sampling rate R as the original input signals. Those sub-band signals are then down-sampled 32 by a factor of D, mainly for the purpose of reducing the data rate and thereby reducing computational complexity.
The down-sampled signals, which can be denoted as $x_m^D(n)$ and $r_m^D(n)$ for m=1, . . . , M, respectively, now at the sampling rate R/D, are then fed into the corresponding sub-band's echo cancellation module 34m, labeled EC-m in FIG. 2 and sometimes referred to as such in this disclosure. Each such echo cancellation module 34m also processes at the sampling rate R/D and, hence, uses far fewer computational resources than if it were running at the original sampling rate R. Otherwise, the echo cancellation modules 34m also implement Equation 1 above. The output, $y_m^D$, of each echo cancellation module 34m is then up-sampled 36 by a factor of D. Finally, all such up-sampled sub-band output signals $y_m$ are resynthesized 40 into a full-band output signal 42 (i.e., y(n)).
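The signal flow just described (decompose 30, down-sample 32, EC-m 34m, up-sample 36, resynthesize 40) can be sketched structurally as follows. This is a minimal offline illustration under strong assumptions that are not taken from the disclosure: it uses a two-band Haar filter bank with M=D=2, keeps the odd-indexed samples when down-sampling, and uses a placeholder `ec_module` that passes the microphone sub-band through unchanged. All function names are hypothetical. A practical sub-band echo canceler would use longer filters with better band separation.

```python
import math

S = 1.0 / math.sqrt(2.0)
ANALYSIS = [[S, S], [S, -S]]    # band 1 (low-pass), band 2 (high-pass)
SYNTHESIS = [[S, S], [-S, S]]   # matching synthesis filters for the Haar pair


def fir(signal, taps):
    """Causal FIR filter at the input rate (samples before n = 0 are zero)."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, t in enumerate(taps):
            if n - k >= 0:
                acc += t * signal[n - k]
        out.append(acc)
    return out


def decompose(x):
    """Step 30: split x(n) into M = 2 sub-band signals, still at rate R."""
    return [fir(x, h) for h in ANALYSIS]


def down_sample(band, D=2):
    """Step 32: keep every D-th sample (here the odd-indexed ones), rate R/D."""
    return band[D - 1::D]


def up_sample(band_d, D=2):
    """Step 36: insert D - 1 zeros after each sample, back to rate R."""
    out = [0.0] * (len(band_d) * D)
    out[::D] = band_d
    return out


def resynthesize(bands):
    """Step 40: filter each up-sampled band and sum into a full-band signal."""
    filtered = [fir(b, g) for b, g in zip(bands, SYNTHESIS)]
    return [sum(vals) for vals in zip(*filtered)]


def sub_band_process(x, r, ec_module=lambda m, x_d, r_d: x_d):
    """Run x(n) and r(n) through the FIG. 2 style pipeline.

    ec_module(m, x_d, r_d) stands in for EC-m; the default is a placeholder
    that performs no cancellation and simply returns the microphone sub-band.
    """
    x_d = [down_sample(b) for b in decompose(x)]
    r_d = [down_sample(b) for b in decompose(r)]
    y_d = [ec_module(m, xb, rb)
           for m, (xb, rb) in enumerate(zip(x_d, r_d), start=1)]
    return resynthesize([up_sample(b) for b in y_d])


# Usage: with the pass-through placeholder, the pipeline reconstructs x(n).
x = [math.sin(0.3 * n) + 0.1 * n for n in range(16)]
r = [0.0] * 16
y = sub_band_process(x, r)
```

Each down-sampled band holds half as many samples as the input, which is where the per-module saving comes from. Replacing the placeholder with a per-band implementation of Equation 1 would give one EC-m per sub-band.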
In certain conventional sub-band implementations, to further save on computational resources, the down-sampling operations 32 are combined into the decomposition module 30, and the up-sampling operations 36 are combined into the resynthesis module 40. However, for either such implementation, it has been widely reported that increased down-sampling, while resulting in less computational complexity, also diminishes echo-reduction performance.
Conventional sub-band echo cancellation systems typically have faster convergence and better steady-state echo suppression performance than full-band systems. However, such improvements over traditional full-band echo cancellation are provided at the cost of a significant increase in computational (or system) complexity.