In a conventional conferencing system, one or more microphones capture a sound wave at a far end site and transform the sound wave into a first audio signal. The first audio signal is transmitted to a near end site, where a television set or an amplifier and loudspeaker reproduce the original sound wave by converting the first audio signal back into sound. The reproduced sound wave is partially captured by the audio capturing system at the near end site, converted to a second audio signal, and transmitted back to the system at the far end site. This problem of having a sound wave captured at one site, transmitted to another site, and then transmitted back to the initial site is referred to as acoustic echo. In its most severe manifestation, when the loop gain exceeds unity, the acoustic echo can cause audible feedback. The acoustic echo also causes the participants at both sites to hear themselves, making a conversation over the conferencing system difficult.
The echo problem is further described in reference to FIG. 1. A digital audio signal from far end 1101 is converted into the analog domain by the digital to analog converter (DAC) 1301, amplified in the loudspeaker amplifier 1302 and further converted to acoustic signals by the loudspeaker 1303. Both the direct signal 1304 and reflected versions 1306, reflected by walls/ceilings etc. 1305 are undesirably picked up by the microphone 1308. The microphone also picks up the desired near end signal 1307. The microphone signal is amplified in the microphone amplifier 1309 and digitized in the analog to digital converter 1310, outputting the uncancelled microphone signal 1202.
If the uncancelled microphone signal were transmitted to the far end, the participants at the far end site would hear an echo of themselves, and if a similar system were present at the far end, even howling/feedback might occur.
The common way to solve this problem is to add the acoustic echo canceller 1203 to the microphone signal path. This canceller uses the digital loudspeaker signal as a reference signal, estimates all of the loudspeaker to microphone paths 1304/1306, and subtracts these estimates from the uncancelled microphone signal 1202. The result is the cancelled microphone signal 1204, which is transmitted to the far end as signal 1102.
According to prior art, there are two main approaches to acoustic echo cancellers. The first is a full band canceller, and the second is a sub band canceller. Both normally use adaptive FIR (finite impulse response) filters for estimating the echo path, but apply them in the full band domain and the sub band domain, respectively.
An acoustic echo canceller will typically include several additional sub blocks, e.g. a double talk algorithm, a non-linear processing unit, comfort noise generation, etc. For simplicity and clarity, these sub blocks are not discussed here, as they are not directly relevant to the scope of the invention. These blocks may vary and are well documented in prior art. For one skilled in the art, integrating these blocks is straightforward.
FIG. 2 shows a prior art full band acoustic echo canceller. The signal from far end 2101 is passed to the loudspeaker as signal 2102 and is also used as the loudspeaker reference signal 2103.
The loudspeaker reference signal 2103 is filtered through the adaptive FIR filter 2104. This adaptive filter converges to and tracks the impulse response of the room. The FIR filter 2104 has to be adaptive both for the initial convergence and to adjust for any acoustic changes in the room (door opens, people move, etc.). Many different adaptive algorithms can be used for this purpose, from the inexpensive (low processing power) LMS (least mean square) to more sophisticated and more expensive algorithms such as APA (affine projection algorithm) and RLS (recursive least squares). What all these algorithms have in common is that they use the FIR filter update loop 2108 for adapting. The adaptive FIR filter outputs an inverted echo estimate 2105, which is added to the uncancelled microphone signal 2106, yielding the echo cancelled microphone signal 2107.
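The full band structure described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation of any particular canceller: it uses the plain LMS update, a short hypothetical 4-tap echo path, a white-noise reference signal, and no near end speech or noise. The function names, the step size, and the test signal are all assumptions made for the example. The echo estimate is subtracted from the microphone signal, which is equivalent to adding the inverted estimate.

```python
import random

def lms_echo_canceller(reference, microphone, num_taps, step_size):
    """Full band echo cancellation with a plain LMS-adapted FIR filter."""
    weights = [0.0] * num_taps   # current estimate of the echo path
    history = [0.0] * num_taps   # latest reference samples, newest first
    cancelled = []
    for ref_sample, mic_sample in zip(reference, microphone):
        history = [ref_sample] + history[:-1]
        # FIR filtering: one multiply-add per tap for every output sample.
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        # Subtracting the estimate equals adding the inverted estimate.
        error = mic_sample - echo_estimate
        # LMS update loop: again one multiply-add per tap.
        weights = [w + step_size * error * x for w, x in zip(weights, history)]
        cancelled.append(error)
    return cancelled, weights

def simulate_room(reference, echo_path):
    """Hypothetical room: the microphone picks up only the filtered loudspeaker signal."""
    history = [0.0] * len(echo_path)
    picked_up = []
    for sample in reference:
        history = [sample] + history[:-1]
        picked_up.append(sum(h * x for h, x in zip(echo_path, history)))
    return picked_up

random.seed(0)
echo_path = [0.6, 0.3, -0.2, 0.1]  # assumed impulse response, for illustration only
reference = [random.uniform(-1.0, 1.0) for _ in range(4000)]
microphone = simulate_room(reference, echo_path)
cancelled, weights = lms_echo_canceller(reference, microphone, num_taps=4, step_size=0.1)
residual_energy = sum(e * e for e in cancelled[-100:])
```

In this idealized setting the filter weights approach the assumed echo path and the residual echo decays towards zero. With speech or other coloured reference signals, convergence of plain LMS is considerably slower.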
In a full band echo canceller, no algorithmic delay is added to the microphone signal path, and therefore full band cancellers are often used when short delay is a requirement.
However, there are some disadvantages with the prior art full band canceller. One disadvantage is that the adaptive filter is slow to track changes in the acoustic environment, especially for speech and other natural (coloured) signals. Another disadvantage is that the processing power requirements tend to be excessive, as explained in the following.
The model of the acoustic system used in most echo cancellers is a FIR filter. FIR filters are well known in the art of signal processing, and their basics will not be discussed here. The FIR filter approximates the transfer function of the direct sound and most of the reflections in the room. Due to processing power requirements, the FIR filter will not try to cancel echo for an unlimited time after the signal was played on the loudspeaker. Instead, it will accept that the echo after a given time, the so-called tail length, will not be cancelled, but will appear as residual echo.
To estimate the echo in the complete tail length, the required length of the FIR filter will be:
$$L = F_s \cdot \text{taillength}$$
where $F_s$ is the sampling frequency in Hz, and the tail length is given in seconds.
The number of multiplications, and likewise the number of additions, required to calculate one single output sample of the filter equals the filter length, and the output of the filter should be calculated once per sample. Consequently, the total number of multiplications and of additions per second is:
$$F_s \cdot L = F_s \cdot F_s \cdot \text{taillength} = \text{taillength} \cdot F_s^2$$
A typical value for the tail length is 0.25 sec. The number of multiplications and additions per second will then be 16 million for a system using a sampling frequency of 8 kHz, 64 million for 16 kHz and 576 million for 48 kHz.
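The figures above can be checked with a few lines of Python. This is a small worked sketch; the function name is a hypothetical helper introduced only for this illustration.

```python
def fir_ops_per_second(sample_rate_hz, tail_length_s):
    """Multiplications (and, equally, additions) per second for the FIR filter alone."""
    filter_length = sample_rate_hz * tail_length_s   # L = Fs * taillength
    return sample_rate_hz * filter_length            # Fs * L = taillength * Fs^2

# Tail length of 0.25 s at the three sampling frequencies named in the text.
ops_in_millions = {
    fs: fir_ops_per_second(fs, 0.25) / 1e6 for fs in (8000, 16000, 48000)
}
print(ops_in_millions)
```

Because the count grows with the square of the sampling frequency, going from 8 kHz to 48 kHz multiplies the load by 36 rather than by 6.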
Similar calculations can be performed for the filter update algorithm. The simplest algorithm, LMS, has the same number of additions and multiplications as the FIR filter, so for the absolute simplest full band canceller, the number of additions and the number of multiplications per second each equal:
$$2 \cdot \text{taillength} \cdot F_s^2$$
More complex update algorithms improve the tracking ability of the FIR filter, but require even more processing power. There exist algorithms having a complexity proportional to the filter length, but with a proportionality constant much higher than that of the LMS algorithm, and even algorithms with a complexity proportional to the square of the filter length. The last case gives a processing power requirement for a full band echo canceller proportional to
$$F_s \cdot (F_s \cdot \text{taillength})^2$$
which is unrealistic for full band acoustic echo cancellers.
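To give a feel for how the two scaling laws compare, the following sketch evaluates both expressions at one operating point. The function names are hypothetical, and the proportionality constant for the quadratic case is set to one purely for illustration, since the text gives only the proportionality.

```python
def lms_canceller_ops_per_second(sample_rate_hz, tail_length_s):
    """Filtering plus LMS update: 2 * taillength * Fs^2."""
    return 2 * tail_length_s * sample_rate_hz ** 2

def quadratic_update_ops_per_second(sample_rate_hz, tail_length_s, constant=1.0):
    """Update cost proportional to the squared filter length: c * Fs * (Fs * taillength)^2.

    The constant is an assumption; only the scaling is taken from the text.
    """
    return constant * sample_rate_hz * (sample_rate_hz * tail_length_s) ** 2

lms_ops = lms_canceller_ops_per_second(16000, 0.25)
quadratic_ops = quadratic_update_ops_per_second(16000, 0.25)
print(lms_ops, quadratic_ops, quadratic_ops / lms_ops)
```

With the constant set to one, the ratio of the two equals half the filter length, so the gap widens as the sampling frequency or the tail length increases.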
The conventional way of overcoming the two disadvantages of a full band echo canceller discussed above is to introduce sub-band processing. In FIG. 3, one approach to this is shown, which will be discussed in the following.
The signal from the far end 3101 is passed to the loudspeaker as signal 3102. It is also divided into a chosen number of sub-bands using the analyze filter 3301. The uncancelled microphone signal 3106 is divided into sub-bands using another (but identical) analyze filter 3302. The chosen number of sub-bands is hereafter denoted N.
For each sub-band, the loudspeaker analyze filter outputs a sub-band reference signal 3203, which is filtered through a sub-band FIR filter 3204, calculating an inverted sub-band echo estimate 3205. The microphone analyze filter outputs a sub-band uncancelled signal 3206, which is added to the inverted echo estimate, outputting a sub-band echo cancelled microphone signal 3207. The echo cancelled microphone signal is used for adapting the FIR filter, shown as the sub-band FIR filter update loop 3208.
The echo cancelled microphone signals from all sub-bands are also merged into a full band cancelled microphone signal 3107 by the synthesize filter 3303. Using this approach, the signal is divided into bands with smaller bandwidth, which can be represented using a lower sampling frequency, as follows from the discussion below. Note that the analyze filter consists of a filter bank and a decimator, while the synthesize filter consists of a filter bank and an interpolator.
According to Nyquist's sampling theorem, the sampling frequency of the full band signal is given by:
$$F_{s,\text{fullband}} = 2 \cdot F_{\text{fullband}}$$
where $F_{\text{fullband}}$ is the bandwidth of the full band signal. Similarly, the sampling frequency of the sub-band signal can be calculated as:
$$F_{s,\text{subband}} = 2 \cdot F_{\text{subband}}$$
where $F_{\text{subband}}$ is the bandwidth of one sub-band. Moreover, the bandwidth of each sub-band can be expressed as follows:
$$F_{\text{subband}} = F_{\text{fullband}} / N$$
Further, to simplify and reduce the processing power requirements of a filter bank, oversampling is conventionally used. This can be expressed mathematically by introducing a constant, which can also absorb any other constants in the expression.
From the expressions above, it follows that the sub-band signals will have a sampling frequency of:
$$F_{s,\text{subband}} = \frac{K}{N} \cdot F_{s,\text{fullband}}$$
where K is the oversampling factor. K is always greater than one, but most often relatively small, typically less than two.
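As a numeric illustration of the sub-band sampling frequency, the sketch below evaluates the expression for one set of values. The function name and the choice of N = 64 sub-bands and K = 1.5 are assumptions picked only for the example, as is the 48 kHz full band rate.

```python
def subband_sample_rate(fullband_rate_hz, num_subbands, oversampling):
    """Sub-band sampling frequency: (K / N) * Fs_fullband."""
    return oversampling / num_subbands * fullband_rate_hz

# Hypothetical operating point: 48 kHz full band, N = 64 sub-bands, K = 1.5.
fs_subband = subband_sample_rate(48000, num_subbands=64, oversampling=1.5)
# FIR length needed per sub-band to cover a 0.25 s tail at that rate.
taps_per_subband = fs_subband * 0.25
print(fs_subband, taps_per_subband)
```

Under these assumed values each sub-band runs at 1125 Hz, so a 0.25 s tail needs roughly 281 taps per sub-band instead of 12000 taps in the full band case.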
Assuming a FIR filter with an adaptation algorithm whose complexity is proportional to the filter length (for example LMS), the required processing power for the filtering and adaptation in one sub-band can be expressed as:
$$\text{ProsPow}_{\text{subband}} = C_1 \cdot \text{taillength} \cdot F_{s,\text{subband}}^2 = C_1 \cdot \text{taillength} \cdot \left(\frac{K}{N} \cdot F_{s,\text{fullband}}\right)^2$$
where $C_1$ is a proportionality constant.
Consequently, for all N sub-bands the required processing power equals:
$$\text{ProsPow} = N \cdot C_1 \cdot \text{taillength} \cdot \left(\frac{K}{N} \cdot F_{s,\text{fullband}}\right)^2 = \frac{C_1 \cdot \text{taillength} \cdot (K \cdot F_{s,\text{fullband}})^2}{N}$$
Thus, for a high N, the processing power requirements of the filtering can be drastically reduced. Of course, the overhead of the analyze and synthesize filters must be added, but for long tail lengths and reasonably high N, this overhead is small compared to the savings described.
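The size of the saving can be illustrated by comparing the two processing power expressions directly. This is a sketch under assumptions: the function names are hypothetical, the values N = 64 and K = 1.5 are chosen only for the example, and the overhead of the analyze and synthesize filters is deliberately left out.

```python
def fullband_processing_power(c1, tail_length_s, fullband_rate_hz):
    """C1 * taillength * Fs_fullband^2 (filtering and adaptation, full band)."""
    return c1 * tail_length_s * fullband_rate_hz ** 2

def subband_processing_power(c1, tail_length_s, fullband_rate_hz,
                             num_subbands, oversampling):
    """N * C1 * taillength * (K / N * Fs_fullband)^2, filter bank overhead excluded."""
    subband_rate = oversampling / num_subbands * fullband_rate_hz
    return num_subbands * c1 * tail_length_s * subband_rate ** 2

# Hypothetical operating point: 48 kHz, 0.25 s tail, N = 64, K = 1.5, C1 = 2 (LMS).
full = fullband_processing_power(2, 0.25, 48000)
sub = subband_processing_power(2, 0.25, 48000, num_subbands=64, oversampling=1.5)
ratio = sub / full   # equals K^2 / N, independent of C1, tail length and Fs
print(full, sub, ratio)
```

The ratio reduces to K²/N, so with these assumed values the sub-band filtering and adaptation need about 3.5 % of the full band operations.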
For more sophisticated update algorithms with complexity proportional to the square of the filter length, the complexity reduction compared to the full band case is even higher, due to the significantly lower filter length.
In addition, experience has shown that sub-band cancellers have an improved ability to adapt to changes in the acoustic environment, especially for speech and other natural (coloured) signals.
However, one major disadvantage is introduced in the sub-band scheme. The analyze and synthesize filters add algorithmic delay to the microphone signal. In some applications, this is undesirable or even unacceptable.
In summary, the strengths and weaknesses of the two presented approaches are complementary. While the full band echo canceller benefits from zero algorithmic delay, it suffers from slow adaptation and high processing complexity. The sub-band echo canceller, in contrast, benefits from faster adaptation and lower processing complexity, but suffers from an algorithmic delay.