Digital signal processing, and adaptive filtering in particular, is currently used in a number of technical areas. These areas include network and acoustic echo cancellation, channel equalization, and noise reduction in cellular and hands-free telephones, teleconference systems and IP telephony. Issues arise while trying to reduce a given amount of an undesirable echo by modeling the echo on the basis of single-side speech and a known echo. Linear adaptive filtering techniques are normally used to solve the problem.
Different approaches exist to adjust the linear adaptive filtering using adaptation techniques. The choice of a concrete adaptation technique and of the corresponding input and runtime parameters affects the performance of the whole system. A commonly used adaptive filtering technique is the Normalized Least Mean Square (NLMS) technique. However, the convergence rate of the NLMS technique is slow. Other techniques currently in use either are numerically intensive (e.g., the Affine Projection Algorithm (APA)) or have numerical problems (e.g., the Recursive Least Squares (RLS) technique). Multi-segmental filter coefficient approaches are used in practical implementations to reduce the overall system load and to increase convergence speed. However, such approaches require additional computations to handle the boundaries of multiple window segments.
A typical situation in real environments, such as network echo canceling systems, involves a variable Millions of Instructions Per Second (MIPS) processor budget for each signal channel. The budget depends mostly on the current overall system load (i.e., the number of active channels and the peak numerical performance of each channel). Therefore, problems arise in controlling the balance between (i) convergence speed and quality and (ii) the MIPS utilized in each voice channel.
Network echo cancellation problems are characterized by long echo path impulse responses, only small portions of which relate to basic sound reflections (i.e., filter windows) that contain filter energy. Nevertheless, adaptive echo cancellers are designed to synthesize the full echo path because the actual locations of the filter windows (including the starting delay) of the echo path are unknown.
The NLMS technique gives echo vector estimates on a sample-by-sample basis so that a normalized mean square error (i.e., $e(k)$) is minimized in a least mean square sense according to the following set of formulae:
$$y(k) = w^T(k)\,X(k);$$
$$e(k) = d(k) - y(k); \text{ and}$$
$$w(k+1) = w(k) + \frac{\mu\, e(k)\, X(k)}{X^T(k)\, X(k)}.$$
The variable $d(k)$ is an echo sample received by the canceller, $w(k) = [w_0(k), w_1(k), \ldots, w_{N-1}(k)]^T$ is an $N$ sample long echo path estimate vector, $\mu$ is the step-size parameter and $X(k)$ is a vector representing the last $N$ input samples: $X(k) = [x(k-N+1), \ldots, x(k)]^T$. The vector $w$ is initialized as $w(1) = [0, \ldots, 0]^T$.
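The NLMS recursion above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production echo canceller: the function names `nlms_step` and `run_nlms`, the step size of 0.5, the noise-free white test input and the small regularizer `eps` in the denominator are all choices made for the example and do not come from the text.

```python
import random


def nlms_step(w, x_buf, d, mu=0.5, eps=1e-12):
    """Perform one NLMS iteration and return (w(k+1), e(k)).

    w     -- current echo path estimate w(k), a list of N floats
    x_buf -- X(k) = [x(k-N+1), ..., x(k)], a list of N floats
    d     -- echo sample d(k)
    mu    -- step-size parameter (0.5 is an assumed value)
    eps   -- assumed regularizer that avoids division by zero on silence
    """
    y = sum(wi * xi for wi, xi in zip(w, x_buf))      # y(k) = w^T(k) X(k)
    e = d - y                                         # e(k) = d(k) - y(k)
    energy = sum(xi * xi for xi in x_buf)             # X^T(k) X(k)
    scale = mu * e / (energy + eps)
    w_next = [wi + scale * xi for wi, xi in zip(w, x_buf)]
    return w_next, e


def run_nlms(echo_path, num_samples=2000, mu=0.5, seed=0):
    """Identify a synthetic, noise-free echo path from white input (hypothetical driver)."""
    rng = random.Random(seed)
    n = len(echo_path)
    w = [0.0] * n                                     # w(1) = [0, ..., 0]
    x_buf = [0.0] * n
    e = 0.0
    for _ in range(num_samples):
        # Shift the newest input sample into X(k), dropping the oldest one.
        x_buf = x_buf[1:] + [rng.uniform(-1.0, 1.0)]
        # Synthetic echo: the true path applied to the same input window.
        d = sum(h * x for h, x in zip(echo_path, x_buf))
        w, e = nlms_step(w, x_buf, d, mu)
    return w, e
```

Driving the sketch with a short synthetic path, for instance `run_nlms([0.0, 0.5, -0.3, 0.0])`, should bring the estimate close to that path. Note that the filtering sum and the update loop each cost N multiplications, which is where the bulk of the per-step cost comes from.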
The computational complexity of one step of the NLMS technique is 2N+2 multiplications: N for the filter update phase, N for the filtering stage, and two more because $X^T(k+1)X(k+1)$ can be obtained from $X^T(k)X(k)$ as follows:
$$X^T(k+1)\,X(k+1) = X^T(k)\,X(k) + x(k+1)^2 - x(k-N+1)^2.$$
The computational complexity of the NLMS technique is less than that of the APA, Fast Affine Projection (FAP) and RLS techniques. However, the NLMS technique shows a slower convergence rate when tested with the test vectors of ITU-T Recommendation G.168.
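The running update of the input energy can be illustrated as follows. The sketch keeps a single accumulator, adds the square of the sample entering the N-sample window and subtracts the square of the sample leaving it, so each step costs two multiplications. The function name `sliding_energy`, the 0-based indexing and the assumption that the signal is zero before its first sample are illustrative choices, not part of the text.

```python
def sliding_energy(x, n):
    """Return the window energy X^T(k) X(k) for every sample index k.

    x -- list of input samples (assumed to be zero before index 0)
    n -- window length N
    """
    energy = 0.0
    energies = []
    for k, sample in enumerate(x):
        energy += sample * sample          # newest sample enters the window
        if k >= n:
            oldest = x[k - n]
            energy -= oldest * oldest      # oldest sample leaves the window
        energies.append(energy)
    return energies
```

In floating-point arithmetic a recursive accumulator of this kind can drift slowly over very long runs, so a practical implementation may choose to recompute the sum directly from time to time; that refinement is outside the scope of the sketch.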
Another adaptive filtering technique is the Proportionate Normalized Least Mean Square (PNLMS) technique. The PNLMS technique is defined by a set of formulae as follows:
$$y(k) = w^T(k)\,X(k);$$
$$e(k) = d(k) - y(k); \text{ and}$$
$$w(k+1) = w(k) + \frac{\mu\, G(k)\, e(k)\, X(k)}{X^T(k)\, G(k)\, X(k)},$$
where $G(k)$ is a diagonal matrix with a diagonal vector $g(k)$. The diagonal vector $g(k)$ is almost proportional to the component magnitudes of the vector $w(k)$. More precisely, the diagonal vector is computed using the following formulae:
$$L_\infty(k) = \max\{\delta_p, |w_0(k)|, |w_1(k)|, \ldots, |w_{N-1}(k)|\};$$
$$\gamma_i(k) = \max\{\rho L_\infty(k), |w_i(k)|\}, \quad 0 \le i \le N-1; \text{ and}$$
$$g_i(k) = \frac{\gamma_i(k)}{\sum_{j=0}^{N-1} \gamma_j(k)},$$
where $\delta_p = 0.01$ and $\rho = 5/N$. The complexity of the PNLMS technique is 6N per step. The convergence rate of the PNLMS technique is twice as fast as the convergence rate of the NLMS technique. With $G = I$, where $I$ is the identity matrix, the PNLMS technique reduces to the NLMS technique.
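The gain computation and the proportionate update described above might be sketched as follows. The constants 0.01 and 5/N are the values given in the text; the function names `pnlms_gains` and `pnlms_step`, the default step size of 0.5 and the regularizer `eps` are assumptions made for illustration only.

```python
def pnlms_gains(w, delta_p=0.01, rho=None):
    """Return the diagonal g of G(k) computed from the current estimate w(k)."""
    n = len(w)
    if rho is None:
        rho = 5.0 / n                                  # rho = 5/N as in the text
    l_inf = max(delta_p, max(abs(wi) for wi in w))     # L_inf(k)
    gamma = [max(rho * l_inf, abs(wi)) for wi in w]    # gamma_i(k)
    total = sum(gamma)
    return [gi / total for gi in gamma]                # g_i(k), sums to 1


def pnlms_step(w, x_buf, d, mu=0.5, eps=1e-12):
    """Perform one PNLMS iteration and return (w(k+1), e(k)).

    mu and eps are assumed values; eps only guards against division by zero.
    """
    g = pnlms_gains(w)
    y = sum(wi * xi for wi, xi in zip(w, x_buf))           # y(k) = w^T(k) X(k)
    e = d - y                                              # e(k) = d(k) - y(k)
    denom = sum(gi * xi * xi for gi, xi in zip(g, x_buf))  # X^T(k) G(k) X(k)
    scale = mu * e / (denom + eps)
    w_next = [wi + scale * gi * xi for wi, gi, xi in zip(w, g, x_buf)]
    return w_next, e
```

When all coefficients are still zero the gains are uniform, the common factor cancels between numerator and denominator, and the step coincides with a plain NLMS step. Once a few coefficients dominate, they receive proportionally larger gains and therefore larger updates.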
Although the length of the vector $w(k)$ is large (e.g., about 1000 components), in practice the vector $w(k)$ consists of several windows with non-zero energy. All of the remaining components of the vector $w(k)$ can be considered to have zero energy without significant loss of echo cancellation quality. Therefore, if the $R$ non-zero samples of the vector $w(k)$ are known, the remaining samples need not be updated. In such a case, the computational complexities of the NLMS technique and the PNLMS technique become 2R and 6R, respectively. This complexity reduction leads to partial-update techniques that use fewer computational resources than the pure NLMS and PNLMS techniques while retaining acceptable convergence rates. Most known partial-update techniques assume that the vector $w(k+1)$ always consists of $M$ blocks of the same length $L$, where $L = N/M$. Only some number $B$ of the blocks have non-zero energy, where $B \le M$. For example, a Selective Partial Update (SPU) technique is defined by the following set of formulae:
$$y(k) = w_{I_B}^T(k)\,X(k);$$
$$e(k) = d(k) - y(k); \text{ and}$$
$$w_{I_B}(k+1) = w_{I_B}(k) + \frac{\mu\, G_{I_B}(k)\, x_{I_B}(k)\, e(k)}{x_{I_B}^T(k)\, G_{I_B}(k)\, x_{I_B}(k)},$$
where $I_B = \{i : x_i^T(k)\,G_i(k)\,x_i(k)$ is one of the $B$ largest among $x_1^T(k)\,G_1(k)\,x_1(k), \ldots, x_M^T(k)\,G_M(k)\,x_M(k)\}$, and $x_i(k)$ and $G_i(k)$ denote the $i$-th block of the input vector and the corresponding diagonal block of $G(k)$.
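The block selection and partial update of the SPU technique can be sketched as below. Several points are assumptions made for the example: the function names `select_blocks` and `spu_step`, the 0-based block indices, the step size, the regularizer `eps`, and in particular the reading of the first formula, where the sketch computes the output y(k) with the full coefficient vector. The gain vector `g` is passed in by the caller; a vector of ones corresponds to G=I.

```python
def select_blocks(x_buf, g, block_len, num_selected):
    """Return the sorted indices of the B blocks with the largest x_i^T G_i x_i."""
    num_blocks = len(x_buf) // block_len                  # M = N / L
    weights = []
    for i in range(num_blocks):
        lo, hi = i * block_len, (i + 1) * block_len
        weights.append(sum(g[j] * x_buf[j] * x_buf[j] for j in range(lo, hi)))
    # Rank blocks by weight, largest first, and keep the first B of them.
    ranked = sorted(range(num_blocks), key=lambda i: weights[i], reverse=True)
    return sorted(ranked[:num_selected])


def spu_step(w, x_buf, d, g, block_len, num_selected, mu=0.5, eps=1e-12):
    """Perform one selective partial update; returns (w(k+1), e(k), chosen blocks)."""
    chosen = select_blocks(x_buf, g, block_len, num_selected)
    idx = [j for i in chosen for j in range(i * block_len, (i + 1) * block_len)]
    # Assumption: the output is formed with the full coefficient vector.
    y = sum(wi * xi for wi, xi in zip(w, x_buf))
    e = d - y
    denom = sum(g[j] * x_buf[j] * x_buf[j] for j in idx)  # x_IB^T G_IB x_IB
    scale = mu * e / (denom + eps)
    w_next = list(w)
    for j in idx:                                         # only B*L coefficients change
        w_next[j] += scale * g[j] * x_buf[j]
    return w_next, e, chosen
```

The sort over M block weights is the source of the M log2(M) term in the complexity figures, while the update loop touches only B×L coefficients.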
The computational complexity of the SPU+PNLMS technique can be shown to be $BL + 5N + M\log_2(M)$. The last term corresponds to the complexity of selecting the $B$ blocks having the largest weights (i.e., defining the set $I_B$). Note that an update of the matrix $G$ still occurs at every step of the technique. As such, the above generalization of the PNLMS technique is not efficient from a computational point of view. The SPU+PNLMS technique requires slightly fewer operations per step than the PNLMS technique, but converges more slowly.
Assuming $G = I$, the SPU+PNLMS technique reduces to the SPU+NLMS technique, which is a generalization of the NLMS technique based on a selective partial update approach. Without taking into account the complexity of searching for an optimal set of blocks $I_B$, the SPU+NLMS technique has a complexity of about $BL + N$. The overall computational complexity of the SPU+NLMS technique is $B(L+2) + N + M\log_2(M)$. The reduction in complexity occurs because block sorting is now based on weights (i.e., $V_i$) that can be efficiently updated. Denoting the weight of the $i$-th block of the input signal as
$$V_i = x(k-N+(i-1)L+1)^2 + x(k-N+(i-1)L+2)^2 + \ldots + x(k-N+iL)^2,$$
each weight can be updated with two multiplications. The resulting complexity is less than 2N for typical values of the parameters $B$ and $M$. For example, if N=1024, L=32, B=8 and M=32, the complexity of the SPU+NLMS technique is 272+1024+160=1456 processor operations, while the complexity of the NLMS technique is 2048 processor operations.
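The operation counts quoted above can be checked with a few helper functions. The formulas are taken directly from the text; the function names are hypothetical, and the NLMS count uses the 2N figure from the comparison above, leaving out the two extra multiplications for the energy update.

```python
import math


def nlms_cost(n):
    """Per-step cost of NLMS, using the 2N figure from the comparison."""
    return 2 * n


def pnlms_cost(n):
    """Per-step cost of PNLMS: 6N."""
    return 6 * n


def spu_pnlms_cost(n, block_len, num_selected):
    """Per-step cost of SPU+PNLMS: B*L + 5N + M*log2(M), with M = N / L."""
    m = n // block_len
    return num_selected * block_len + 5 * n + m * math.log2(m)


def spu_nlms_cost(n, block_len, num_selected):
    """Per-step cost of SPU+NLMS: B*(L+2) + N + M*log2(M), with M = N / L."""
    m = n // block_len
    return num_selected * (block_len + 2) + n + m * math.log2(m)
```

For N=1024, L=32 and B=8, `spu_nlms_cost(1024, 32, 8)` evaluates to 1456 against 2048 for `nlms_cost(1024)`, matching the figures above. With the same parameters the SPU+PNLMS formula gives 5536 against 6144 for PNLMS, which shows how the 5N term for updating G limits the saving in that case.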