1. Field of the Invention
The present invention relates generally to an apparatus and method for canceling an echo signal in a mobile communication system. In particular, the present invention relates to an apparatus and method for canceling an echo signal in a mobile terminal of a mobile communication system.
2. Description of the Related Art
In general, a mobile communication system transmits voice signals within a predetermined frequency band in order to provide mobility for a mobile terminal. Such a mobile terminal transmits voice signals in many different environments. For example, the mobile terminal can serve as a telephone in a home or an office, and can enable a voice call in an automobile or a subway. When the mobile terminal (also known as a mobile communication terminal or a voice communication terminal) performs a call in a teleconferencing mode or a normal call mode, an acoustic echo signal and noise arriving over different paths are input into its microphone. An increase in the strength of the acoustic echo signal causes a howling phenomenon, which decreases the call quality. Therefore, the mobile terminal includes an echo canceller in order to cancel the acoustic echo signal. The acoustic echo signal is generated when the voice signal of the other party to the conversation is output through the speaker of the mobile terminal and fed back into its microphone. In the following description, the other party will be referred to as a “remote talker” and the person talking on the mobile terminal will be referred to as a “local talker.”
Background noise is another major cause of reduced speech quality in a voice call. The background noise refers to day-to-day environmental noise, such as traffic, wind noise, television and other people talking. Unless cancelled, the background noise degrades call quality just as the echo signal does. Much research has been and continues to be conducted in order to cancel the background noise as well as the echo signal.
Now, with reference to FIG. 1, a description will be made of a structure and operation of an echo canceling apparatus used in a mobile terminal.
FIG. 1 is a block diagram illustrating a structure of an echo canceling apparatus used in a conventional mobile terminal. Referring to FIG. 1, a remote talker's signal (hereinafter referred to as a “remote-talker signal”) x(k) denotes a signal that has been decoded and converted into an electric voice signal. The remote-talker signal x(k) is converted into an audible signal through a speaker 110. The remote-talker signal x(k) is also input to an echo canceller 130. The echo canceller 130 generates an estimated echo signal ŷ(k) using an error signal e(k), which is described in more detail below. A local talker's signal (hereinafter referred to as a “local-talker signal”) s(k) and a background noise n(k) are input to a microphone 120. In addition, the echo signal y(k) described above is input to the microphone 120. The echo signal y(k) is derived from the remote-talker signal x(k) output through the speaker 110, and therefore reaches the microphone 120 after a time delay. Because the echo signal y(k) is output through the speaker 110, it is a distorted version of the remote-talker signal x(k).
When the local-talker signal s(k), the background noise n(k) and the echo signal y(k) are input to the microphone 120, the microphone 120 converts the input signals into an electric signal d(k). An adder 140 is used to cancel the echo signal y(k) from the electric signal d(k). The adder 140 subtracts the estimated echo signal ŷ(k) output from the echo canceller 130 from the electric signal d(k). That is, the adder 140 outputs the difference between the electric signal d(k) and the estimated echo signal ŷ(k). The difference signal is called an error signal e(k). The error signal e(k) can be defined as
$$
\begin{aligned}
e(k) &= s(k) + n(k) + y(k) - \hat{y}(k) \\
     &= s(k) + n(k) + r(k), \quad \text{where } r(k) = y(k) - \hat{y}(k)
\end{aligned}
\qquad \text{Equation (1)}
$$
It can be understood from Equation (1) that the smaller r(k) is, the better the performance of the echo canceller 130. In the following description, r(k) will be referred to as a “residual echo signal.” The error signal e(k) of Equation (1), which includes the residual echo signal r(k), is input to a noise canceller 150, and the noise canceller 150 outputs a noise-cancelled local-talker signal ŝ(k). The noise canceller 150 cancels the background noise n(k) and the residual echo signal r(k) from the error signal e(k).
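The relationship in Equation (1) can be sketched numerically. The following is a minimal illustration only: the source does not state which adaptation algorithm the echo canceller 130 uses, so a normalized LMS (NLMS) update is assumed here, and the function name `nlms_echo_canceller`, the echo path coefficients, the step size and the signal levels are all hypothetical. For simplicity the filter keeps adapting while the local talker is active.

```python
import math
import random

def nlms_echo_canceller(x, d, num_taps=8, mu=0.2, eps=1e-8):
    """Adapt an FIR estimate of the echo path (assumed NLMS); return (y_hat, e)."""
    w = [0.0] * num_taps       # adaptive filter coefficients
    x_buf = [0.0] * num_taps   # most recent remote-talker samples, newest first
    y_hat, e = [], []
    for x_k, d_k in zip(x, d):
        x_buf = [x_k] + x_buf[:-1]
        y_hat_k = sum(w_i * x_i for w_i, x_i in zip(w, x_buf))
        e_k = d_k - y_hat_k    # adder 140: e(k) = d(k) - y_hat(k)
        norm = eps + sum(x_i * x_i for x_i in x_buf)
        w = [w_i + mu * e_k * x_i / norm for w_i, x_i in zip(w, x_buf)]
        y_hat.append(y_hat_k)
        e.append(e_k)
    return y_hat, e

def power(sig):
    """Mean squared value of a signal segment."""
    return sum(v * v for v in sig) / len(sig)

random.seed(0)
num_samples = 4000
echo_path = [0.0, 0.0, 0.5, 0.3, -0.2, 0.1]  # hypothetical delayed, distorting path

x = [random.gauss(0.0, 1.0) for _ in range(num_samples)]          # remote-talker signal
y = [sum(h * x[k - i] for i, h in enumerate(echo_path) if k - i >= 0)
     for k in range(num_samples)]                                  # echo signal
s = [0.05 * math.sin(2 * math.pi * 0.01 * k) for k in range(num_samples)]  # local talker
n = [random.gauss(0.0, 0.01) for _ in range(num_samples)]          # background noise
d = [s_k + n_k + y_k for s_k, n_k, y_k in zip(s, n, y)]            # microphone signal

y_hat, e = nlms_echo_canceller(x, d)
r = [y_k - y_hat_k for y_k, y_hat_k in zip(y, y_hat)]              # residual echo
```

After adaptation, `power(r[-500:])` is far below `power(y[-500:])`, and each `e[k]` equals `s[k] + n[k] + r[k]` up to rounding, which is the decomposition stated in Equation (1).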
A description will now be made of a general noise cancellation technique performed in the noise canceller 150. The general noise cancellation technique cancels the background noise n(k) under the following assumptions.
Assumption 1: The background noise n(k) is a wide-sense stationary (WSS) signal.
Assumption 2: The background noise n(k) varies with the passage of time.
Under the foregoing assumptions, the background noise n(k) can be cancelled as follows. When a voice signal is mixed with a time-varying white noise or colored noise, it is possible to obtain a clean voice signal by estimating and canceling only the noise. Therefore, when the residual echo signal r(k) is a white or colored noise having a WSS characteristic like the background noise n(k), it can be cancelled using background noise estimation. However, when the source of the echo signal y(k) is a voice signal, the residual echo signal r(k) generated as an output of the echo canceller 130 has the characteristics of a voice signal. Therefore, a voice activity detector (VAD) included in the noise canceller 150 mistakes the residual echo signal r(k) for a voice signal.
A recently proposed method for canceling the residual echo signal r(k) more accurately whitens the residual echo signal r(k) into a pseudo-white noise through inverse filtering, using coefficients obtained through Auto-Regressive (AR) analysis of the residual echo, thereby maximizing the effect of the noise canceller arranged in the following stage.
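The failure mode described above can be illustrated with a toy detector. This is only a sketch under stated assumptions: the source does not describe how the VAD in the noise canceller 150 works, so a simple frame-energy detector with a tracked noise floor is assumed, and the names `energy_vad`, `threshold_ratio` and `smoothing`, the frame length, and the 200 Hz tone standing in for a voice-like residual echo are all hypothetical.

```python
import math
import random

def frame_energy(frame):
    """Mean squared value of one frame."""
    return sum(v * v for v in frame) / len(frame)

def energy_vad(frames, threshold_ratio=4.0, smoothing=0.9):
    """Flag frames whose energy exceeds threshold_ratio times the noise floor."""
    noise_floor = frame_energy(frames[0])  # assumption: the first frame is noise only
    flags = []
    for frame in frames:
        energy = frame_energy(frame)
        is_voice = energy > threshold_ratio * noise_floor
        if not is_voice:
            # The noise estimate is updated only on frames judged to be noise.
            noise_floor = smoothing * noise_floor + (1.0 - smoothing) * energy
        flags.append(is_voice)
    return flags

random.seed(1)
frame_len = 160        # 20 ms at an assumed 8 kHz sampling rate
sample_rate = 8000

def noise_frame():
    """Stationary background noise n(k)."""
    return [random.gauss(0.0, 0.01) for _ in range(frame_len)]

def residual_echo_frame():
    """Voice-like residual echo r(k): a 200 Hz tone plus background noise."""
    return [0.05 * math.sin(2 * math.pi * 200 * k / sample_rate)
            + random.gauss(0.0, 0.01) for k in range(frame_len)]

frames = ([noise_frame() for _ in range(10)]
          + [residual_echo_frame() for _ in range(5)])
flags = energy_vad(frames)
```

In this sketch the ten noise-only frames are classified as noise, while the five residual-echo frames are flagged as voice. Because the noise estimate is frozen on frames flagged as voice, the residual echo never enters the estimate and therefore is not removed by noise estimation alone.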
A description will now be made of the recently proposed method for canceling noise through AR analysis.
The method for whitening the residual echo signal r(k) into a pseudo-white noise through AR analysis before noise cancellation determines whether a local-talker signal s(k) is present, and arranges an AR coefficient-based inverse filter in the stage following a double talk detector (DTD). An error signal e(k) free of the local-talker signal s(k) is sent to the AR analysis and inverse filtering part, where its AR coefficients are estimated. Thereafter, the method performs inverse filtering using the estimated AR coefficients, generating a whitened signal we(k)=wn(k)+wr(k), and sends the whitened signal we(k) to the noise canceller. In this manner, the method cancels the residual echo signal together with the background noise.
However, when the remote-talker signal x(k) is a voice signal, the residual echo signal r(k) also has the characteristics of a voice signal. Therefore, a solution for inverse filtering can be approximated with a Pth-order AR model in accordance with Equation (2).
$$
w_r(k) = -\sum_{i=0}^{P} \hat{a}(i)\, r(k-i), \quad \text{where } \hat{a}(0) = 1 \qquad \text{Equation (2)}
$$
In Equation (2), â(i) denotes an estimated AR coefficient, wr(k) denotes a whitened signal of r(k), and wn(k) denotes a whitened signal of n(k).
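The inverse filtering of Equation (2) can be sketched as follows. The source does not say how the AR coefficients are estimated, so the autocorrelation method with the Levinson-Durbin recursion is assumed here; the function names, the second-order test process and its coefficients `a_true` are hypothetical, and the input is assumed to be already free of the local-talker signal (the DTD decision is not modeled). The sign follows Equation (2) as written, so the output is the negative of the usual prediction error, which does not affect its whiteness.

```python
import random

def autocorrelation(sig, max_lag):
    """Biased autocorrelation estimates for lags 0..max_lag."""
    n = len(sig)
    return [sum(sig[k] * sig[k - lag] for k in range(lag, n)) / n
            for lag in range(max_lag + 1)]

def levinson_durbin(acf, order):
    """Return a_hat[0..order] with a_hat[0] = 1 for A(z) = sum_i a_hat[i] z^-i."""
    a = [1.0] + [0.0] * order
    err = acf[0]
    for m in range(1, order + 1):
        acc = acf[m] + sum(a[i] * acf[m - i] for i in range(1, m))
        k_m = -acc / err                   # reflection coefficient
        new_a = a[:]
        for i in range(1, m):
            new_a[i] = a[i] + k_m * a[m - i]
        new_a[m] = k_m
        a = new_a
        err *= 1.0 - k_m * k_m             # updated prediction error power
    return a

def inverse_filter(sig, a_hat):
    """Equation (2): w(k) = -sum_{i=0}^{P} a_hat(i) * sig(k - i)."""
    return [-sum(a_hat[i] * sig[k - i] for i in range(len(a_hat)) if k - i >= 0)
            for k in range(len(sig))]

def lag_correlation(sig, lag):
    """Autocorrelation at the given lag, normalized by the lag-0 value."""
    acf = autocorrelation(sig, lag)
    return acf[lag] / acf[0]

random.seed(2)
a_true = [1.0, -1.5, 0.7]   # hypothetical AR(2) model of a colored residual echo
residual = []
for k in range(4000):
    u_k = random.gauss(0.0, 1.0)           # white excitation
    past = sum(a_true[i] * residual[k - i] for i in range(1, 3) if k - i >= 0)
    residual.append(u_k - past)

a_hat = levinson_durbin(autocorrelation(residual, 2), order=2)
w_r = inverse_filter(residual, a_hat)      # whitened residual echo
```

Here `residual` is strongly correlated from sample to sample, whereas `lag_correlation(w_r, 1)` is close to zero, which is the sense in which the inverse filter whitens a residual echo that fits a low-order AR model.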
The AR analysis-based inverse filtering method, regarding the residual echo signal as a voice signal, whitens it into a white noise signal and mixes the white noise with the background noise received through the microphone so that the noise canceller cancels the white noise. However, even when the residual echo signal is inverse-filtered using AR analysis, a periodic component such as a voiced sound remains uncancelled. Because such a periodic component, which is a pitch component, recurs at regular intervals, it cannot be completely whitened, thereby causing deterioration in the quality of the call.
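The remaining pitch component can be illustrated with an idealized sketch. As assumptions not taken from the source, a voiced residual echo is modeled as a pulse train with a pitch period of 40 samples plus weak white noise, shaped by a hypothetical second-order AR filter `a_true`, and the inverse filter is given the exact coefficients, which is the most favorable case for whitening.

```python
import random

def ar_synthesize(excitation, a):
    """Filter excitation through 1/A(z), where A(z) = sum_i a[i] z^-i and a[0] = 1."""
    out = []
    for k, u_k in enumerate(excitation):
        past = sum(a[i] * out[k - i] for i in range(1, len(a)) if k - i >= 0)
        out.append(u_k - past)
    return out

def inverse_filter(sig, a_hat):
    """Equation (2): w(k) = -sum_{i=0}^{P} a_hat(i) * sig(k - i)."""
    return [-sum(a_hat[i] * sig[k - i] for i in range(len(a_hat)) if k - i >= 0)
            for k in range(len(sig))]

def lag_correlation(sig, lag):
    """Autocorrelation at the given lag, normalized by the signal energy."""
    energy = sum(v * v for v in sig)
    return sum(sig[k] * sig[k - lag] for k in range(lag, len(sig))) / energy

random.seed(3)
pitch_period = 40           # hypothetical pitch period in samples
a_true = [1.0, -1.5, 0.7]   # hypothetical short-term (vocal tract) AR model

# Voiced excitation: one pulse per pitch period plus weak white noise.
excitation = [(1.0 if k % pitch_period == 0 else 0.0) + random.gauss(0.0, 0.05)
              for k in range(4000)]
voiced_residual = ar_synthesize(excitation, a_true)

# Inverse filtering with the exact coefficients (best case for whitening).
whitened = inverse_filter(voiced_residual, a_true)

short_lag = lag_correlation(whitened, 1)             # near zero
pitch_lag = lag_correlation(whitened, pitch_period)  # remains large
```

In this sketch `short_lag` is close to zero, so the short-term correlation has been removed, but `pitch_lag` stays large because the inverse filter merely recovers the pulse train. A low-order AR inverse filter therefore leaves the pitch periodicity in place, and its output is not white at lags equal to the pitch period.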