1. Field of the Invention
The present invention relates to an echo canceler and an echo canceling method and program that are applicable to, for example, a hands-free telephone terminal.
2. Description of the Related Art
With the recent proliferation of voice over Internet protocol (VoIP) telephony, telephone rates have come down and people can make telephone calls from their homes and offices with reduced concern about telephone bills. As a result, in many cases the calls last longer. This has led to a sudden rise in the demand for hands-free telephone sets as a means of avoiding the discomfort caused by holding a telephone receiver pressed against the ear for an extended period of time.
The simplest types of hands-free telephone sets employ earphones or headsets, but earphones rub against the ear canal and cause painful inflammation, while headsets cause irritation and fatigue if worn for a long time. Headsets are often used at call centers to enable an operator to operate a personal computer while dealing with calls from customers, but headsets are not suitable for general use in the home, where the user simply wants to be able to hold extended telephone conversations without physical stress or discomfort. The most popular type of hands-free telephone set is therefore the speakerphone, which employs a loudspeaker instead of an earphone or headset.
An essential part of a speakerphone is an acoustic echo canceler that removes the echo of the acoustic output of the loudspeaker from the signal input through the microphone. An essential part of an acoustic echo canceler is its adaptive filter, which has tap coefficients that mimic the effect of the acoustic echo path. A key part of the adaptive filter is the algorithm used to update the tap coefficients for optimum echo cancellation.
Many acoustic echo cancelers employ the normalized least mean squares (NLMS) algorithm, described by Haykin in Introduction to Adaptive Filters (Macmillan, June 1984; Japanese translation published by Gendaikogakusha, September 1987). The NLMS algorithm has the advantage of excellent stability, which offsets its disadvantage of relatively slow convergence for so-called ‘colored signals’ with a non-flat frequency spectrum. Voice signals are typically colored in this sense.
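For illustration only, the textbook NLMS tap update can be sketched as follows. This is a minimal sketch, not the circuitry of any cited reference: the function name nlms_echo_cancel, the parameter names, the four-tap echo path, and the white-noise far-end signal are all assumptions chosen to keep the example small.

```python
import random

def nlms_echo_cancel(far_end, mic, num_taps, step_gain=0.5, eps=1e-8):
    """Run one NLMS pass; return (residual signal, final tap coefficients)."""
    taps = [0.0] * num_taps        # adaptive estimate of the echo path
    history = [0.0] * num_taps     # recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * u for w, u in zip(taps, history))
        e = d - echo_estimate      # echo-canceled output sample
        # Normalize the update by the input power (eps avoids division by zero).
        power = sum(u * u for u in history) + eps
        scale = step_gain * e / power
        taps = [w + scale * u for w, u in zip(taps, history)]
        residual.append(e)
    return residual, taps

# Hypothetical 4-tap echo path driven by white noise (illustrative values only).
random.seed(0)
echo_path = [0.5, -0.3, 0.2, 0.1]
far_end = [random.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
       for n in range(len(far_end))]
residual, taps = nlms_echo_cancel(far_end, mic, num_taps=4)
```

With a white (spectrally flat) input such as this one, the tap coefficients approach the assumed echo path quickly; the slow convergence noted above arises when the input is colored, as voice signals are.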
Although the NLMS algorithm remains an excellent choice for some purposes, the spread of VoIP has led to the introduction of wideband telephony, which provides better speech quality than conventional telephony. In wideband telephony, the sampling rate of the speech signal is typically doubled from eight thousand samples per second (8 kHz) to sixteen thousand samples per second (16 kHz). As a result, the number of filter taps is also doubled, causing the NLMS algorithm to converge even more slowly. Haykin also describes a computationally more advanced recursive least squares (RLS) algorithm that converges faster than the NLMS algorithm, but besides requiring extensive computation and vast amounts of memory, the RLS algorithm lacks the stability of the NLMS algorithm. Since the RLS algorithm is expensive to implement and fails to provide stable voice quality, it is unsuitable for general use in telephone sets in the home.
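The doubling of the tap count follows from simple arithmetic: the filter must span the same echo duration at either sampling rate. The short sketch below assumes a 64-millisecond echo tail purely for illustration; the function name taps_for_echo_tail and that duration are not taken from any cited reference.

```python
def taps_for_echo_tail(tail_ms, sample_rate_hz):
    """Number of filter taps needed to span an echo tail of tail_ms milliseconds."""
    return tail_ms * sample_rate_hz // 1000

# Assumed 64 ms echo tail at the conventional and wideband sampling rates.
narrowband_taps = taps_for_echo_tail(64, 8000)
wideband_taps = taps_for_echo_tail(64, 16000)
```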
In Japanese Patent Application Publication No. 08-237174, Igai discloses a method of overcoming these problems by continuous optimization of the step gain in the NLMS algorithm. A large initial step gain is employed, so that the algorithm starts by converging quickly. As convergence progresses, the step gain is reduced so that the algorithm can model the echo accurately under steady-state conditions.
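The general idea of starting with a large step gain and reducing it as convergence progresses can be sketched as below. This is only an assumed, fixed geometric decay used for illustration; it is not the optimization rule disclosed by Igai, and the function name step_gain_schedule and its default values are hypothetical.

```python
def step_gain_schedule(n, mu_start=1.0, mu_end=0.1, decay=0.999):
    """Step gain at sample index n, decaying geometrically from mu_start toward mu_end."""
    return mu_end + (mu_start - mu_end) * decay ** n

# Step gain at the start, after 1000 samples, and after 10000 samples.
gains = [step_gain_schedule(n) for n in (0, 1000, 10000)]
```

A schedule of this kind makes the drawback described next easy to see: once the gain has decayed, it stays small even if the character of the input signal changes.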
Continuous optimization of the step gain, however, fails to solve the problem of poor convergence for colored signals, and introduces new problems. For example, if voice input is preceded by a call control tone, and if the algorithm converges while the call control tone is being received, then the reduced step gain delays adaptation to the echo characteristics of the voice signal.
In Japanese Patent Application Nos. 2007-288404 and 2008-063086, filed by the present applicant, an attempt is made to solve these problems by providing an echo canceler that uses the stable NLMS algorithm to update the tap coefficients, but also stores received signal data and echo data as vector data. While the far-end party in the telephone conversation is silent, a period in which operation of the NLMS algorithm is conventionally suspended, the echo canceler uses averaged vector data to carry out a simulated convergence process.
Although this strategy can boost convergence speed by allowing convergence to continue with simulated data, the inherent periodicity of the far-end speech signal can cause the averaged far-end vector data to cancel out. Because of this type of self-cancellation, adequate echo canceling performance is not always obtained.
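The self-cancellation effect can be illustrated with a deliberately simple case. The sketch below assumes a pure sinusoid with a period of eight samples and two stored vectors captured half a period apart; these values, and the helper names frame, average, and norm, are hypothetical and do not reproduce the averaging scheme of the cited applications.

```python
import math

PERIOD = 8       # samples per cycle of the assumed periodic signal
FRAME_LEN = 16   # length of each stored vector

def frame(start):
    """Vector of FRAME_LEN samples of a unit sinusoid beginning at index start."""
    return [math.sin(2.0 * math.pi * (start + k) / PERIOD) for k in range(FRAME_LEN)]

def average(vectors):
    """Element-wise mean of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))

# Frames captured half a period apart are sign-inverted copies of each other.
frames = [frame(0), frame(PERIOD // 2)]
averaged = average(frames)
single_norm = norm(frames[0])     # energy present in one stored vector
averaged_norm = norm(averaged)    # energy remaining after averaging
```

In this contrived case the averaged vector is essentially zero even though each stored vector carries substantial energy, so it provides almost no excitation for a simulated convergence process.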