The human hearing process is intimately related to the function of the cochlea, a spiral tube in the inner ear resembling a snail shell. The cochlea has nerve endings which translate incoming pressure signals into neural signals, and it is therefore essential to the hearing process.
As can be expected, electronic models simulating the cochlea's functions are widely studied by the speech and hearing research community in order to develop speech synthesis, analysis, and recognition systems. One of the long-term attractions of signal processing based on cochlea models is the promise of increased recognition performance, especially in variable and high-noise environments.
In the past, models were based on the mechanical motion of the cochlea's basilar membrane to various degrees of fidelity. These types of algorithms have demonstrated a restricted degree of recognition. However, these prior-art approaches often failed to effectively handle sounds other than pure, simple speech sounds. Consequently, increasingly complex models are being designed in an effort to distinguish between complex single sounds and separate unfusible sounds with similar short-term spectra. One goal is to design a model precise enough that these different types of sounds can be decoded accurately and reliably.
For example, some models employ cascaded filter sections to mimic the behavior of the cochlea and to preserve those aspects of sound most relevant to sound separation and speech parameterization. In this respect, digital models, and hence digital filters, are attractive because of their suitability for implementation in many digital computer architectures.
A digital filter is characterized in the z domain by the z-plane complex number locations of its poles (i.e., roots of the denominator of its z-transform transfer function), its zeros (i.e., roots of the numerator of its z-transform transfer function), and by a gain parameter. The "order" of the filter is the highest exponent of z in either the numerator or the denominator of its transfer function. It is sufficient to analyze quadratic filters having an order of two, with complex conjugate pairs of poles and zeros, referred to as second-order sections, and filters having an order of one, with real poles and zeros, referred to as first-order sections, because these sections can be cascaded with other similar sections to make arbitrary filters of higher order.
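By way of illustration only, the relationship between pole and zero locations, gain, and the cascading of sections can be sketched in a few lines of Python. The function names and the example pole values (radius 0.9, angles of pi/4 and pi/3) are assumptions chosen for this sketch and are not taken from any particular cochlear model. Transfer functions are written in powers of z^-1.

```python
import cmath
import math

def poly_from_roots(roots):
    """Expand the product of (1 - r*z^-1) over all roots r.

    Returns the coefficients of z^0, z^-1, z^-2, ... as complex numbers.
    """
    coeffs = [1 + 0j]
    for r in roots:
        # Multiply the running polynomial by (1 - r*z^-1).
        coeffs = [a - r * b for a, b in zip(coeffs + [0j], [0j] + coeffs)]
    return coeffs

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [0j] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def conjugate_pair(radius, angle):
    """Return a complex conjugate pair of z-plane locations."""
    root = cmath.rect(radius, angle)
    return [root, root.conjugate()]

def section(zeros, poles, gain):
    """Return (numerator, denominator) coefficients for one section."""
    numerator = [gain * c for c in poly_from_roots(zeros)]
    denominator = poly_from_roots(poles)
    return numerator, denominator

def cascade(sections):
    """Cascade sections by multiplying their transfer functions."""
    num, den = [1 + 0j], [1 + 0j]
    for n, d in sections:
        num = poly_mul(num, n)
        den = poly_mul(den, d)
    return num, den

# Hypothetical example: two all-pole second-order sections in cascade.
sos_a = section(zeros=[], poles=conjugate_pair(0.9, math.pi / 4), gain=1.0)
sos_b = section(zeros=[], poles=conjugate_pair(0.9, math.pi / 3), gain=1.0)
num, den = cascade([sos_a, sos_b])
order = len(den) - 1  # highest power of z^-1 in the denominator
```

Each second-order denominator in this sketch has the form 1 - 2r cos(theta) z^-1 + r^2 z^-2, and cascading the two sections yields a fourth-order filter whose coefficients remain real because the poles occur in conjugate pairs.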
A second-order section with complex poles and zeros can be characterized by the frequency and damping of its poles and zeros. For most purposes, second-order filters can be considered to have only poles or zeros, rather than both. Consequently, the frequency and damping of second-order filters can be characterized by a complex pair of poles or a complex pair of zeros. A second-order filter that combines poles and zeros is said to be bi-quadratic and has separate frequency and damping parameters for its poles and for its zeros. A filter with poles is said to be recursive, since it uses feedback of previously computed values to compute subsequent values. A filter with only zeros is said to be nonrecursive, since it has no feedback loop. The art of digital filters is well described in recent textbooks, including Discrete-Time Signal Processing by A. V. Oppenheim and R. W. Schafer, Prentice Hall, Englewood Cliffs, N.J., 1989. One skilled in the art will realize that digital filters can also be applied to discrete-time analog implementations, such as by using charge-coupled devices.
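The distinction between recursive and nonrecursive sections can be illustrated with the following minimal Python sketch. The function names and the coefficient values are hypothetical and serve only to show that feedback produces an impulse response that continues after the input has ended, whereas a section with only zeros does not.

```python
def all_zero_section(x, b0, b1, b2):
    """Nonrecursive section: y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2]."""
    y = []
    x1 = x2 = 0.0  # previous two inputs
    for xn in x:
        y.append(b0 * xn + b1 * x1 + b2 * x2)
        x2, x1 = x1, xn
    return y

def all_pole_section(x, a1, a2):
    """Recursive section: y[n] = x[n] - a1*y[n-1] - a2*y[n-2]."""
    y = []
    y1 = y2 = 0.0  # previous two outputs, fed back into the computation
    for xn in x:
        yn = xn - a1 * y1 - a2 * y2
        y.append(yn)
        y2, y1 = y1, yn
    return y

# Unit impulse applied to each section (illustrative coefficients).
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
fir = all_zero_section(impulse, 0.5, 0.3, 0.2)
iir = all_pole_section(impulse, -1.0, 0.5)
```

In this sketch the nonrecursive response is zero from the fourth sample onward, while the recursive response is still nonzero at the fifth sample because each output depends on earlier outputs.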
Ideally, to model nonlinear adaptation in the cochlea, digital filters should have adaptable damping characteristics. Damping describes the temporal rate of decay of a filter's impulse response, and is related to the quality factor Q by d = 1/(2Q). It is highly preferable for the damping factor of a digital filter to be capable of being controlled and varied. This can be accomplished by selecting a particular filter form and calculating the filter coefficients based on the desired damping and frequency parameters. However, these calculations tend to be quite complicated and usually involve nonlinear operations such as trigonometric functions and square roots.
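To make the nature of such calculations concrete, the following Python sketch computes the denominator coefficients of a second-order pole pair from frequency and damping. It assumes one common mapping, in which the continuous-time poles s = w0(-d +/- j sqrt(1 - d^2)) are mapped to the z plane by z = exp(sT); this mapping, the function names, and the numeric values are assumptions made for illustration and are not asserted to be the calculation used in any particular prior filter.

```python
import math

def damping_from_q(q):
    """Damping d related to the quality factor Q by d = 1/(2Q)."""
    return 1.0 / (2.0 * q)

def pole_pair_coefficients(freq_hz, damping, sample_rate_hz):
    """Return (a1, a2) for the denominator 1 + a1*z^-1 + a2*z^-2.

    Assumes an underdamped pole pair (0 <= damping < 1) mapped from the
    s plane by z = exp(s*T), where T is the sample period.
    """
    if not 0.0 <= damping < 1.0:
        raise ValueError("sketch handles only 0 <= damping < 1")
    w0 = 2.0 * math.pi * freq_hz / sample_rate_hz   # radians per sample
    radius = math.exp(-damping * w0)                # exponential
    theta = w0 * math.sqrt(1.0 - damping * damping) # square root
    a1 = -2.0 * radius * math.cos(theta)            # trigonometric function
    a2 = radius * radius
    return a1, a2

# Hypothetical example: 1 kHz pole pair, Q = 5, 20 kHz sample rate.
d = damping_from_q(5.0)
a1, a2 = pole_pair_coefficients(1000.0, d, 20000.0)
```

Even in this simple form, one change of damping requires an exponential, a square root, and a cosine, which illustrates why recomputing coefficients is costly relative to the few multiplications needed to run the filter itself.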
When these filters are implemented in systems operating in real time at relatively fast sample rates (e.g., 20 kilohertz), complications arise. Each time the damping is changed, the filter is typically required to repeat the complex calculations, which imposes a high cost in terms of hardware units and/or computation cycles. Hence, typical prior art digital filters lack the speed to vary the parameters for each sample in real time. Instead, the complicated calculations for changing the filter parameters are performed only periodically. As a result, the filter parameters change in incremental jumps, rather than changing smoothly over time. This compromise results in a non-ideal real-time implementation.
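The effect of periodic updating can be illustrated with a minimal Python sketch. The function name, the block size, and the ramp of desired damping values are hypothetical; the sketch shows only how a parameter that is refreshed once per block departs from the smoothly varying value that is desired.

```python
def held_parameter(targets, block_size):
    """Return the parameter value in effect at each sample when the
    parameter is refreshed only at the start of each block."""
    held = []
    current = None
    for n, target in enumerate(targets):
        if n % block_size == 0:
            current = target  # costly coefficient update happens here only
        held.append(current)
    return held

# Desired damping ramps smoothly; the applied damping moves in steps.
desired = [0.10, 0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17]
applied = held_parameter(desired, block_size=4)
```

With a block size of four, the applied damping stays at 0.10 for four samples and then jumps to 0.14, whereas a block size of one would reproduce the desired trajectory at the cost of one full coefficient calculation per sample.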
In addition, many prior art digital filters with simple structures exhibit high coefficient sensitivity. In other words, frequency and damping parameters must be very precisely converted to filter coefficients in order to assure that the resulting filter frequency and damping match the desired parameters to a high degree of accuracy. However, to mimic the cochlea, one would like to have a digital filter directly controlled by frequency and damping parameters, wherein its parameters can be varied independently in real time without affecting other parameters. Moreover, it would be highly advantageous for such a filter to also exhibit low sensitivity to small errors in the parameters.
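Coefficient sensitivity can be illustrated with the following Python sketch, which assumes a simple second-order structure whose denominator is 1 + a1*z^-1 + a2*z^-2 with a1 = -2r cos(theta) and a2 = r^2. The structure, the function names, and the numeric values are assumptions chosen for illustration. The sketch applies the same small error to a1 at a low and at a high pole angle and compares the resulting shift in pole angle, which sets the filter frequency.

```python
import math

def pole_angle(a1, a2):
    """Recover the pole angle from the denominator coefficients."""
    radius = math.sqrt(a2)
    return math.acos(-a1 / (2.0 * radius))

def angle_shift(theta, radius, coeff_error):
    """Shift in pole angle caused by adding coeff_error to a1."""
    a1 = -2.0 * radius * math.cos(theta)
    a2 = radius * radius
    return pole_angle(a1 + coeff_error, a2) - theta

# Same coefficient error, two different pole angles (radians per sample).
low = angle_shift(theta=0.05, radius=0.99, coeff_error=1e-4)
high = angle_shift(theta=1.0, radius=0.99, coeff_error=1e-4)
```

Under these assumptions the shift at the low pole angle is many times larger than at the high pole angle, because the angle depends on a1 through an inverse cosine whose slope grows without bound as the pole approaches the real axis.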
A technique recently published in "ASIC Implementation of the Lyon Cochlea Model" by C. D. Summerfield and R. F. Lyon in Proc. Int'l Conf. on Acoustics, Speech, and Signal Processing, IEEE, 1992, is a first step toward the goals set forth above, but it suffers from extreme overdamping and even instability when frequency and damping parameters are both set significantly above zero.
Therefore, what is needed is a digital filter which can be implemented in real time in a relatively economical fashion. It would be highly preferable if the filter's center frequency and damping parameters could both be independently varied over a wide range.