Cochlear implants and other inner ear prostheses are one option to help profoundly deaf or severely hearing impaired persons. Unlike conventional hearing aids, which just apply an amplified and modified sound signal, a cochlear implant is based on direct electrical stimulation of the acoustic nerve. Typically, a cochlear implant stimulates neural structures in the inner ear electrically in such a way that hearing impressions most similar to normal hearing are obtained.
FIG. 1 shows a section view of an ear with a typical cochlear implant system. A normal ear transmits sounds through the outer ear 101 to the eardrum 102, which moves the bones of the middle ear 103, which in turn excites the cochlea 104. The cochlea 104 includes an upper channel known as the scala vestibuli 105 and a lower channel known as the scala tympani 106, which are connected by the cochlear duct 107. In response to received sounds transmitted by the middle ear 103, the fluid-filled scala vestibuli 105 and scala tympani 106 function as a transducer that converts the resulting waves into electric pulses, which are transmitted to the cochlear nerve 113 and ultimately to the brain. Frequency processing seems to change in nature from the basal region of the cochlea, where the highest frequency components of a sound are processed, to the apical regions of the cochlea, where the lowest frequencies are analyzed.
Some persons have partial or full loss of normal sensorineural hearing. Cochlear implant systems have been developed to overcome this by directly stimulating the user's cochlea 104. A typical cochlear prosthesis essentially includes two parts: the speech processor and the implanted stimulator 108. The speech processor (not shown in FIG. 1) typically includes a microphone, a power supply (batteries) for the overall system, and a processor that performs signal processing of the acoustic signal to extract the stimulation parameters. In state-of-the-art prostheses, the speech processor is a behind-the-ear (BTE) device. The implanted stimulator generates the stimulation patterns and conducts them to the nerve tissue by means of an electrode array 110, which is usually positioned in the scala tympani in the inner ear. The connection between speech processor and stimulator is usually established by means of a radio frequency (RF) link. Note that both stimulation energy and stimulation information are conveyed via the RF link. Typically, digital data transfer protocols employing bit rates of some hundreds of kBit/s are used.
One example of a standard stimulation strategy for cochlear implants is called the “Continuous-Interleaved-Sampling (CIS)” strategy, as described by Wilson B S, Finley C C, Lawson D T, Wolford R D, Eddington D K, Rabinowitz W M, “Better speech recognition with cochlear implants,” Nature, vol. 352, 236-238, July 1991, which is incorporated herein by reference in its entirety. Signal processing for CIS in the speech processor typically involves the following steps:
1. Splitting up of the audio frequency range into spectral bands by means of a filter bank,
2. Envelope detection of each filter output signal,
3. Instantaneous nonlinear compression of the envelope signal (map law), and
4. Adaptation to thresholds (THR) and most comfortable loudness (MCL) levels.
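The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the processing of any particular speech processor: the second-order band pass filters, the rectify-and-smooth envelope detector, the logarithmic map law with constant `c`, and all function names, band centers, and THR/MCL values are assumptions chosen for the example.

```python
import math

def bandpass(signal, fs, f_center, q=4.0):
    """Step 1: one filter bank channel (second-order band pass, unity peak gain)."""
    w0 = 2.0 * math.pi * f_center / fs
    alpha = math.sin(w0) / (2.0 * q)
    b0, b2 = alpha, -alpha
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    out, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for x in signal:
        y = (b0 * x + b2 * x2 - a1 * y1 - a2 * y2) / a0
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out

def envelope(band, fs, cutoff_hz=200.0):
    """Step 2: full-wave rectification followed by a one-pole low pass (assumed detector)."""
    k = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    env, state = [], 0.0
    for x in band:
        state += k * (abs(x) - state)
        env.append(state)
    return env

def compress(x, c=500.0):
    """Step 3: instantaneous logarithmic map law, input and output in [0, 1] (assumed form)."""
    x = min(max(x, 0.0), 1.0)
    return math.log(1.0 + c * x) / math.log(1.0 + c)

def to_current(y, thr, mcl):
    """Step 4: place the compressed value between the THR and MCL of one electrode."""
    return thr + (mcl - thr) * y

def cis_channels(signal, fs, centers, thr, mcl):
    """Run steps 1 to 4 for every channel; returns one list of levels per channel."""
    levels = []
    for fc, t, m in zip(centers, thr, mcl):
        env = envelope(bandpass(signal, fs, fc), fs)
        levels.append([to_current(compress(e), t, m) for e in env])
    return levels

# Hypothetical usage: a 1 kHz tone analyzed by four channels.
fs = 16000
tone = [0.5 * math.sin(2.0 * math.pi * 1000.0 * n / fs) for n in range(800)]
levels = cis_channels(tone, fs, [500, 1000, 2000, 4000], [100.0] * 4, [800.0] * 4)
print([round(ch[-1]) for ch in levels])  # the 1 kHz channel carries the largest level
```

Because the compression is instantaneous and applied per channel, each channel's output depends only on its own envelope sample; the THR and MCL values would be fitted individually for each electrode and user.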
Each of the stimulation electrodes in the scala tympani is typically associated with a band pass filter of the external filter bank. According to the “tonotopic principle” of the cochlea, high frequency bands are associated with electrodes positioned more closely to the base, and low frequency bands to electrodes positioned more deeply in the direction of the apex, as described by Greenwood DD, “A cochlear frequency-position function for several species—29 years later,” J. Acoust. Soc. Am., 2593-2604, 1990, which is incorporated herein by reference in its entirety. For stimulation, charge balanced current pulses, usually biphasic symmetrical pulses, are applied. The amplitudes of the stimulation pulses are obtained directly by sampling the compressed envelope signals (step (3) above). As the characteristic CIS paradigm, these signals are sampled sequentially, and the stimulation pulses are applied in a strictly non-overlapping sequence. Thus, as a typical CIS feature, only one stimulation channel is active at one time. The overall stimulation rate is comparatively high. For example, assuming an overall stimulation rate of 18 kpps and using a 12-channel filter bank, the stimulation rate per channel is 1.5 kpps. Such a stimulation rate per channel is usually sufficient for adequate temporal representation of the envelope signal.
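The relation between overall stimulation rate, channel count, and rate per channel can be illustrated with a short Python sketch. The function names and the round-robin channel order are assumptions made for the example; the text only requires that pulses be sequential and non-overlapping.

```python
def rate_per_channel(overall_rate_pps, n_channels):
    """Each channel gets every n-th pulse slot, so the overall rate is shared equally."""
    return overall_rate_pps / n_channels

def cis_schedule(overall_rate_pps, n_channels, n_pulses):
    """Return (start_time_s, channel) pairs; one pulse per slot, so pulses never overlap."""
    slot = 1.0 / overall_rate_pps
    return [(i * slot, i % n_channels) for i in range(n_pulses)]

def sample_pulses(envelopes, fs, overall_rate_pps, n_pulses):
    """Pick each pulse amplitude from its channel's compressed envelope at the pulse time."""
    pulses = []
    for t, ch in cis_schedule(overall_rate_pps, len(envelopes), n_pulses):
        idx = min(int(t * fs), len(envelopes[ch]) - 1)
        pulses.append((t, ch, envelopes[ch][idx]))
    return pulses

print(rate_per_channel(18000, 12))  # 1500.0 pulses per second per channel
for t, ch in cis_schedule(18000, 12, 3):
    print(f"t = {t * 1e6:6.1f} us, channel {ch}")
```

With 18 kpps overall, each pulse slot lasts about 55.6 μs, and a given channel is revisited every 12 slots, that is, every 1/1500 s.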
The influence of various CIS parameters on speech perception, such as the number of channels and the stimulation rate per channel, has been investigated (see, for example: Loizou P C, Poroy O, Dorman M, “The effect of parametric variations of cochlear implant processors on speech understanding,” J. Acoust. Soc. Am. 2000 August; 108(2):790-802; and Wilson B, Wolford R, Lawson D, Speech processors for Auditory prostheses—Seventh quarterly progress report. NIH Project N01-DC-8-2105, each of which is incorporated herein by reference in its entirety), and new concepts aiming at a further improvement have been proposed. For example, one approach is based on the principle of stochastic resonance (see, for example: McNamara B and Wiesenfeld K, “Theory of stochastic resonance,” Phys. Rev. A, 39:4854-4869; Rubinstein J T, Wilson B S, Finley C C, Abbas P J, “Pseudospontaneous activity: stochastic independence of auditory nerve fibers with electrical stimulation,” Hear. Res. 127, 108-118, 1999; and Morse R P and Evans E F, “Additive noise can enhance temporal coding in a computational model of analogue cochlear implant stimulation,” Hear. Res. 133, 107-119, 1999, each of which is incorporated herein by reference in its entirety). The basic idea is to mimic spontaneous activity in the neurons to provide a more natural representation of the envelope signals in the spiking patterns. However, so far this and other approaches have not found their way into broad clinical applications, mainly because no substantial improvement in CI performance as compared to CIS has been found.
At present, the incorporation of so-called “fine structure information” seems to be the most promising way to further improve CIS. Following Hilbert (i.e., Hilbert D, “Grundzüge einer allgemeinen Theorie linearer Integralgleichungen,” Teubner, Leipzig, 1912, incorporated herein by reference in its entirety), any signal can be represented as the product of a slowly varying envelope and a rapidly varying signal containing temporal fine structure. The current CIS strategy uses only the envelope information; the fine structure information is discarded. In the response of a CIS band pass filter, temporal fine structure information is represented by the position of the zero crossings of the signal and tracks the exact spectral position of the center of gravity of the signal within its band pass region, including temporal transitions of such centers of gravity. For example, the temporal transitions of formant frequencies in vowel spectra are highly important cues for the perception of preceding plosives or other unvoiced utterances. Furthermore, a close look at the details of a band pass filter output reveals that the pitch frequency is clearly present in the temporal structure of the zero crossings. The relative importance of envelope and fine structure information is investigated in an experiment described in Smith Z M, Delgutte B, Oxenham A J, “Chimaeric sounds reveal dichotomies in auditory perception,” Nature, vol. 416, 87-90, March 2002, which is incorporated herein by reference in its entirety. There is some consensus that for an intermediate number of 4 to 16 processing channels, the envelope is most important for speech reception, whereas the temporal fine structure is most important for pitch perception (melody recognition) and sound localization.
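The decomposition into envelope and fine structure can be illustrated numerically. The following Python sketch computes a discrete analytic signal with a plain DFT, which is one common way to obtain such a decomposition; the function names, the even block length, and the amplitude-modulated test tone are assumptions made for the example and are not taken from any cited work.

```python
import cmath
import math

def dft(x, inverse=False):
    """Plain O(n^2) discrete Fourier transform; adequate for short example blocks."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for m in range(n):
            acc += x[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
        out.append(acc / n if inverse else acc)
    return out

def analytic_signal(x):
    """Suppress negative frequencies and double positive ones (len(x) assumed even)."""
    n = len(x)
    spectrum = dft(x)
    for k in range(n):
        if 0 < k < n // 2:
            spectrum[k] *= 2.0
        elif k > n // 2:
            spectrum[k] = 0j
    return dft(spectrum, inverse=True)

def envelope_and_fine_structure(x):
    """Split x into a slowly varying envelope and a unit-amplitude fine structure."""
    z = analytic_signal(x)
    env = [abs(v) for v in z]
    fine = [math.cos(cmath.phase(v)) for v in z]
    return env, fine

# Hypothetical band pass output: a carrier (32 cycles per block) with slow modulation.
n = 256
x = [(1.0 + 0.5 * math.cos(2.0 * math.pi * 2 * i / n))
     * math.cos(2.0 * math.pi * 32 * i / n) for i in range(n)]
env, fine = envelope_and_fine_structure(x)
worst = max(abs(e * f - s) for e, f, s in zip(env, fine, x))
print(f"largest reconstruction error: {worst:.2e}")
```

In this sketch the product of `env` and `fine` reproduces the input, and the zero crossings of `fine` coincide with those of the band pass signal. A CIS processor as described above would keep only `env`.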
In the light of these results, standard CIS is a good choice with respect to speech intelligibility (e.g., for American English). However, regarding music perception and perception of so-called tone languages (e.g., Mandarin Chinese, Cantonese, Vietnamese, Thai, etc.), CIS might be suboptimal, and new stimulation strategies containing both envelope and temporal fine structure information might have the potential for a substantial improvement of CI performance. This assumption is supported, for example, by the study in F G Zeng, K B Nie, S Liu, G S Stickney, E Del Rio, Y Y Kong, H B Chen, “Speech recognition with amplitude and frequency modulations,” Proc. Natl. Acad. Sci. 102: 2293-2298, 2005, which is incorporated herein by reference in its entirety, where it is demonstrated that slowly varying frequency modulations can be perceived by cochlear implant subjects, and thus an appropriate incorporation in future stimulation strategies is recommended.
Considering such new stimulation strategies, it is clear that an increase of information will require higher pulse repetition rates per channel. Adhering to the basic CIS paradigm of strictly non-overlapping pulses, an increase in pulse rates can only be achieved if the pulse durations get shorter. However, the pulse duration cannot be reduced arbitrarily, because shorter pulses require higher pulse amplitudes for sufficient loudness, and pulse amplitudes are limited for various practical reasons, such as a maximum implant supply voltage. Besides, there exists a fundamental neural time constant due to properties of nodes of Ranvier in myelinated nerve fibers, which is about τ=20-30 μs in the auditory nerve (see, for example, Frijns J and ten Kate J, “A model of myelinated nerve fibers for electrical prosthesis design,” Med. Biol. Eng. Comput., vol. 32, pp. 391-398, 1994, which is incorporated herein by reference in its entirety). Although the response of the transmembrane potentials to a stimulation pulse is faster than the response of a simple first order system (“spectral acceleration,” see, for example, Zierhofer C M, “Analysis of a linear model for electrical stimulation of axons—critical remarks on the ‘activating function concept’,” IEEE Trans. BME, Vol. 48, No. 2, February 2001, which is incorporated herein by reference in its entirety), phase durations significantly shorter than τ should be avoided in order to avoid current shortcuts due to the membrane capacitances.
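The trade-off between pulse rate and phase duration can be made concrete with a small calculation. The Python sketch below assumes symmetric biphasic pulses with no gap between phases or between consecutive pulses, which gives the most favorable case; the function names are illustrative.

```python
def max_phase_duration_us(overall_rate_pps, phases_per_pulse=2):
    """Longest phase that still fits into one non-overlapping pulse slot (microseconds)."""
    slot_us = 1e6 / overall_rate_pps
    return slot_us / phases_per_pulse

def max_overall_rate_pps(min_phase_us, phases_per_pulse=2):
    """Highest overall rate for which each phase lasts at least min_phase_us."""
    return 1e6 / (min_phase_us * phases_per_pulse)

print(round(max_phase_duration_us(18000), 1))  # 27.8 us per phase at 18 kpps
for tau_us in (20.0, 30.0):
    overall = max_overall_rate_pps(tau_us)
    print(f"phase >= {tau_us:.0f} us: overall <= {overall:.0f} pps, "
          f"per channel (12 channels) <= {overall / 12:.0f} pps")
```

Under these assumptions, the 18 kpps example already uses phases of about 27.8 μs, which lies within the stated range of τ. Requiring phases no shorter than 20 μs would cap the overall rate at 25 kpps, or roughly 2.1 kpps per channel for 12 channels, so non-overlapping pulses leave little room for raising the rate per channel.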