As shown in FIG. 1, sounds are transmitted by a human ear from the outer ear 101 to the tympanic membrane (eardrum) 102, which moves the bones of the middle ear 103 (malleus, incus, and stapes) that vibrate the oval window and round window openings of the cochlea 104. The cochlea 104 is a long fluid-filled duct wound spirally about its axis for approximately two and a half turns. It includes an upper channel known as the scala vestibuli and a lower channel known as the scala tympani, which are connected by the cochlear duct. The cochlea 104 forms an upright spiraling cone with a center called the modiolus where the spiral ganglion cells of the cochlear nerve 113 reside. In response to received sounds transmitted by the middle ear 103, the cochlea 104 functions as a transducer to generate electric pulses which are transmitted to the cochlear nerve 113, and ultimately to the brain which perceives the neural signals as sound.
Hearing is impaired when there are problems in the ability to transduce external sounds into meaningful action potentials along the neural substrate of the cochlea 104. To improve impaired hearing, auditory prostheses have been developed. In some cases, hearing impairment can be addressed by a cochlear implant (CI) or by a brainstem, midbrain, or cortical implant that electrically stimulates auditory nerve tissue with small currents delivered by multiple electrode contacts distributed along an implant electrode. For cochlear implants, the electrode array is inserted into the cochlea 104. For brainstem, midbrain, and cortical implants, the electrode array is located in the auditory brainstem, midbrain, or cortex, respectively.
FIG. 1 shows some components of a typical cochlear implant system where an external microphone provides an audio signal input to an external signal processor 111 which implements one of various known signal processing schemes. For example, signal processing approaches that are well-known in the field of cochlear implants include continuous interleaved sampling (CIS) digital signal processing, channel specific sampling sequences (CSSS) digital signal processing, spectral peak (SPEAK) digital signal processing, fine structure processing (FSP) and compressed analog (CA) signal processing.
The processed signal is converted by the external signal processor 111 into a digital data format, such as a sequence of data frames, for transmission by an external coil 107 into a receiving stimulator processor 108. Besides extracting the audio information, the receiver processor in the stimulator processor 108 may perform additional signal processing such as error correction, pulse formation, etc., and produces a stimulation pattern (based on the extracted audio information) that is sent through electrode lead 109 to an implanted electrode array 110. Typically, the electrode array 110 includes multiple stimulation contacts 112 on its surface that provide selective electrical stimulation of the cochlea 104.
An audio signal, such as speech or music, can be processed into multiple frequency band pass signals, each having a signal envelope and fine time structure within the envelope. One common speech coding strategy is the so-called “continuous-interleaved-sampling strategy” (CIS), as described by Wilson B. S., Finley C. C., Lawson D. T., Wolford R. D., Eddington D. K., Rabinowitz W. M., “Better speech recognition with cochlear implants,” Nature, vol. 352, 236-238 (July 1991), which is hereby incorporated herein by reference. The CIS speech coding strategy samples the signal envelopes at predetermined time intervals, providing a remarkable level of speech understanding merely by coding the signal envelope of the speech signal. This can be explained, in part, by the fact that auditory neurons phase lock to amplitude modulated electrical pulse trains (see, for example, Middlebrooks, J. C., “Auditory Cortex Phase Locking to Amplitude-Modulated Cochlear Implant Pulse Trains,” J Neurophysiol, 100(1), p. 76-91, July 2008, which is hereby incorporated herein by reference).
However, for normal hearing subjects, both signal cues, the envelope and the fine time structure, are important for localization and speech understanding in noise and reverberant conditions (Zeng, Fan-Gang, et al. “Auditory perception with slowly-varying amplitude and frequency modulations.” Auditory Signal Processing. Springer New York, 2005. 282-290; Drennan, Ward R., et al. “Effects of temporal fine structure on the lateralization of speech and on speech understanding in noise.” Journal of the Association for Research in Otolaryngology 8.3 (2007): 373-383; and Hopkins, Kathryn, and Brian Moore. “The contribution of temporal fine structure information to the intelligibility of speech in noise.” The Journal of the Acoustical Society of America 123.5 (2008): 3710-3710; all of which are hereby incorporated herein by reference in their entireties).
Older speech coding strategies mainly encode the slowly varying signal envelope information and do not transmit the fine time structure of a signal. More recent coding strategies, for example, Fine Structure Processing (FSP), also transmit the fine time structure information. In FSP, the fine time structure of low frequency channels is transmitted through Channel Specific Sampling Sequences (CSSS) that start at negative to positive zero crossings of the respective band pass filter output (see U.S. Pat. No. 6,594,525, which is incorporated herein by reference). The basic idea of FSP is to apply a stimulation pattern in which a particular relationship to the center frequencies of the filter channels is preserved, i.e., the center frequencies are represented in the temporal waveforms of the stimulation patterns and are not fully removed, as is done in CIS. Each stimulation channel is associated with a particular CSSS, which is a sequence of ultra-high-rate biphasic pulses (typically 5-10 kpps). Each CSSS has a distinct length (number of pulses) and distinct amplitude distribution. The length of a CSSS may be derived from the center frequency of the associated band pass filter; for example, it may be one half of the period of the center frequency. A CSSS associated with a lower filter channel is therefore longer than a CSSS associated with a higher filter channel. The amplitude distribution may be adjusted to patient specific requirements.
For illustration, FIGS. 2A-2B show two examples of CSSS for a 6-channel system. In FIG. 2A, the CSSS's are derived by sampling one half of a period of a sinusoid whose frequency is equal to the center frequency of the band pass filter (center frequencies at 440 Hz, 696 Hz, 1103 Hz, 1745 Hz, 2762 Hz, and 4372 Hz). Sampling is achieved by means of biphasic pulses at a rate of 10 kpps and a phase duration of 25 μs. For Channels 5 and 6, one half of a period of the center frequencies is too short to give space for more than one stimulation pulse, i.e., each of these “sequences” consists of only one pulse. Other amplitude distributions may be utilized. For example, in FIG. 2B, the sequences are derived by sampling one quarter of a sinusoid whose frequency is half the center frequency of the respective band pass filter. These CSSS's have about the same durations as the corresponding CSSS's in FIG. 2A, but the amplitude distribution is monotonically increasing. Such monotonic distributions might be advantageous, because each pulse of the sequence can theoretically stimulate neurons at sites which cannot be reached by its predecessors.
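The derivation of the two example sequence shapes can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation of any actual device: the function names, the rounding rule (as many whole pulse slots as fit into the half period, but at least one), and the choice to sample at the middle of each pulse slot are assumptions made for illustration and are not taken from the figures.

```python
import math

# Center frequencies of the 6-channel example in the text (Hz).
CENTER_FREQS_HZ = [440, 696, 1103, 1745, 2762, 4372]
PULSE_RATE_PPS = 10_000  # 10 kpps, as in the text


def csss_length(center_hz, pulse_rate_pps=PULSE_RATE_PPS):
    """Number of pulses fitting into one half period of center_hz.

    Assumption: whole pulse slots only, and never fewer than one pulse.
    """
    return max(1, math.floor(pulse_rate_pps / (2.0 * center_hz)))


def csss_half_sine(center_hz, pulse_rate_pps=PULSE_RATE_PPS):
    """Amplitudes from sampling half a period of a sinusoid at center_hz."""
    n = csss_length(center_hz, pulse_rate_pps)
    # Assumption: sample k is taken at the middle of pulse slot k.
    return [
        math.sin(2.0 * math.pi * center_hz * (k + 0.5) / pulse_rate_pps)
        for k in range(n)
    ]


def csss_quarter_sine(center_hz, pulse_rate_pps=PULSE_RATE_PPS):
    """Amplitudes from sampling a quarter period of a sinusoid at center_hz / 2.

    The duration equals that of the half-sine variant, but the amplitudes
    rise monotonically.
    """
    n = csss_length(center_hz, pulse_rate_pps)
    return [
        math.sin(2.0 * math.pi * (center_hz / 2.0) * (k + 0.5) / pulse_rate_pps)
        for k in range(n)
    ]


if __name__ == "__main__":
    for f in CENTER_FREQS_HZ:
        print(f, csss_length(f), [round(a, 2) for a in csss_half_sine(f)])
```

Under these assumptions, the six channels receive 11, 7, 4, 2, 1, and 1 pulses, respectively, which agrees with the statement that the sequences of Channels 5 and 6 consist of a single pulse.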
FIG. 3 illustrates a typical signal processing implementation of the FSP coding strategy. A Preprocessor Filter Bank 301 processes an input sound signal to generate band pass signals that each represent a band pass channel defined by an associated band of audio frequencies. The output of the Preprocessor Filter Bank 301 goes to an Envelope Detector 302 that extracts band pass envelope signals reflecting the time varying amplitude of the band pass signals, which include unresolved harmonics and are modulated with the difference tones of the harmonics, mainly the fundamental frequency F0, and to a Stimulation Timing Module 303 that generates stimulation timing signals reflecting the temporal fine structure features of the band pass signals. For FSP, the Stimulation Timing Module 303 detects the negative to positive zero crossings of each band pass signal and in response starts a CSSS as a stimulation timing signal. A Pulse Generator 304 uses the band pass envelope signals and the stimulation timing signals to produce the electrode stimulation signals for the electrode contacts in the implant 305.
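The zero crossing detection attributed to the Stimulation Timing Module 303 can be illustrated with a short Python sketch. The function name, the sample-by-sample comparison, and the 16 kHz sampling rate of the test tone are hypothetical choices made for illustration; an actual implementation may, for example, interpolate between samples or add hysteresis against noise.

```python
import math


def negative_to_positive_zero_crossings(samples):
    """Return each index i where samples[i-1] < 0 and samples[i] >= 0.

    Each returned index marks a point at which, in FSP, a CSSS would be
    started for the channel that produced this band pass signal.
    """
    return [
        i
        for i in range(1, len(samples))
        if samples[i - 1] < 0.0 <= samples[i]
    ]


if __name__ == "__main__":
    fs = 16_000  # assumed sampling rate in Hz, for illustration only
    # 10 ms of a 440 Hz tone standing in for one band pass signal.
    tone = [math.sin(2.0 * math.pi * 440 * n / fs) for n in range(160)]
    print(negative_to_positive_zero_crossings(tone))
```

For a pure tone, the detected indices are spaced by roughly one period of the tone, so the rate at which sequences are started follows the frequency of the band pass signal.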
FSP and FS4 are the sole commercially available coding strategies that code the temporal fine structure information. Although they have been shown to perform significantly better than, e.g., CIS in many hearing situations, there are some other hearing situations in which no significant benefit has been found so far over CIS-like envelope-only coding strategies, in particular with regard to localization and speech understanding in noisy and reverberant conditions.
Temporal fine structure might be more affected by noise than the envelope is. It might be beneficial to use fine structure stimulation depending, for example, on the signal to noise ratio or on the dynamic reverberation ratio. In existing coding strategies, the use of the temporal fine structure is adapted in a post-surgical fitting session and is not adaptive to the signal to noise ratio.