Cochlear implant systems generally operate by receiving a sound signal, processing the received signal in order to extract information to be used as the basis for stimuli, and then generating the required stimuli for delivery by an intra-cochlear electrode array. The speech processing strategy is the process used to determine which information extracted from the sound signal is to be used as the basis for stimulation, and some of the characteristics of the stimuli to be applied.
A number of speech processing strategies introduced with multi-channel cochlear implants have used stimulation rates synchronised to the speaker's fundamental frequency (F0). Due mainly to power considerations, such strategies selected only a few electrodes, and stimulation occurred at the low F0 (fundamental frequency, or repetition) rate.
Essentially, these strategies employed a filter channel dedicated to extracting the voice pitch of the speech signal. The periodicity of the voice pitch was used to set the stimulation periodicity for two or three electrodes. A second and possibly a third channel were analysed to determine the frequency (periodicity) and amplitude (energy) within a frequency band. The periodicity extracted from the second and/or third filters was used to select which electrode was to be stimulated for the second and third channels. The periodicity of stimulation was the same for all channels and was determined from the periodicity of the output of the F0 filter. In each case the amplitude of the output signal from the corresponding filter determined the amplitude of the stimulation in a given channel.
As advances in technology allowed, higher stimulation rates became progressively available through the 1990s, and F0 synchronous stimulation was replaced with “high” rate stimulation strategies such as SPEAK and CIS, which typically stimulate at a rate in the range of 250-3000 Hz per electrode.
The SPEAK strategy, which is described in U.S. Pat. No. 5,597,380 (and which is implemented in a number of speech processors produced by Cochlear Limited), employs a larger number of analysis filters and stimulates more electrodes in each analysis period than the F0 synchronous strategy. During each analysis period, the SPEAK strategy interrogates the output of each one of an array of spectral analysis filters, and stimulation is applied only to those electrodes corresponding tonotopically to the filters with the largest amplitudes. In this case the frequency of stimulation of each individual electrode is variable, depending upon the amplitude of the signals corresponding to each electrode.
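The selection step described above can be sketched as follows. This is a minimal illustration only, not the implementation of the cited patent: the function name `select_maxima`, the number of filters, and the number of electrodes selected per analysis period are all assumptions made for the example.

```python
def select_maxima(filter_amplitudes, num_maxima):
    """Return the indices of the num_maxima filters with the largest
    amplitudes, listed in filter (tonotopic) order.

    Hypothetical helper: the name and signature are illustrative.
    """
    # Rank filter indices from largest to smallest amplitude.
    ranked = sorted(
        range(len(filter_amplitudes)),
        key=lambda ch: filter_amplitudes[ch],
        reverse=True,
    )
    # Keep the top num_maxima and restore filter order.
    return sorted(ranked[:num_maxima])


# One analysis period with six (assumed) filter outputs.
amplitudes = [0.10, 0.80, 0.30, 0.05, 0.60, 0.20]
selected = select_maxima(amplitudes, 3)
print(selected)
```

Because the selected set is recomputed in every analysis period, an electrode is stimulated only in those periods in which its filter is among the largest, which is why its stimulation frequency varies with the signal.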
The CIS strategy is described in U.S. Pat. No. 4,207,441. In this strategy there are n electrodes each coupled to one of n filters. Each electrode is stimulated once per analysis period, with an intensity corresponding to the amplitude of the corresponding filter channel. In this strategy the analysis period is predetermined and hence the frequency of stimulation for each electrode is more or less fixed.
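By way of contrast with amplitude-based selection, a fixed-rate scheme of this kind can be sketched as below. The sketch assumes that the pulses within one analysis period are spread evenly and non-overlapping in time (the interleaving implied by the name of the strategy); the function name `cis_frame`, the tuple layout and the 1 ms analysis period are assumptions for illustration only.

```python
def cis_frame(envelopes, analysis_period_s, frame_start_s=0.0):
    """Schedule one pulse per electrode within a single analysis period.

    Returns a list of (time_s, electrode_index, amplitude) tuples.
    Hypothetical helper: evenly spaced slots are an assumption.
    """
    num_channels = len(envelopes)
    slot_s = analysis_period_s / num_channels
    return [
        (frame_start_s + ch * slot_s, ch, amp)
        for ch, amp in enumerate(envelopes)
    ]


# Four (assumed) channels and a 1 ms (assumed) analysis period.
frame = cis_frame([0.2, 0.5, 0.1, 0.9], analysis_period_s=0.001)
for time_s, electrode, amplitude in frame:
    print(f"{time_s * 1e6:7.1f} us  electrode {electrode}  amplitude {amplitude}")
```

Every electrode appears exactly once per frame, so the per-electrode rate is simply the reciprocal of the analysis period, regardless of the input signal.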
More recently, PCT/AU00/00838 by the present applicant describes a strategy which extracts stimulation rates from the input signal and provides for stimulation of different electrodes at different rates (the multi rate scheme). The multi rate scheme sets the stimulation rate and amplitude for a selected electrode according to measurements of the signal characteristics in the corresponding filter band, and also describes an arbitration scheme to deal with conflicting times of stimulation.
The multi rate scheme estimates the rate of stimulation in each band by measuring intervals between positive zero crossings of the filtered signal without regard to where in absolute time the zero crossings occur. This is intended to provide periodicity information to the user. In the multi rate scheme the overall rate is limited by smoothing of the rate estimates so that the absolute timing of the events is not captured.
A further issue that is not addressed by prior speech processing strategies is that of interaural time delays for binaural listening. With the strategies designed so far, filterbank methods have been used which discard the carrier phase in each band, and preserve (at best) the envelope modulations in the band. However, for normal hearing, it is often the case that for low frequencies at least, the carrier phase differences between the two ears are an important cue. Also, with fixed sampling rate strategies such as SPEAK and CIS, asynchronous sampling even of the envelope at the two ears can introduce inconsistent level cues between the two ears.
There have been a number of studies undertaken in the area of benefits associated with binaural listening, for example:
    Bronkhorst, A. W., and Plomp, R., 1988, The effect of head-induced interaural time and level differences on speech intelligibility in noise, Journal of the Acoustical Society of America, 83, 1508-1516.
    Bronkhorst, A. W., and Plomp, R., 1989, Binaural speech intelligibility in noise for hearing-impaired listeners, Journal of the Acoustical Society of America, 86, 1374-1383.
    Carhart, R., 1965, Monaural and binaural discrimination against competing sentences, International Audiology, 4, 5-10.
    Dirks, D. D., and Wilson, R. H., 1969, The effect of spatially separated sound sources on speech intelligibility, Journal of Speech and Hearing Research, 12, 5-38.
    Hausler, R., Colburn, S., and Marr, E., 1983, Sound localization in subjects with impaired hearing, Acta Otolaryngologica, Supplement 400.
    Hawley, M. L., Litovsky, R. Y., and Colburn, H. S., 1999, Speech intelligibility and localization in a multi-source environment, Journal of the Acoustical Society of America, 105, 3564-3448.
    Licklider, J. C. R., 1948, The influence of interaural phase upon the masking of speech by white noise, Journal of the Acoustical Society of America, 20, 150-159.
    MacKeith, N. W., and Coles, R. R., 1971, Binaural advantages in hearing of speech, Journal of Laryngology and Otology, 85, 213-232.
    Peissig, J., and Kollmeier, B., 1997, Directivity of binaural noise reduction in spatial multiple noise-source arrangements for normal hearing and impaired listeners, Journal of the Acoustical Society of America, 105, 1660-1670.
    Rayleigh, L., 1907, On our perception of sound direction, Philosophical Magazine, 13, 214-232.
    Sayers, B. M., 1964, Acoustic-image lateralization judgments with binaural tones, Journal of the Acoustical Society of America, 36, 923-926.
    Searle, C. L., Braida, L. D., Davis, M. F., and Colburn, H. S., 1976, Model for auditory localization, Journal of the Acoustical Society of America, 60, 1164-1175.
    Wightman, F., and Kistler, D. J., 1992, The dominant role of low-frequency interaural time differences in sound localization, Journal of the Acoustical Society of America, 91, 1648-1661.
It is accepted that, for normal listeners, listening with two ears rather than one allows improved speech intelligibility in noise as well as a better ability to determine sound direction. Studies with both normal hearing and hearing impaired listeners have suggested that the binaural intelligibility level difference (BILD) is a function of both interaural level differences (ILD) and interaural time delays (ITD). Similarly, localisation in the horizontal plane has been shown to depend primarily on ILD and ITD cues at the two ears.
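For readers unfamiliar with the two cues, the following sketch shows one simple way ILD and ITD might be measured from a pair of ear signals. It is not drawn from the cited studies: the cross-correlation method, the function names, the 48 kHz sample rate, the seeded noise burst and the 12-sample delay are all assumptions made for illustration.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))


def ild_db(left, right):
    """Interaural level difference in dB; positive means left is louder."""
    return 20.0 * math.log10(rms(left) / rms(right))


def itd_seconds(left, right, sample_rate_hz, max_lag):
    """Interaural time delay from the peak of the cross-correlation.

    Positive means the right-ear signal trails the left-ear signal.
    """
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        start, stop = max(0, -lag), min(n, n - lag)
        score = sum(left[i] * right[i + lag] for i in range(start, stop))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate_hz


# Synthetic example: a noise burst that reaches the right ear
# 12 samples later and at half the amplitude (assumed values).
rng = random.Random(0)
sample_rate_hz = 48000
delay_samples = 12
left = [rng.gauss(0.0, 1.0) for _ in range(2000)]
right = [0.0] * delay_samples + [0.5 * v for v in left[:-delay_samples]]

print(f"ILD = {ild_db(left, right):.2f} dB")
print(f"ITD = {itd_seconds(left, right, sample_rate_hz, 40) * 1e6:.0f} us")
```

In this example the level cue is about 6 dB and the time cue is 250 microseconds, both favouring the left ear, which is the kind of combined information a binaural listener can use to lateralise a source.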
It is an object of the present invention to provide a speech processing strategy that provides improved temporal information for a cochlear implant user.