Pulsatile multi-channel cochlear implant systems generally include a cochlear implant, an external speech processor, and an external headset. The cochlear implant delivers electrical stimulation pulses to an electrode array (e.g., 22 electrodes) placed in the cochlea. The speech processor and headset transmit information and power to the cochlear implant.
The speech processor operates by receiving an incoming acoustic signal from a microphone in the headset, or from an alternative source, and extracting from this signal specific acoustic parameters. Those acoustic parameters are used to determine electrical stimulation parameters, which are encoded and transmitted to the cochlear implant via a transmitting coil in the headset, and a receiving coil forming part of the implant.
In many people who are profoundly deaf, the reason for deafness is absence of, or destruction of, the hair cells in the cochlea which transduce acoustic signals into nerve impulses. These people are thus unable to derive any benefit from conventional hearing aid systems, no matter how loud the acoustic stimulus is made, because it is not possible for nerve impulses to be generated from sound in the normal manner. Cochlear implant systems seek to bypass these hair cells by presenting electrical stimulation to the auditory nerve fibers directly, leading to the perception of sound in the brain. There have been many ways described in the past for achieving this object, running from implantation of electrodes in the cochlea connected to the outside world via a cable and connector attached to the patient's skull, to sophisticated multichannel devices communicating with an external computer via radio frequency power and data links.
The invention described herein is particularly suited for use in a prosthesis which comprises a multichannel electrode implanted into the cochlea, connected to a multichannel implanted stimulator unit, which receives power and data from an externally powered wearable speech processor, wherein the speech processing strategy is based on known psychophysical phenomena, and is customized to each individual patient by use of a diagnostic and programming unit. One example of such a prosthesis is the one shown and described in U.S. Pat. No. 4,532,930 to Crosby et al., entitled "Cochlear Implant System for an Auditory Prosthesis".
In order to best understand the invention it is necessary to be aware of some of the physiology and anatomy of human hearing, and to have a knowledge of the characteristics of the speech signal. In addition, since the hearing sensations elicited by electrical stimulation are different from those produced by acoustic stimulation in a normal hearing person, it is necessary to discuss the psychophysics of electrical stimulation of the auditory system. In a normal hearing person, sound impinges on the ear drum, as illustrated in FIG. 1, and is transmitted via a system of bones called the ossicles, which act as levers to provide amplification and acoustic impedance matching to a piston, or membrane, called the oval window, which is coupled to the cochlear chamber.
The cochlear chamber is about 35 mm long when unrolled and is divided along most of its length by a partition. This partition is called the basilar membrane. The lower chamber is called the scala tympani. An opening at the remote end of the cochlear chamber communicates between the upper and lower halves thereof. The cochlea is filled with a fluid having a viscosity of about twice that of water. The scala tympani is provided with another piston or membrane called the round window which serves to take up the displacement of the fluid.
When the oval window is acoustically driven via the ossicles, the basilar membrane is displaced by the movement of fluid in the cochlea. By the nature of its mechanical properties, the basilar membrane vibrates maximally at the remote end or apex of the cochlea for low frequencies, and near the base or oval window thereof for high frequencies. The displacement of the basilar membrane stimulates a collection of cells called the hair cells situated in a special structure on the basilar membrane. Movements of these hairs produce electrical discharges in fibers of the VIIIth nerve, or auditory nerve. Thus the nerve fibers from hair cells closest to the round window (the basal end of the cochlea) convey information about high frequency sound, and fibers more apical convey information about low frequency sound. This is referred to as the tonotopic organization of nerve fibers in the cochlea.
Hearing loss may be due to many causes, and is generally of two types. Conductive hearing loss occurs when the normal mechanical pathways for sound to reach the hair cells in the cochlea are impeded, for example by damage to the ossicles. Conductive hearing loss may often be helped by use of hearing aids, which amplify sound so that acoustic information does reach the cochlea. Some types of conductive hearing loss are also amenable to alleviation by surgical procedures.
Sensorineural hearing loss results from damage to the hair cells or nerve fibers in the cochlea. For this type of patient, conventional hearing aids will offer no improvement because the mechanisms for transducing sound energy into nerve impulses have been damaged. It is by directly stimulating the auditory nerve that this loss of function can be partially restored.
In the system described herein, and in some other cochlear implant systems in the prior art, the stimulating electrodes are surgically placed in the scala tympani, in close proximity to the basilar membrane, and currents that are passed between the electrodes result in neural stimulation in groups of nerve fibers.
The human speech production system consists of a number of resonant cavities, the oral and the nasal cavities, which may be excited by air passing through the glottis or vocal cords, causing them to vibrate. The rate of vibration is heard as the pitch of the speaker's voice and varies between about 100 and 400 Hz. The pitch of female speakers is generally higher than that of male speakers.
It is the pitch of the human voice which gives a sentence intonation, enabling the listener, for instance, to distinguish between a statement and a question, to segregate the sentences in continuous discourse, and to detect which parts are particularly stressed. This, together with the amplitude of the signal, provides the so-called prosodic information.
Speech is produced by the speaker exciting the vocal cords, and manipulating the acoustic cavities by movement of the tongue, lips and jaw to produce different sounds. Some sounds are produced with the vocal cords excited, and these are called voiced sounds. Other sounds are produced by other means, such as the passage of air between teeth and tongue, to produce unvoiced sounds. Thus the sound "Z" is a voiced sound, whereas "S" is an unvoiced sound; "B" is a voiced sound, and "P" is an unvoiced sound, etc.
The speech signal can be analyzed in several ways. One useful analysis technique is spectral analysis, whereby the speech signal is analyzed in the frequency domain, and a spectrum is considered of amplitude (and phase) versus frequency. When the cavities of the speech production system are excited, a number of spectral peaks are produced, and the frequencies and relative amplitudes of these spectral peaks vary with time.
The number of spectral peaks ranges between about three and five and these peaks are called "formants". These formants are numbered from the lowest frequency formant, conventionally called F1, to the highest frequency formants, and the voice pitch is conventionally referred to as F0. Characteristic sounds of different vowels are produced by the speaker changing the shape of the oral and nasal cavities, which has the effect of changing the frequencies and relative intensities of these formants.
In particular, it has been found that the second formant (F2) is important for conveying vowel information. For example, the vowel sounds "oo" and "ee" may be produced with identical voicings of the vocal cords, but will sound different due to different second formant characteristics.
There is of course a variety of different sounds in speech and their method of production is complex. For the purpose of understanding the invention herein however, it is sufficient to remember that there are two main types of sounds - voiced and unvoiced; and that the time course of the frequencies and amplitudes of the formants carries most of the intelligibility of the speech signal.
The term "psychophysics" is used herein to refer to the study of the perceptions elicited in patients by electrical stimulation of the auditory nerve. For stimulation at rates between 100 and 400 pulses per second, a noise is perceived which changes pitch with stimulation rate. This is such a distinct sensation that it is possible to convey a melody to a patient by its variation.
By stimulating the electrode at a rate proportional to voice pitch (F0), it is possible to convey prosodic information to the patient. This idea is used by some cochlear implant systems as the sole method of information transmission, and may be performed with a single electrode.
It is more important to convey formant information to the patient, as this contains most of the intelligibility of the speech signal. It has been discovered by psychophysical testing that just as an auditory signal which stimulates the remote end of the cochlea produces a low frequency sensation and a signal which stimulates the near end thereof produces a high frequency sensation, a similar phenomenon will be observed with electrical stimulation. The perceptions elicited by electrical stimulation at different positions inside the cochlea have been reported by the subjects as producing percepts which vary in "sharpness" or "dullness", rather than pitch as such. However, the difference in frequency perceptions between electrodes is such that formant, or spectral peak, information can be coded by selection of electrode, or site of stimulation in the cochlea.
It has been found by psychophysical testing that the range of electrical stimulation corresponding to loudness from threshold to uncomfortably loud (typically 12 dB) is smaller than the corresponding range of acoustic signals for normally hearing people (typically 100 dB).
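By way of illustration only, the compression implied by these two ranges may be sketched as follows. The sketch assumes a mapping that is linear in decibels, which is an assumption made for clarity and is not stated in the text; the function and constant names are hypothetical.

```python
# Typical ranges quoted in the text above.
ACOUSTIC_RANGE_DB = 100.0    # acoustic range for normally hearing people
ELECTRICAL_RANGE_DB = 12.0   # electrical range, threshold to uncomfortably loud


def acoustic_to_electrical_db(level_db: float) -> float:
    """Map an acoustic level (dB above threshold) onto the electrical range.

    ASSUMPTION: the mapping is linear in dB. Real systems may use a
    different, patient-specific compression function.
    """
    # Clamp the input so the output never leaves the electrical range.
    clamped = min(max(level_db, 0.0), ACOUSTIC_RANGE_DB)
    return clamped * ELECTRICAL_RANGE_DB / ACOUSTIC_RANGE_DB
```

Under this assumed mapping, each decibel of electrical stimulation spans roughly eight decibels of acoustic level, which suggests why stimulation amplitude must be controlled precisely.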
It has also been discovered through psychophysical testing that the pitch of sound perceptions due to electrical stimulation is also dependent upon frequency of stimulation, but the perceived pitch is not the same as the stimulation frequency. In particular, the highest pitch able to be perceived through the mechanism of the changing stimulation rate alone is in the order of 1 kHz, and stimulation at rates above this maximum level will not produce any increase in pitch of the perceived sound. In addition, for electrical stimulation within the cochlea, the perceived pitch depends upon electrode position. In multiple electrode systems, the perceptions due to stimulation at one electrode are not independent of the perceptions due to simultaneous stimulation of nearby electrodes. Also, the perceptual qualities of pitch, "sharpness", and loudness are not independently variable with stimulation rate, electrode position, and stimulation amplitude.
Some systems of cochlear implants in the prior art are arranged to stimulate a number of electrodes simultaneously in proportion to the energy in specific frequency bands, but this is done without reference to the perceptions due to stimulus current in nearby stimulating electrodes. The result is that there is interaction between the channels and the loudness is affected by this.
A number of attempts have heretofore been made to provide useful hearing through electrical stimulation of auditory nerve fibers, using electrodes inside or adjacent to some part of the cochlear structure. Systems using a single pair of electrodes are shown in U.S. Pat. No. 3,751,605 to Michelson and U.S. Pat. No. 3,752,939 to Bartz.
In each of these systems an external speech processing unit converts the acoustic input into a signal suitable for transmission through the skin to an implanted receiver/stimulator unit. These devices apply a continuously varying stimulus to the pair of electrodes, stimulating at least part of the population of auditory nerve fibers, and thus producing a hearing sensation.
The stimulus signal generated from a given acoustic input is different for each of these systems, and while some degree of effectiveness has been demonstrated for each, performance has varied widely across systems and also for each system between patients. Because the design of these systems has evolved empirically, and has not been based on detailed psychophysical observations, it has not been possible to determine the cause of this variability. Consequently, it has not been possible to reduce it.
An alternative approach has been to utilize the tonotopic organization of the cochlea to stimulate groups of nerve fibers, depending on the frequency spectrum of the acoustic signal. Systems using this technique are shown in U.S. Pat. No. 4,207,441 to Ricard, U.S. Pat. No. 3,449,768 to Doyle, U.S. Pat. No. 4,063,048 to Kissiah, and U.S. Pat. Nos. 4,284,856 and 4,357,497 to Hochmair et al.
The system described by Kissiah uses a set of analog filters to separate the acoustic signal into a number of frequency components, each having a predetermined frequency range within the audio spectrum. These analog signals are converted into digital pulse signals having a pulse rate equal to the frequency of the analog signal they represent, and the digital signals are used to stimulate the portion of the auditory nerve normally carrying the information in the same frequency range. Stimulation is accomplished by placing an array of spaced electrodes inside the cochlea.
The Kissiah system utilizes electrical stimulation at rates up to the limit of normal acoustic frequency range, say 10 kHz, and independent operation of each electrode. Since the maximum rate of firing of any nerve fiber is limited by physiological mechanisms to one or two kHz, and there is little perceptual difference for electrical pulse rates above 1 kHz, it may be inappropriate to stimulate at the rates suggested. No consideration is given to the interaction between the stimulus currents generated by different electrodes, which experience shows may cause considerable uncontrolled loudness variations, depending on the relative timing of stimulus presentations. Also, this system incorporates a percutaneous connector, which carries with it the associated risk of infection.
The system proposed by Doyle limits the stimulation rate for any group of fibers to a rate which would allow any fiber to respond to sequential stimuli. It utilizes a plurality of transmission channels, with each channel sending a simple composite power/data signal to a bipolar pair of electrodes. Voltage source stimulation is used in a time multiplexed fashion similar to that subsequently used by Ricard and described below, and similar uncontrolled loudness variations will occur with the suggested independent stimulation of neighboring pairs of electrodes. Further, the requirement of a number of transmission links equal to the number of electrode pairs prohibits the use of this type of system for more than a few electrodes.
The system proposed by Ricard utilizes a filter bank to analyze the acoustic signal, and a single radio link to transfer both power and data to the implanted receiver/stimulator, which presents a time-multiplexed output to sets of electrodes implanted in the cochlea. Monophasic voltage stimuli are used, with one electrode at a time being connected to a voltage source while the rest are connected to a common ground line. An attempt is made to isolate stimulus currents from one another by placing small pieces of silastic inside the scala, between electrodes. Since monophasic voltage stimuli are used, and the electrodes are returned to the common reference level after presentation of each stimulus, the capacitive nature of the electrode/electrolyte interface will cause some current to flow for a few hundred microseconds after the driving voltage has been returned to zero. This will reduce the net transfer of charge (and thus electrode corrosion), but this charge recovery phase is now temporally overlapped with the following stimulus or stimuli. Any spatial overlap of these stimuli would then cause uncontrolled loudness variations.
In the Hochmair et al. patents a plurality of carrier signals are modulated by pulses corresponding to signals in audio frequency bands. The carrier signals are transmitted to a receiver having independent channels for receiving and demodulating the transmitted signals. The detected pulses are applied to electrodes on a cochlear implant, with the electrodes selectively positioned in the cochlea to stimulate regions having a desired frequency response. The pulses have a frequency which corresponds to the frequency of signals in an audio band and a pulse width which corresponds to the amplitude of signals in the audio band.
U.S. Pat. No. 4,267,410 to Forster et al. describes a system which utilizes biphasic current stimuli of predetermined duration, providing a good temporal control of both stimulating and recovery phases. However, the use of fixed pulse duration prohibits variation of this parameter which may be required by physiological variations between patients. Further, the data transmission system described in this system severely limits the number of pulse rates available for constant rate stimulation.
U.S. Pat. No. 4,593,696 to Hochmair et al. describes a system in which at least one analog electrical signal is applied to implanted electrodes in a patient, and at least one pulsatile signal is applied to implanted electrodes. The analog signal represents a speech signal, and the pulsatile signal provides specific speech features such as formant frequency and pitch frequency.
U.S. Pat. No. 4,515,158 to Patrick et al. describes a system in which sets of electrical currents are applied to selected electrodes in an implanted electrode array. An incoming speech signal is processed to generate an electrical input corresponding to the received speech signal, and electrical signals characterizing acoustic features of the speech signal are generated from the input signal. Programmable means obtains and stores data from the electrical signals and establishes sets of electric stimuli to be applied to the electrode array, and instruction signals are produced for controlling the sequential application of pulse stimuli to the electrodes at a rate derived from the voicing frequency of the speech signal for voiced utterances and at an independent rate for unvoiced utterances.
The state of the art over which the present invention represents an improvement is perhaps best exemplified by the aforesaid U.S. Pat. No. 4,532,930 to Crosby et al., entitled "Cochlear Implant System for an Auditory Prosthesis". The subject matter of said Crosby et al. patent is hereby incorporated herein by reference. The Crosby et al. patent describes a cochlear implant system in which an electrode array comprising multiple platinum ring electrodes in a silastic carrier is implanted in the cochlea of the ear. The electrode array is connected to a multi-channel receiver-stimulating unit, containing a semiconductor integrated circuit and other components, which is implanted in the patient adjacent the ear. The receiver-stimulator unit receives data information and power through a tuned coil via an inductive link with a patient-wearable external speech processor. The speech processor includes an integrated circuit and various components which are configured or mapped to emit data signals from an Erasable Programmable Read Only Memory (EPROM). The EPROM is programmed to suit each patient's electrical stimulation perceptions, which are determined through testing of the patient and his implanted stimulator/electrode. The testing is performed using a diagnostic and programming unit (DPU) that is connected to the speech processor by an interface unit.
The Crosby et al. system allows use of various speech processing strategies, including dominant spectral peak and voice pitch, so as to include voiced sounds, unvoiced glottal sounds and prosodic information. The speech processing strategy employed is based on known psychophysical phenomena, and is customized to each individual patient by the use of the diagnostic and programming unit. Biphasic pulses are supplied to various combinations of the electrodes by a switch controlled current sink in various modes of operation. Transmission of data is by a series of discrete data bursts which represent the chosen electrode(s), the electrode mode configuration, the stimulating current, and biphasic pulse duration.
Each patient will have different perceptions resulting from electrical stimulation of the cochlea. In particular, the strength of stimulation required to elicit auditory perceptions of the same loudness may be different from patient to patient, and from electrode to electrode for the same patient. Patients also may differ in their abilities to perceive pitch changes from electrode to electrode.
The speech processor accommodates differences in psychophysical perceptions between patients and compensates for the differences between electrodes in the same patient. Taking into account each individual's psychophysical responses, the speech processor encodes acoustic information with respect to stimulation levels, electrode frequency boundaries, and other parameters that will evoke appropriate auditory perceptions. The psychophysical information used to determine such stimulation parameters from acoustic signals is referred to as a MAP and is stored in a random access memory (RAM) inside the speech processor. An audiologist generates and "fine tunes" each patient's MAP using a diagnostic and programming system (DPS). The DPS is used to administer appropriate tests, present controlled stimuli, and confirm and record test results.
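By way of illustration only, a MAP of the kind described above may be pictured as a table holding, for each electrode, a frequency band and a pair of stimulation limits. The following is a minimal sketch under stated assumptions: the field names, the arbitrary level units, and the linear interpolation between the two limits are all hypothetical and are not taken from any actual speech processor.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ElectrodeMap:
    """One hypothetical MAP entry for a single electrode."""
    low_hz: float          # lower frequency boundary assigned to the electrode
    high_hz: float         # upper frequency boundary (exclusive)
    threshold_level: int   # level at which the patient first hears a sound
    max_level: int         # loudest level the patient accepts as comfortable


def find_electrode(patient_map: List[ElectrodeMap], freq_hz: float) -> Optional[int]:
    """Return the index of the electrode whose band contains freq_hz, if any."""
    for index, entry in enumerate(patient_map):
        if entry.low_hz <= freq_hz < entry.high_hz:
            return index
    return None


def stimulation_level(entry: ElectrodeMap, loudness: float) -> int:
    """Scale a loudness fraction (0..1) into this electrode's own level range.

    ASSUMPTION: linear interpolation between threshold and maximum.
    """
    x = min(max(loudness, 0.0), 1.0)
    return round(entry.threshold_level + x * (entry.max_level - entry.threshold_level))
```

Because each entry carries its own limits, the same loudness fraction may produce different stimulation levels on different electrodes, which is one way the differences between electrodes in the same patient could be compensated.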
The multi-electrode cochlear prosthesis has been used successfully by profoundly deaf patients for a number of years and is a part of everyday life for many people in various countries around the world. The implanted part of the prosthesis has remained relatively unchanged except for design changes, such as those made to reduce the overall thickness of the device and to incorporate an implanted magnet to eliminate the need for wire headsets.
The external speech processor has undergone significant changes since early versions of the prosthesis. The speech coding scheme used by early patients presented three acoustic features of speech to implant users. These were amplitude, presented as current level of electrical stimulation; fundamental frequency or voice pitch, presented as rate of pulsatile stimulation; and the second formant frequency, represented by the position of the stimulating electrode pair. This coding scheme (F0F2) provided enough information for profoundly postlinguistically deafened adults to show substantial improvements in their perception of speech.
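By way of illustration only, the three mappings of the F0F2 scheme (amplitude to current level, voice pitch to pulse rate, second formant to electrode position) may be sketched as follows. The band edges, the level limits, the five-electrode array, and the electrode numbering (index 0 taken as the most apical site) are hypothetical values chosen for the example and are not taken from the scheme itself.

```python
from bisect import bisect_right
from dataclasses import dataclass

# HYPOTHETICAL band edges in Hz dividing the F2 range among five electrodes,
# ordered from the apical (low frequency) end to the basal (high frequency) end.
F2_BAND_EDGES_HZ = [1000.0, 1400.0, 1900.0, 2500.0]

# HYPOTHETICAL stimulation limits in arbitrary current-level units.
THRESHOLD_LEVEL = 100
MAX_LEVEL = 160


@dataclass(frozen=True)
class Stimulus:
    electrode: int      # site of stimulation, selected by F2
    rate_pps: float     # pulses per second, set by F0
    current_level: int  # set by the signal amplitude


def encode_f0f2(amplitude: float, f0_hz: float, f2_hz: float) -> Stimulus:
    """Encode one voiced analysis frame; amplitude is a fraction in 0..1."""
    # F2 selects the electrode: count how many band edges lie at or below it.
    electrode = bisect_right(F2_BAND_EDGES_HZ, f2_hz)
    # Amplitude sets the current level within the assumed limits.
    a = min(max(amplitude, 0.0), 1.0)
    level = round(THRESHOLD_LEVEL + a * (MAX_LEVEL - THRESHOLD_LEVEL))
    # F0 sets the pulse rate directly, one pulse per voice-pitch period.
    return Stimulus(electrode=electrode, rate_pps=f0_hz, current_level=level)
```

For example, a frame with an amplitude of 0.5, a voice pitch of 120 Hz, and a second formant of 800 Hz would, under these assumed values, be presented on electrode 0 at 120 pulses per second and a current level of 130.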
The early coding scheme progressed naturally to a later coding scheme in which additional spectral information is presented. In this scheme a second stimulating electrode pair was added, representing the first formant of speech. The new scheme (F0F1F2) showed improved performance for adult patients in all areas of speech perception.
Despite the success of speech processors using the F0F1F2 scheme over the last few years, a number of problems have been identified. For example, patients who perform well in quiet conditions can have significant problems when there is a moderate level of background noise. Also, the F0F1F2 scheme codes frequencies up to about 3500 Hz; however, many phonemes and environmental sounds have a high proportion of their energy above this range, making them inaudible to the implant user in some cases.
It is, therefore, a primary object of the present invention to provide an improved cochlear implant system which overcomes various of the problems associated with earlier cochlear implant systems.
Another object of the invention is to provide, in a cochlear implant system, an improved speech coding scheme in which all of the information available in earlier coding schemes is retained and additional information from additional high frequency band pass filters is provided.
Further objects or advantages of this invention will become apparent as the following description proceeds.