(1) Field of the Invention
The present invention relates to the field of sound-producing prosthetic devices for use by laryngectomized patients and more particularly to the reproduction by electronic-type artificial larynxes of natural sounding voice tones.
(2) Description of the Prior Art
It is frequently necessary to remove the larynx, or so-called voice box, by surgical procedures due to malignant growths in the larynx itself or adjacent tissues. With removal of the larynx and its component vocal cords, access of the trachea to air is necessary and a hole, or stoma, is, therefore, formed in the neck and the trachea is sewn directly to the tissues surrounding such stoma. Thereafter, the patient breathes through the stoma, bypassing the upper respiratory tract, including the mouth. Since the larynx normally produces voice tones or sounds by vibration of its membranes or vocal cords as air passes between these membranes while they are held in a tensed condition, a laryngectomy patient has no further ability to produce natural voice tones. In addition, no air can pass from the lungs through the mouth to be formed into various articulated speech tones, or even whispered speech, which does not normally require the use of a voice tone. Consequently, no speech whatsoever is possible for a laryngectomy patient, except so-called esophageal speech, which relies upon air passed from the stomach in a sort of belch after swallowing air. Such esophageal derived air can be formed into words, lacking, however, the normal voice tone, although vibration of the esophagus does produce a somewhat different tone, rather akin to belching. Such esophageal tone can be articulated into words by the lips, tongue and teeth. Esophageal speech is difficult and time-consuming, both to learn and to practice, as well as difficult for others to understand without considerable experience and interpretation. Consequently, various artificial tone-producing devices have been developed in the past to provide a voice tone for articulation by the lips, tongue and teeth into speech either prior to a patient being able to learn esophageal speech or as a complete substitute for esophageal speech in public communication.
A large number of devices for producing artificial voice tones which can be shaped by a post-laryngectomy patient into recognizable speech have been produced in the past. Such devices are generally referred to as artificial larynxes. For example, one of the very early artificial larynxes is disclosed in U.S. Pat. No. 1,901,433 issued Mar. 14, 1933 to G. W. Burchett. The artificial voice tone of Burchett was produced by a hand-held electrical tone generator which conducted the tone through a tube into the mouth of the laryngectomy patient where it could, by movement of the mouth structures, be formed into at least somewhat intelligible voice sounds. The pitch of the sound could be controlled by means of a button on the tone generator. The Burchett device, like many later devices, suffered from poor reproducibility of voice sounds and was objectionable because of its high visibility, which distracted others from trying to understand what was being said and embarrassed the user. Several more sophisticated devices of the same nature were developed by the Bell Telephone Laboratories and disclosed in U.S. Pat. Nos. 2,041,487 and 2,056,295 in 1936 and U.S. Pat. No. 2,202,467 in 1940 to Riesz.
In 1942 the use of a receiver held in the mouth to produce a potential voice tone was disclosed by G. M. Wright in U.S. Pat. No. 2,273,077. The receiver was connected to an external amplifier by an electrical wire. The Wright device also could use a tone transmitting tube and was used primarily for reproducing unusual voice tones, for example, in broadcasting and motion pictures, by persons having normal speech apparatus.
In 1942 G. M. Wright also patented one of the first artificial larynxes to apply voice tones against the throat in U.S. Pat. No. 2,273,078. Since that time, the Wright device and adaptations thereof have become one of the prime artificial larynxes in use. The Wright device included high and low frequency transducers to apply tones through the skin. Many of these devices have been hand held, as shown in U.S. Pat. No. 3,072,745 to H. L. Barney, assigned to Bell Telephone Laboratories. Barney controlled the pitch of the tone produced by use of a rheostat over a spectrum said to approach natural speech tones.
In 1958, H. K. Cooper patented the use of a dental prosthesis or denture to contain an emitter or speaker, as shown in U.S. Pat. No. 2,862,209. Cooper formed artificial voice tones in his emitter at one side of the denture and conducted such tones through one or more resonating passages in the denture, at least one of which opened to a wider resonant chamber. It was disclosed by Cooper that higher frequency sound would tend to follow the narrower passage while lower frequencies would tend to follow the wider passage or resonating chamber. Conductors extended from the denture in the mouth to an energy source or controller carried in the user's pocket. The use of an electroacoustical transducer mounted in a denture with energy to operate such transducer being supplied through a wire leading into the mouth was also discussed by Barney, mentioned above, in his application filed in 1959 and issued as a patent in 1963.
Also in 1963, the use of a mouth controlled switch for an artificial larynx was disclosed in U.S. Pat. No. 3,084,221 to H. K. Cooper et al. This was followed in 1970 by U.S. Pat. No. 3,508,000 to C. M. Snyder who housed pressure transducers and the like in the mouth in artificial tooth structures.
U.S. Pat. No. 3,524,932, issued also in 1970 to F. F. Stucki and assigned to Lockheed Aircraft, disclosed the use of a number of transducers in the mouth for activation by the mouth structures during speech movements of the mouth parts. Small transmitters may be used to conduct these signals from the mouth to a receiver.
The use of rather sophisticated pitch and wave forms having the characteristics of a damped sinusoid closely approximating, it is said, the wave forms of normal speech, including numerous harmonics, is disclosed in U.S. Pat. No. 3,914,550 issued Oct. 21, 1975 to G. I. Cardwell.
U.S. Pat. No. 4,473,905 issued in 1984 to Katz et al. as well as several subsequent patents also assigned to Thomas Jefferson University disclose the use of sophisticated self-contained intraoral artificial larynxes or larynges comprising a power source, on-off controls, low power circuitry with acoustic and electrical amplifiers and small loudspeakers. Switches for control of the voice tones provided by the devices are controlled by the tongue of the user. A similar device is disclosed by U.S. Pat. No. 4,706,292 issued in 1987 to W. L. Torgeson. Both these devices can produce sound which, after passing through the vocal tract of the user, is understandable and significantly better than possible with previously available devices. While both devices, therefore, are improvements over the previous state-of-the-art, both suffer from the fact that the speech sound produced, while largely understandable, does not have a natural sound and is, therefore, frequently an embarrassment both to the speaker and to the one spoken to.
Finally, U.S. Pat. No. 4,571,739 issued in 1986 to J. A. Resnick discloses the provision of an artificial larynx in which the individual self-contained elements are contained not in the base of a denture, but in artificial teeth attached to such denture, thus allegedly providing additional room permitting larger elements and better sound reproduction or articulation. Switches for the unit are positioned upon the backs of, preferably, the front teeth on the unit or those of the user himself or herself. The speech sounds produced, however, still leave much to be desired.
There has been and continues to be, therefore, a need for an unobtrusive artificial larynx which can produce more natural sounding speech tones, the use of which can be learned quickly and easily and which can produce reproducible speech sounds more reliably and easily.
The sound of the human voice starts with a so-called glottal pulse formed as the larynx opens and closes releasing a puff of air. This puff of air can be described in physical terms as the number of cubic centimeters of air passing through the open larynx as a function of time, or the volume velocity as a function of time. Various Fourier components of the volume velocity contribute to the resulting sound wave which is eventually radiated from the mouth and nose. The glottal pulse sound wave is shaped by the vocal tract formed by the mouth, tongue, lips, teeth and nasal tract. This process is described in the book Speech Analysis, Synthesis and Perception, Flanagan, J. L. (1972) (Springer, New York) and more recently, in the article by Dennis and Laura Klatt entitled "Analysis, synthesis and perception of voice quality variations among female and male talkers" which appeared in the "Journal of the Acoustical Society of America" in February 1990, pages 820-857. To create natural sounding speech, therefore, an artificial larynx should produce a glottal pulse which duplicates the elements of the natural glottal pulse which contribute to the formation of audible sound.
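By way of illustration only, the relationship between a glottal pulse and its Fourier components can be sketched as follows. The raised-cosine pulse shape, the 60% open fraction and the function names are assumptions chosen for simplicity; they are not taken from the cited references or from any particular device.

```python
import math

def glottal_pulse(n_samples=200, open_fraction=0.6):
    """One period of an illustrative volume-velocity pulse (arbitrary units).

    Assumed shape: a raised-cosine puff while the glottis is open,
    followed by zero flow while it is closed.
    """
    n_open = round(n_samples * open_fraction)
    pulse = []
    for n in range(n_samples):
        if n < n_open:
            pulse.append(0.5 * (1.0 - math.cos(2.0 * math.pi * n / n_open)))
        else:
            pulse.append(0.0)
    return pulse

def harmonic_magnitudes(period, n_harmonics=10):
    """Magnitudes of Fourier components k = 1..n_harmonics of one period."""
    size = len(period)
    mags = []
    for k in range(1, n_harmonics + 1):
        re = sum(x * math.cos(2.0 * math.pi * k * n / size)
                 for n, x in enumerate(period))
        im = sum(x * math.sin(2.0 * math.pi * k * n / size)
                 for n, x in enumerate(period))
        mags.append(2.0 * math.hypot(re, im) / size)
    return mags

pulse = glottal_pulse()
mags = harmonic_magnitudes(pulse)
```

For this assumed pulse the first few harmonics fall off steadily in magnitude, which is the kind of harmonic structure the vocal tract then shapes into speech sounds.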
The second element which contributes to natural sounding speech is the ability of the speaker to start and stop the glottal pulsing under his own control. The control of starting and stopping of sounds is as important to intelligible speech as is the continuation of the sounds. This starting and stopping of speech can occur at rates as fast as five times per second or 200 milliseconds per start/stop cycle. Therefore, the second requirement in producing natural speech sounds is to provide a means whereby the user of the artificial larynx can start and stop the sounds up to five times per second.
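The timing requirement above can be checked with a short sketch. The 8000 Hz sample rate, the 50% duty cycle and the function names are illustrative assumptions only.

```python
def gate_mask(duration_s, cycles_per_s, sample_rate_hz=8000, duty=0.5):
    """Return 1 while sound is on and 0 while it is off.

    Models evenly spaced start/stop cycles; each cycle lasts
    1 / cycles_per_s seconds (200 ms at five cycles per second).
    """
    cycle_samples = round(sample_rate_hz / cycles_per_s)
    on_samples = round(cycle_samples * duty)
    total = round(duration_s * sample_rate_hz)
    return [1 if (n % cycle_samples) < on_samples else 0 for n in range(total)]

def count_onsets(mask):
    """Count off-to-on transitions, including a mask that starts on."""
    onsets = 0
    previous = 0
    for value in mask:
        if value == 1 and previous == 0:
            onsets += 1
        previous = value
    return onsets

mask = gate_mask(duration_s=1.0, cycles_per_s=5)
cycle_ms = 1000.0 / 5  # duration of one start/stop cycle in milliseconds
```

Multiplying a voice-tone sample stream by such a mask would start and stop the tone five times in one second, which is the fastest rate the control means needs to follow.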
In addition to being able to initiate vibration of the larynx to form a voice tone, a natural speaker is able to alter the constriction of or the tension in the larynx to change the frequency of the basic repetition of the glottal pulse. The vibratory opening and closing can range for a human male from 75 to 250 times per second in normal speech. Typical male repetition rates are about 125 times per second. Human females typically have a normal frequency almost twice as high. If one takes singing into account, the range of vibratory frequencies is even wider. Therefore, the third requirement for producing natural speech is to provide a means by which the user of the artificial larynx can change the frequency of vibration in a range typically between 75 and 250 cycles per second for a male and from 150 to 500 cycles per second for a female.
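As a minimal sketch of such a frequency control, a normalized control value between 0 and 1 can be mapped onto the ranges stated above. The logarithmic mapping, the control scale and the names used are assumptions for illustration, not a description of any particular control means.

```python
# Frequency ranges in cycles per second, taken from the text above.
PITCH_RANGES_HZ = {"male": (75.0, 250.0), "female": (150.0, 500.0)}

def control_to_pitch_hz(control, voice="male"):
    """Map a control value in [0, 1] to a glottal repetition rate.

    Assumption: a logarithmic (equal-ratio) mapping, so that equal
    control steps give equal musical intervals. Values outside [0, 1]
    are clamped to the ends of the range.
    """
    low, high = PITCH_RANGES_HZ[voice]
    clamped = min(1.0, max(0.0, control))
    return low * (high / low) ** clamped

def pulse_period_ms(pitch_hz):
    """Time between successive glottal pulses, in milliseconds."""
    return 1000.0 / pitch_hz
```

At the typical male rate of 125 pulses per second, `pulse_period_ms(125.0)` gives 8 milliseconds between pulses.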
A third element of natural speech is the inclusion of the fricative sounds caused by air rushing past a closing or almost closed constriction in the vocal tract. For example, the "s" sound is produced by placing the tongue behind the teeth and forcing air past the closure producing high frequency random noise. The "f" sound is produced by forcing air past a constriction formed by the lips. It has been observed that although laryngectomees do not breathe through their mouths, they can collect air in the mouth and throat cavity and expel it by muscle contraction. This limited source of air flow is sufficient for forming fricative sounds in speech, and an artificial fricative source has therefore not been included in the current model of the artificial larynx. Further studies are under way to see if the inclusion of an artificial fricative makes learning to use the larynx easier or more difficult.
A fourth element of natural speech is the ability to change loudness and create emphasis on various tones. Changing the loudness of the speech sound realistically requires changing the frequency structure of the sound as well as increasing the absolute level of the original sound, i.e., it is more involved than merely linearly changing the sound level as would occur if one turned up the volume control knob on a radio. Accordingly, it would be desirable for an artificial larynx to incorporate a means for rapidly and independently changing the loudness and tonal quality of the sound in a realistic sounding manner.
A fifth element of natural speech is the ability to change tonal quality. Tonal quality is related to the ability to differentiate between the harmonic content of two sounds produced at the same fundamental frequency. Some speakers sound mellow and others sound shrill even though the fundamental frequency of the tones is the same. This difference in sound is due to differences in the harmonic content of the tones.
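The distinction between level and harmonic content drawn in the last two elements can be illustrated with a short sketch. The spectral-tilt model (harmonic amplitudes falling by a fixed number of decibels per octave), the particular tilt values and the names below are assumptions for illustration only.

```python
import math

def harmonic_amplitudes(level, tilt_db_per_octave, n_harmonics=8):
    """Amplitudes of harmonics 1..n_harmonics of a tone.

    Assumed model: each doubling of harmonic number lowers the
    amplitude by tilt_db_per_octave decibels.
    """
    return [level * 10.0 ** (-tilt_db_per_octave * math.log2(k) / 20.0)
            for k in range(1, n_harmonics + 1)]

def brightness(amps):
    """Ratio of the highest harmonic to the fundamental."""
    return amps[-1] / amps[0]

# Same fundamental frequency and level, different harmonic content.
mellow = harmonic_amplitudes(level=1.0, tilt_db_per_octave=12.0)
shrill = harmonic_amplitudes(level=1.0, tilt_db_per_octave=6.0)

# A radio-style volume control scales every harmonic equally ...
volume_knob = [2.0 * a for a in mellow]
# ... whereas this sketch of a louder voice also changes the tilt.
louder_voice = harmonic_amplitudes(level=2.0, tilt_db_per_octave=6.0)
```

Here `brightness(volume_knob)` equals `brightness(mellow)`, because a linear gain leaves the harmonic balance untouched, while `louder_voice` has relatively stronger upper harmonics.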
A sixth element of natural sounding speech is the inclusion of random noise in the higher frequency regions of speech sound production. This aspiration noise is different from the formation of fricatives in that the constriction is formed at the larynx itself and not a separate place in the vocal tract, and it accompanies normal vowel production. Accordingly, it would be desirable for an artificial larynx to include means for producing random noise such as aspiration noise.
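One simple way to approximate such aspiration noise is to add high-pass filtered random noise to the voiced signal. The first-difference filter, the noise level and the names below are assumptions made for this sketch, not a description of how any device produces the noise.

```python
import random

def high_pass(samples):
    """First-difference filter y[n] = x[n] - x[n-1].

    Attenuates low frequencies, so the output noise is concentrated
    in the higher frequency region.
    """
    out = []
    previous = 0.0
    for x in samples:
        out.append(x - previous)
        previous = x
    return out

def aspiration_noise(n_samples, level=0.05, seed=0):
    """High-frequency-weighted random noise at a small assumed level."""
    rng = random.Random(seed)
    white = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    return [level * v for v in high_pass(white)]

def add_aspiration(voiced, level=0.05, seed=0):
    """Mix aspiration noise into a list of voiced-signal samples."""
    noise = aspiration_noise(len(voiced), level, seed)
    return [v + n for v, n in zip(voiced, noise)]
```

The seed parameter only makes the sketch repeatable; a practical noise source would not need to be reproducible.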
A seventh element of natural speech is the slight random variation of the basic repetition rate of the fundamental tones. Studies have indicated that a pseudorandom variation of as much as 0.5% in the frequency of the tones accompanies natural speech, and the inclusion of this random frequency variation reduces the "mechanical" nature of speech produced by an artificial larynx.
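A minimal sketch of such a variation follows, assuming a uniform pseudorandom perturbation applied independently to each pulse. The uniform distribution, the seed and the function name are illustrative assumptions; only the 0.5% bound comes from the text above.

```python
import random

def jittered_periods(pitch_hz, n_pulses, max_jitter=0.005, seed=0):
    """Pulse-to-pulse periods, in seconds, with a small frequency variation.

    Each pulse uses a frequency perturbed by at most max_jitter
    (0.005 corresponds to 0.5%) around the nominal pitch.
    """
    rng = random.Random(seed)
    periods = []
    for _ in range(n_pulses):
        frequency = pitch_hz * (1.0 + rng.uniform(-max_jitter, max_jitter))
        periods.append(1.0 / frequency)
    return periods

periods = jittered_periods(pitch_hz=125.0, n_pulses=20)
```

Scheduling each glottal pulse from these periods, instead of from a fixed 8 millisecond interval, would give the slightly irregular repetition rate described above.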
An eighth element of natural speech is the occasional absence of pulses as a speaker talks at lower volume or lower frequency, typically at the ending of phrases or sentences. Often every second pulse is absent or reduced in amplitude. This effect is called diplophonia. A natural sounding voice, therefore, should include a means of duplicating, under user control, some diplophonic sounds.
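The diplophonic effect can be sketched as a per-pulse amplitude pattern in which every second pulse is reduced or dropped. The function name, the `reduction` parameter and the `enabled` flag standing in for user control are assumptions for illustration.

```python
def diplophonic_amplitudes(n_pulses, reduction=1.0, enabled=True):
    """Amplitude scale for each successive glottal pulse.

    Every second pulse is scaled by (1 - reduction); a reduction of
    1.0 removes it entirely. With enabled=False all pulses are kept
    at full amplitude.
    """
    if not enabled:
        return [1.0] * n_pulses
    return [1.0 if i % 2 == 0 else 1.0 - reduction for i in range(n_pulses)]
```

For example, `diplophonic_amplitudes(4, reduction=0.5)` yields alternating full and half amplitude pulses.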
A ninth element of importance in human speech is the ability to whisper. Although this mode of speaking is different in many respects from normal speech, it still presents an important mode of speech which can assume some importance to the speaker.
Although the foregoing elements of natural speech have been long understood, none of the above cited prior references addresses these important issues. As indicated above, therefore, there has been a need for an artificial larynx that will provide more natural sounding speech. More particularly, there has been a need for an artificial larynx that both addresses and solves all or at least most of these problems in a commercially acceptable manner which allows the creation of an artificial larynx capable of producing natural speech for the first time.