1. Field of the Invention
The present invention relates generally to the field of esophageal speech, and more particularly, to a method for enhancing the clarity of esophageal speech.
2. Description of Related Art
Persons who have had laryngectomies have several options for the restoration of speech, none of which has proven to be completely satisfactory. One relatively successful method, esophageal speech, requires speakers to insufflate, or inject air into, the esophagus. This method is discussed in the article "Similarities Between Glossopharyngeal Breathing And Injection Methods of Air Intake for Esophageal Speech," Weinberg, B. & Bosna, J. F., J. Speech Hear Disord, 35: 25-32, 1970, herein incorporated by reference. Esophageal speech is frequently accompanied by an undesired audible injection noise, sometimes referred to as an "injection gulp." The undesirable effect of the injection gulp is magnified because esophageal speakers generally have low vocal intensity and therefore require some form of external amplification, which amplifies the injection noise along with the speech. A further discussion of these effects may be found in the article "A Comparative Acoustic Study of Normal, Esophageal, and Tracheoesophageal Speech Production," Robbins, J., Fisher, H. B., Blom, E. C., and Singer, M. I., J. Speech Hear Res, 49: 202-210, 1984, herein incorporated by reference. The audible injection noise is undesirable for at least two reasons. First, listeners and speakers find the noise objectionable. Second, in some speakers the injection noise can be mistaken for a speech segment, which diminishes the intelligibility of the speaker's voice.
Considerable work has been undertaken to enhance certain aspects of esophageal speech. Examples of these techniques are discussed in "Replacing Tracheoesophageal Voicing Sources Using LPC Synthesis," Qi, Y., J. Acoust. Soc. Am., 88: 1228-1235, and in "Enhancement of Female Esophageal and Tracheoesophageal Speech," Qi, Y., Weinberg, B. and Bi, N., J. Acoust. Soc. Am., 98: 2461-2465, both herein incorporated by reference. Although considerable work has been done in improving esophageal speech, the problem of eliminating injection noise has not been successfully addressed by the above-mentioned prior art.
One solution is disclosed by U.S. patent application Ser. No. 08/773,638, filed Dec. 23, 1996, entitled "ENHANCEMENT OF ESOPHAGEAL SPEECH BY INJECTION NOISE REJECTION." This application is commonly assigned to the assignee of the present invention. It discloses a method of eliminating the undesirable auditory effects associated with esophageal speech. Injection noise and silence are detected in an input speech signal, and an external amplifier is switched on or off based on the detected injection noise or silence. The input speech signal is digitized and a first copy of the digitized signal is preemphasized. After the input speech signal is preemphasized, a predetermined number of Mel-frequency cepstral coefficients (MFCCs) and difference cepstra are calculated for each window of the speech signal. A measure of signal energy and a measure of the rate of change of the signal energy are also computed.
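The preemphasis and energy measures described above can be illustrated with a minimal sketch. It is not taken from the referenced application: the preemphasis coefficient `alpha`, the frame length and hop, the use of log energy, and the first-difference estimate of the rate of change are all assumptions chosen for illustration, and the function names are hypothetical. The MFCC and difference-cepstra computations are omitted.

```python
import math


def preemphasize(samples, alpha=0.95):
    """First-order preemphasis: y[n] = x[n] - alpha * x[n-1].

    alpha=0.95 is an assumed, commonly used value, not one stated in the text.
    """
    if not samples:
        return []
    out = [samples[0]]
    for n in range(1, len(samples)):
        out.append(samples[n] - alpha * samples[n - 1])
    return out


def frame_signal(samples, frame_len, hop):
    """Split the signal into overlapping analysis windows (assumed framing)."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames


def log_energy(frame, floor=1e-10):
    """Log of the summed squared amplitude; the floor avoids log(0) on silence."""
    return math.log(max(sum(x * x for x in frame), floor))


def delta(values):
    """Rate of change as a simple first difference between successive frames."""
    return [0.0] + [values[i] - values[i - 1] for i in range(1, len(values))]
```

In such a sketch, each window would contribute one energy value and one energy-difference value to the per-window measurements.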
A second copy of the digitized input speech signal is processed using amplitude summation or by differencing a center-clipped signal. The measures of signal energy, rate of change of the signal energy, the Mel coefficients, difference cepstra, and either the amplitude summation value or the differenced value are combined to form an observation vector. Hidden Markov Model (HMM) based decoding is used on the observation vector to detect the occurrence of injection noise or silence. A gain switch on an external speech amplifier is turned on after an occurrence of injection noise and remains on for the duration of speech. The amplifier is turned off when an occurrence of silence is detected.
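The second-copy processing and the gain-switch behavior described above can be sketched as follows. This is an illustrative reading only, not the referenced application's implementation: the clipping threshold, the per-window labels "injection", "speech", and "silence" (standing in for the output of the HMM decoder, which is not shown), and all function names are hypothetical.

```python
def center_clip(samples, threshold):
    """Zero out samples within +/- threshold and shift the rest toward zero."""
    out = []
    for x in samples:
        if x > threshold:
            out.append(x - threshold)
        elif x < -threshold:
            out.append(x + threshold)
        else:
            out.append(0.0)
    return out


def amplitude_sum(frame):
    """Sum of absolute sample amplitudes within one window."""
    return sum(abs(x) for x in frame)


def gain_schedule(labels):
    """Return one on/off gain decision per window.

    Assumed rule: the gain turns on at the first window following injection
    noise, stays on through speech, and turns off when silence is detected.
    """
    gain_on = False
    prev = None
    schedule = []
    for label in labels:
        if label == "silence":
            gain_on = False
        elif prev == "injection" and label != "injection":
            gain_on = True
        schedule.append(gain_on)
        prev = label
    return schedule
```

Under this assumed rule, the injection noise itself is never amplified, and speech that is not preceded by a detected injection leaves the gain off.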
The present invention is an improved and unique method for detecting injection noise and silence in esophageal speech, and amplifying only the desired speech.