1. Field of the Invention
The invention relates to a method and device for the diagnosis and treatment of speech disorders and more particularly to the dynamic measurement of the functioning of the velum in the control of nasality during speech.
2. Description of the Related Technology
A. Velar control and oronasal valving in speech.
During speech or singing, it is necessary to open and close the passageway connecting the oral pharynx with the nasal pharynx, depending on the specific speech sounds to be produced. This is accomplished by lowering and raising, respectively, the soft palate, or velum. Raising the velum puts it in contact with the posterior pharyngeal wall, to close the opening to the posterior nasal airflow system.
This oronasal (or velopharyngeal, as it is usually referred to in medical literature) passageway must be opened when producing nasal consonants, such as /m/ or /n/ in English, and is generally closed when producing consonants that require a pressure buildup in the oral cavity, such as /p/, /b/ or /s/. During vowels and sonorant consonants (such as /l/ or /r/ in English), the oronasal passageway must be closed or almost closed for a clear sound to be produced, though in some languages an appreciable oronasal opening during a vowel is occasionally required for proper pronunciation. The first vowels in the French words “français” and “manger” are examples of such nasalized vowels. In addition, vowels adjoining a nasal consonant are most often produced with some degree of nasality during at least part of the vowel, especially if the vowel is between two nasal consonants (such as the vowel in “man” in English).
There are many disorders that result in inappropriate oronasal valving, usually in the form of a failure to sufficiently close the oronasal passageway during non-nasal consonants or non-nasalized vowels. Such disorders include cleft palate and repairs of a cleft palate, hearing loss sufficient to make the nasality of a vowel not perceptible, and many neurological and developmental disorders. The effect on speech production of insufficient oronasal closure is usually separated into the ‘nasal emission’ effect that limits oral pressure buildup in those speech sounds requiring an appreciable oral pressure buildup (as /p/, /b/, /s/ or /z/) and the perceived acoustic spectral change that can be caused in vowels and sonorant consonants and is often referred to as ‘nasalization’. (See Ronald J. Baken, Ph.D., Velopharyngeal Function, in Clinical Measurement of Speech and Voice, 393 et seq. (Little Brown & Co.—College Hill Press, 1987)). The terminology used here is that suggested by Baken, supra, who also prefers to reserve the term ‘nasality’ for the resulting perceived quality of the voice.
Since the action of the velum is not easily observed and the acoustic effects of improper velar action are sometimes difficult to monitor auditorily, there is a need in the field of speech pathology for convenient and reliable systems to monitor velar action during speech, both to give the clinician a measure of such action and to provide a means of feedback for the person trying to improve velar control.
B. Previous methods for measuring velar function
Previous methods are extensively reviewed by Baken, supra (Chapter 10). The less invasive methods described by Baken, supra, generally fall under the following four method categories:
    1. Measuring the low frequency, primarily subsonic components of the airflow through the nose or through the nose and mouth simultaneously, often with a measure of the intraoral pressure. (Baken, supra, at 416-421; Calum Conner McLean, et al., An instrument for the non-invasive objective assessment of velar function during speech, Med. Eng. Phys. Vol. 19, No. 1, pp. 7-14, 1997).
    2. Placing an accelerometer (vibration detector) on the nose to detect sound passing through the nose. (Baken, supra, at 404-407).
    3. Measuring the sound (acoustic pressure waveform) emitted from the nose and mouth, respectively, usually in conjunction with the placing of a solid sound barrier against the upper lip to improve the separation of the nasal and oral sounds, with microphones placed above and below the barrier, respectively. (Baken, supra, at 401-404; Kay Elemetrics Corp. Nasometer literature).
    4. Analyzing the acoustic properties of the radiated speech to detect the acoustic properties associated with nasalization. (Baken, supra, at 398-401).
The various methods according to the present art can also generally be divided into two categories, according to the aspect of nasality being measured: (a) those that measure velar control during those consonants requiring an oral pressure buildup (as /p/, /b/, /s/ and /z/ in English), and (b) those that measure velar control during vowels and sonorant consonants. (Consonants requiring an oral pressure buildup can be further subdivided into unvoiced (as /p/ and /s/), and voiced (as /b/ or /z/). Vowels and sonorant consonants, on the other hand, are almost always voiced in non-whispered speech.) Methods in category (b), namely for measuring the nasalization of vowels and sonorant consonants, have been more difficult to implement successfully (Baken, supra, at 393).
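By way of illustration only, the division into categories (a) and (b) can be expressed as a small lookup. The following Python sketch is not part of any prior method; the names PRESSURE_CONSONANTS, SONORANT_CONSONANTS and measurement_category are hypothetical, and the symbol sets are limited to the English examples given in this description.

```python
# Hypothetical lookup, limited to the example sounds named in the text.
# Maps each pressure consonant to whether it is voiced.
PRESSURE_CONSONANTS = {"p": False, "s": False, "b": True, "z": True}
SONORANT_CONSONANTS = {"l", "r"}


def measurement_category(symbol, is_vowel=False):
    """Return (category, voiced) for a speech sound, or None if not covered.

    Category "a": consonants requiring an oral pressure buildup.
    Category "b": vowels and sonorant consonants, treated here as voiced
    (an approximation: they are almost always voiced in non-whispered speech).
    """
    if is_vowel or symbol in SONORANT_CONSONANTS:
        return ("b", True)
    if symbol in PRESSURE_CONSONANTS:
        return ("a", PRESSURE_CONSONANTS[symbol])
    return None  # sound not among the examples listed above


print(measurement_category("p"))                 # unvoiced pressure consonant
print(measurement_category("z"))                 # voiced pressure consonant
print(measurement_category("a", is_vowel=True))  # vowel
```

Sounds outside the listed examples return None rather than being guessed at, since the description does not assign them to either category.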
Each of the four method categories described above has one or more serious drawbacks.
    1. Methods measuring low frequency volume airflow can show well the oronasal valving patterns during voiced or unvoiced consonants requiring a strong oral pressure buildup (category (a)). However, because these methods rely on low frequency airflow components, during vowels and sonorant consonants they yield readings contaminated with significant low frequency artifacts due to lip and jaw motion and soft palate deflection. These methods also require a well-fitting mask over both nose and mouth or nasal plugs and an oral mask. The mask used can also cause a muffling of the voice (McLean, supra), though such muffling can be greatly reduced by use of a circumferentially vented mask (see below), or by using a mask incorporating one or more acoustically transparent diaphragms in the mask walls to allow the higher frequency components in speech to be more effectively radiated and also reduce deleterious acoustic loading of the vocal tract caused by the mask. Such a mask is described in U.S. Pat. No. 5,454,375. The principles of the circumferentially vented mask and the diaphragm mask can also be combined for minimal voice muffling in low frequency airflow measurements.
The other method categories focus on measurements of voiced sounds:
    2. Accelerometer methods generally require adhering a small accelerometer or vibration detector to the side of the nose, and yield a measurement that is highly dependent on the vowel being spoken, the voice pitch, nose geography and the consistent placement of the accelerometer.
    3. The oral/nasal sound pressure ratio methods are highly dependent on the precise geometry of the oral-nasal sound barrier used, the placement and directivity characteristics of the microphones, and the frequency range over which energy in each channel is measured. The choice of frequency range is especially problematic, since the spectral distribution in the oral and nasal channels can differ greatly, with the sound emitted from the nose consisting primarily of energy at the lower voice harmonics. Thus if too wide a bandwidth is used, such a system would be comparing the energy in mostly lower frequency voice harmonics emanating from the nose with the energy of mostly higher frequency harmonics from the mouth. For a popular commercial version of this method, the Nasometer, and its previous research version, TONAR II, this frequency range has been empirically chosen to be approximately 300 to 800 Hz (Baken, supra), presumably to both capture some of the nasal energy, which is limited to lower frequencies, and to capture the energy of the first or lowest vocal tract resonance (the first formant) for most vowels and sonorant consonants. However, since the directivity of even a directional microphone at the lower frequencies of this range is limited by the long wavelengths (approximately 3.3 feet at 300 Hz), there is necessarily some appreciable sound crossover between the oral and nasal channels (assuming reasonable proportions for the sound barrier against the upper lip). Because of the inclusion of the first formant energy in the oral signal, there is a dependence in this method on the vowel or consonant being spoken. There is also a dependence on the voice pitch, since the filter range chosen includes the strong fundamental frequency component for some values of voice pitch but not for others.
    4. In the fourth class of methods, the spectrum of the radiated pressure waveform during voiced speech is analyzed to determine the degree of nasalization. However, in attempts to do this it has been difficult to obtain meaningful quantitative results (Baken, supra). The effect of incomplete velopharyngeal closure on the spectrum of a voiced speech sound is highly variable between speech sounds and is highly dependent on the acoustic properties of the nasal passages. For example, consider the great changes in the acoustic quality of a spoken vowel produced when the nasal passages are partially occluded by nasal congestion during a cold. Thus readings for the same level of velar control could vary greatly from day-to-day, even for the same subject.
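The pitch dependence noted for the oral/nasal sound pressure ratio methods can be illustrated with a short calculation. The following Python sketch lists which voice harmonics fall inside the approximately 300 to 800 Hz band cited above. It assumes an idealized filter with sharp band edges, and the pitch values of 120, 220 and 350 Hz are illustrative assumptions rather than values taken from the cited literature; the function name harmonics_in_band is hypothetical.

```python
def harmonics_in_band(f0_hz, low_hz=300.0, high_hz=800.0):
    """Return the harmonic numbers k (k = 1 is the fundamental) for which
    k * f0_hz lies within [low_hz, high_hz].

    Assumes an ideal band with sharp edges; a real filter has gradual
    roll-off, so harmonics near the edges would be only partly attenuated.
    """
    if f0_hz <= 0:
        raise ValueError("f0_hz must be positive")
    harmonics = []
    k = 1
    while k * f0_hz <= high_hz:
        if k * f0_hz >= low_hz:
            harmonics.append(k)
        k += 1
    return harmonics


# Illustrative pitch values (assumed): low, medium and high voice pitch.
for f0 in (120.0, 220.0, 350.0):
    ks = harmonics_in_band(f0)
    status = "included" if 1 in ks else "excluded"
    print(f"f0 = {f0:.0f} Hz: harmonics {ks} in band, fundamental {status}")
```

Under these assumptions the fundamental lies below the band at 120 Hz and 220 Hz but inside it at 350 Hz, so the measured energy ratio would shift with voice pitch even for an unchanged degree of oronasal opening.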