This invention relates to an acoustic-based method of identifying and rating motor speech deterioration and the underlying pathologies of the deterioration.
Ataxia is a profound loss of muscular coordination which characterizes cerebellar-based pathology (Diener and Dichgans, 1992). A loss of coordination in ataxia may take the form of loss of balance, the inability to walk heel to toe, nystagmus, difficulty alternating sequences of movements, dysarthria (Diener and Dichgans, 1992), dysmetria and hypermetria (Hallet, Shahani, and Young, 1975 a, b).
Kinematic/electromyographic (EMG) studies of speech have investigated both normal muscle activation patterns (Harris, 1978; Tuller, Harris, and Kelso, 1981) and muscle patterns in ataxic dysarthric speakers (Ackermann, Hertrich, Daum, Scharf, and Spieker, 1997; Ackermann, Hertrich, and Scharf, 1995). Several investigations have also focused on the possible similarities or differences between limb movements and speech movements in ataxia (Gentil, Devanne, Maton, and Brice, 1992; Salisachs, 1979; Ackermann et al., 1995; Ostry, Keller, and Parush, 1983).
Early descriptions of dysarthria (Darley, Aronson and Brown, 1969a, b) were based solely on perceptual judgments of several speech and voice characteristics. The 10 speech and voice characteristics of ataxic dysarthria (Darley et al., 1969a), typified by poor coordination of the articulators, were grouped into three clusters: 1. articulatory inaccuracy (imprecise consonants, irregular articulatory breakdown, and distorted vowels); 2. prosodic excess (excess and equal stress, prolonged phonemes, prolonged intervals, and slow rate); and 3. phonatory-prosodic insufficiency (monopitch, monoloudness, and harsh voice). However, perceptual descriptions are misleading, and the distinctive patterns once claimed do not hold for comparisons of neurological syndromes (Ziegler) or for those patterns found in ataxic speakers.
Subsequent studies attempted to supplement the perceptual description of ataxic dysarthria with acoustic analyses (see, e.g., Kent, Netsell, and Abbs, 1979); however, they neither described the prosodic descriptors in acoustic terms nor correlated their analyses with neurological pathological conditions.
Acoustic descriptions of ataxic speech were used to measure duration and first and second formant onset/offset frequencies. In studies using ataxic subjects, Kent et al. used acoustic measurement to describe ataxic speech and concluded that formant frequencies were normal in ataxic subjects while duration measures were not. However, the speaker tasks were many and did not capture the disturbance of syllables in several contexts. Thus, whether this finding held for words within a phrase, or for syllables or words within sentences, was not known. Further, although the acoustic analysis of dysarthria caused by cerebellar damage found a disproportionate lengthening of the segment to be a fundamental property of ataxic dysarthria (Kent et al., 1979), it was not known whether this property was specific to ataxic dysarthria or was present in all or some other dysarthrias. Examining narrow band spectrograms of sentences led Kent et al. to suggest a syllable-level planning with a falling f0 on each successive syllable. The lengthening of segments and syllables led the investigators to posit a disordered prosody for cerebellar subjects.
Descriptions of prosody in normal speech production have included f0, formant frequencies and syllable duration. These acoustic descriptions have not addressed the dissociation between time (duration) and space (oral pharyngeal space), the latter inferred from the formant frequency values F1 and F2. This dissociation is important because it allows for description both of pathological utterances that are long and reduced in movements (slurred speech) and of the slowing down of normal speech that occurs at the end of utterances. Previous models of speech predicted that all lengthened syllables would have more extreme movements.
A critical issue in the study of speech motor control is the identification of the mechanisms that generate the temporal flow of serially ordered articulatory events. Early investigations were motivated by Lashley's (1951) model that predicted a monotonic relationship between vowel duration and formant frequency. Lindblom (1963), for instance, claimed that articulatory "undershoot" is the basis for any reduction in vowel duration from normal values.
Subsequent studies did not find that duration and formant frequency were monotonically linked. Harris (1978) found that, contrary to the predictions of Lindblom's model, when either rate or stress was manipulated, syllable duration and vowel formant frequency varied independently in a non-monotonic relation. In addition, EMG studies showed reduced orbicularis oris and genioglossus activity for syllables of reduced stress (Tuller et al., 1981; Harris, 1971, 1978). The conclusion drawn from these physiological and acoustic data of normal speakers was that any change in rate or stress may result in independent variations of syllable duration and formant values.
The components of prosody have been defined as the acoustic features of f0, segment duration, amplitude and segmental quality. Variations in the values of these features signal, among other things, constituent boundaries and syllable prominence. The kinematic data of Cohen et al. (1995) for six different conditions of syllable prominence show a difference in velocity for accented syllables in phrase-final vs. non-final position. This kinematic finding speaks to the non-monotonic relation between duration and formant frequencies: not all durations are the same. Sometimes speakers slow down, resulting in reduced vowel space with longer durations; thus there is a dissociation between duration and formant frequency. However, this dissociation had not been shown in acoustic measures.
It is an object of the invention to provide a reliable quantitative acoustic assessment to describe the speech and voice characteristics of subjects with neuro-motor speech disturbances. It is a further object of the invention to provide a method that can correlate prosodic descriptors in acoustic terms with neurological pathological conditions.
The present invention provides a method of identifying speech motor dysfunction in a test subject comprising measuring one or more acoustic parameters of one or more prosodic conditions; comparing each acoustic parameter between pairs of prosodic conditions to obtain a contrast value; and comparing the contrast values for each acoustic parameter to contrast values of a normal subject, wherein a difference in contrast values between the test subject and the normal subject is correlated to speech motor dysfunction.
The present invention also provides a method of identifying speech deterioration in a test subject comprising measuring one or more acoustic parameters of one or more prosodic conditions; comparing each acoustic parameter between pairs of prosodic conditions to obtain a contrast value; and comparing the contrast values for each acoustic parameter to contrast values of a normal subject, wherein a difference in contrast values between the test subject and the normal subject is correlated to speech deterioration.
The present invention further provides a method of diagnosing speech motor dysfunction in a test subject comprising measuring one or more acoustic parameters of one or more prosodic conditions; comparing each acoustic parameter between pairs of prosodic conditions to obtain a contrast value; and comparing the contrast values for each acoustic parameter to contrast values of a normal subject, wherein a difference in contrast values between the test subject and the normal subject is correlated to speech motor dysfunction.
In another embodiment, the present invention provides a method of rating the severity of speech motor dysfunction in a test subject comprising measuring one or more acoustic parameters of one or more prosodic conditions; comparing each acoustic parameter between pairs of prosodic conditions to obtain a contrast value; and comparing the contrast values for each acoustic parameter to contrast values of a normal subject, wherein a difference in contrast values between the test subject and the normal subject is correlated to a rating of the severity of the speech motor dysfunction.
The acoustic parameters comprise syllable duration, f0, F1 and F2. The prosodic conditions comprise (1) phrase-final accented (+pf+a), (2) non-phrase-final accented (−pf+a), (3) non-phrase-final unaccented (−pf−a), (4) nuclear accented (+n+a), (5) post-nuclear unaccented (−n−a) and (6) reduced vowel (red).
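As an illustration only, the step of comparing an acoustic parameter between pairs of prosodic conditions might be sketched as follows. The ASCII condition labels, the function name `contrast_values`, the choice of the four condition pairs (read from the Test terms of the comparison equation), and the duration figures are all assumptions made for this sketch and are not values from the specification.

```python
# Minimal sketch: contrast values for ONE acoustic parameter (e.g. syllable
# duration) across pairs of prosodic conditions.
# Assumption: condition labels are written with ASCII "+" and "-".
CONDITIONS = ("+pf+a", "-pf+a", "-pf-a", "+n+a", "-n-a", "red")

# Assumption: these four pairs are the ones contrasted in the comparison
# equation (phrase-final, accent, nuclear accent, and vowel reduction).
CONTRAST_PAIRS = (
    ("+pf+a", "-pf+a"),
    ("-pf+a", "-pf-a"),
    ("+n+a", "-n-a"),
    ("-n-a", "red"),
)


def contrast_values(measurements):
    """Return {(cond_a, cond_b): value_a - value_b} for each contrast pair.

    `measurements` maps a prosodic-condition label to the measured value of
    a single acoustic parameter for one speaker.
    """
    missing = [c for c in CONDITIONS if c not in measurements]
    if missing:
        raise KeyError(f"missing prosodic conditions: {missing}")
    return {(a, b): measurements[a] - measurements[b] for a, b in CONTRAST_PAIRS}


# Hypothetical syllable durations in milliseconds (made up for illustration).
example_durations = {
    "+pf+a": 300.0, "-pf+a": 250.0, "-pf-a": 180.0,
    "+n+a": 280.0, "-n-a": 170.0, "red": 90.0,
}
example_contrasts = contrast_values(example_durations)
```

The same function would be applied separately to each acoustic parameter (syllable duration, f0, F1 and F2), once for the test subject and once for the normal subject.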
According to the methods of the invention, the contrast values are compared using the equation
\[
\begin{aligned}
&\bigl(\text{Test}\,[(\text{+pf+a}) - (\text{-pf+a})] - \text{Control}\,[(\text{+pf+a}) - (\text{-pf+a})]\bigr)^{2} \\
{}+{}&\bigl(\text{Test}\,[(\text{-pf+a}) - (\text{-pf-a})] - \text{Control}\,[(\text{-pf+a}) - (\text{-pf-a})]\bigr)^{2} \\
{}+{}&\bigl(\text{Test}\,[(\text{+n+a}) - (\text{-n-a})] - \text{Control}\,[(\text{+n+a}) - (\text{-n-a})]\bigr)^{2} \\
{}+{}&\bigl(\text{Test}\,[(\text{-n-a}) - (\text{red})] - \text{Control}\,[(\text{-n-a}) - (\text{red})]\bigr)^{2}
\end{aligned}
\]
wherein
The present invention provides a means of identifying abnormal speech and voice patterns based on a comparison to a normal model of speech and voice patterns. The present invention further provides an accurate and sensitive acoustically-based method of identifying deterioration of motor function. The method may be useful for the screening and diagnosis of cerebellar-based pathological conditions or other conditions in which speech/motor function is deteriorated or impaired.
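For illustration, the comparison of contrast values between a test subject and a normal subject can be sketched as a sum of squared differences, which is one reading of the equation given above. The function name `contrast_score`, the dictionary layout, and the numbers are hypothetical, and the sketch assumes that no square root or weighting is applied to the sum.

```python
# Minimal sketch: sum of squared differences between the test subject's and
# the control (normal) subject's contrast values for one acoustic parameter.
# Assumption: both inputs map the same contrast pairs to contrast values.


def contrast_score(test_contrasts, control_contrasts):
    """Return sum over pairs of (test contrast - control contrast) ** 2."""
    if test_contrasts.keys() != control_contrasts.keys():
        raise ValueError("test and control must cover the same contrast pairs")
    return sum(
        (test_contrasts[pair] - control_contrasts[pair]) ** 2
        for pair in test_contrasts
    )


# Hypothetical duration contrasts in milliseconds (made up for illustration).
control = {
    ("+pf+a", "-pf+a"): 50.0,
    ("-pf+a", "-pf-a"): 70.0,
    ("+n+a", "-n-a"): 110.0,
    ("-n-a", "red"): 80.0,
}
test = {
    ("+pf+a", "-pf+a"): 20.0,
    ("-pf+a", "-pf-a"): 30.0,
    ("+n+a", "-n-a"): 60.0,
    ("-n-a", "red"): 40.0,
}
score = contrast_score(test, control)  # larger values = further from normal
```

Under this reading, a score of zero means the test subject's contrasts match those of the normal subject exactly. How a nonzero score would map onto a diagnosis or a severity rating is not specified here and is left open in the sketch.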