This invention relates to the modification of audio sounds, in particular, speech sounds, to enable individuals who are impaired with speech and language-based learning disabilities (L/LDs) due to a temporal processing problem to improve their speech reception, speech production, language comprehension and reading abilities. In addition, it includes training methods to help individuals with speech and language-based learning disabilities to overcome their temporal processing deficits so that they can recognize basic speech elements and normal connected speech with higher accuracy and greater intelligibility. It further includes a training method to help normal individuals improve their speech reception capabilities, either for their native language or for foreign language training.
Recent studies have shown that specific language impaired (SLI) and specifically reading impaired (dyslexic) individuals are unable to recognize and distinguish between certain consonants and consonant-vowel combinations in natural speech. They also have difficulties in understanding written speech that appear to result from their problems in understanding aural speech. This difficulty with aural speech perception results in a delayed and usually defective development of reading skills. Studies have shown that these problems in speech reception and reading acquisition are not the result of peripheral hearing or visual deficits, but rather are due to an inability of the receptive and cognitive powers of the brain to correctly identify the rapidly changing components of speech. For example, L/LD individuals have difficulty correctly identifying the rather short consonant sounds (a few tens of milliseconds long) or reliably separating them from associated longer vowel sounds. Consequently, these individuals are unable to generate a reliable representation of the fundamental phonetic elements of the native language in their brains. The result is that the impaired individual not only has difficulty correctly identifying the unique sounds of spoken words and strings of connected speech, but also often has associated difficulties in learning to accurately articulate speech. In addition, the impaired individual may have limited cognitive abilities that rely on accurately recognizing words and long speech strings, and limited abilities in cognitively associating written speech with their brain's poor representations of aural speech.
In particular, consonant sounds generally have a frequency modulated component such that the sound frequency may rise or fall, or be interrupted by pauses that last from less than 25 milliseconds to more than 80 milliseconds. These rises, falls, or brief interruptions in the consonant sound are followed or preceded by a vowel sound, which has a relatively constant or more slowly changing spectral content, and which usually extends over a period of from many tens of milliseconds up to several hundred milliseconds. The majority of individuals with L/LDs (dysphasia or dyslexia) cannot distinguish between the consonant-vowel combinations (for example, /ba/ and /da/, or /ab/ and /ad/) when the frequency modulated components of the consonants /b/ and /d/ are of normal duration (for example, less than 60 to 80 milliseconds long).
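The structure of a brief frequency glide joined to a longer steady portion can be illustrated with a minimal sketch. It is only a single-tone stand-in for a consonant-vowel syllable, not a speech synthesizer, and the function name `glide_then_steady`, the frequencies, and the sample rate are all assumptions chosen for illustration rather than values taken from this description.

```python
import math

def glide_then_steady(f_start, f_steady, glide_ms, steady_ms, sample_rate=16000):
    """Tone whose frequency moves linearly from f_start to f_steady over
    glide_ms (the consonant-like part), then holds f_steady for steady_ms
    (the vowel-like part). All parameter values are illustrative."""
    n_glide = int(sample_rate * glide_ms / 1000)
    n_steady = int(sample_rate * steady_ms / 1000)
    samples = []
    phase = 0.0
    for i in range(n_glide + n_steady):
        if i < n_glide:
            freq = f_start + (f_steady - f_start) * i / n_glide
        else:
            freq = f_steady
        samples.append(math.sin(phase))
        # Accumulate phase so the waveform stays continuous as frequency changes.
        phase += 2 * math.pi * freq / sample_rate
    return samples

# Two stimuli that differ only in the direction of a 40 ms glide.
rising = glide_then_steady(900, 1200, glide_ms=40, steady_ms=250)
falling = glide_then_steady(1500, 1200, glide_ms=40, steady_ms=250)
```

Lengthening `glide_ms` while leaving the start and end frequencies unchanged gives a slower version of the same transition, which is the kind of contrast the paragraph above describes.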
The basic temporal processing deficit in L/LDs is also reliably demonstrated by testing a dysphasic and/or dyslexic individual's ability to identify sounds that are presented in rapid succession, as commonly occurs for successive phonetic elements in normal speech reception. For example, an L/LD child or adult commonly cannot correctly identify the order of presentation of two different, successive vowel-like stimuli that are each 50 milliseconds in duration unless they are separated in time by more than 100 milliseconds, and often by more than several hundred milliseconds. By contrast, a normal individual can identify the sequence order of presentation of such stimuli when they are immediately successive, that is, with no intervening interstimulus time gap.
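A temporal order stimulus of the kind described above can be sketched as two 50 millisecond tones separated by an adjustable silent gap. The helper names `tone` and `ordered_pair`, the tone frequencies, and the sample rate are assumptions made for this illustration; the description specifies only the stimulus duration and the interstimulus intervals.

```python
import math

def tone(freq_hz, dur_ms, sample_rate=16000):
    """Steady sine tone of the given duration."""
    n = int(sample_rate * dur_ms / 1000)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def ordered_pair(first_hz, second_hz, isi_ms, dur_ms=50, sample_rate=16000):
    """Two successive tones separated by isi_ms of silence (the
    interstimulus interval). isi_ms=0 makes them immediately successive."""
    gap = [0.0] * int(sample_rate * isi_ms / 1000)
    return (tone(first_hz, dur_ms, sample_rate)
            + gap
            + tone(second_hz, dur_ms, sample_rate))

# Immediately successive pair, and the same pair with a 150 ms gap.
fast_pair = ordered_pair(800, 1200, isi_ms=0)
slow_pair = ordered_pair(800, 1200, isi_ms=150)
```

The listener's task would be to report which tone came first; varying `isi_ms` changes only the gap, not the tones themselves.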
The result of this fundamental problem in the reception of short-duration and fast successive components of speech is readily apparent in the school system, where individuals diagnosed with dysphasia and dyslexia who have this temporal processing difficulty run two to four, and sometimes more, years behind their peers in scholastic achievement. Consequently, L/LD individuals commonly require additional specialized training, with great emphasis on speech recognition and speech production. Dyslexics similarly receive special training to help them learn to read. Special speech reception, speech production and reading instruction generally continues throughout the elementary and secondary school education of many of these individuals if the resources are available. The impairment can often lead to a truncation in education, and commonly results in impairment for life. However, special training has shown some success.
Initially, failure of identification of consonant-vowel combinations such as /ba/ and /da/ with short duration consonant frequency modulations of less than 60 milliseconds, or failure to identify the temporal order of simple acoustic stimuli unless they are separated by 150 or more milliseconds, has established a means of identifying L/LDs with this temporal processing deficit. However, no prior training strategy has shown consistently positive results in overcoming the temporal processing deficits that underlie L/LDs. Overcoming this temporal processing deficit should result in a more useful and normal life for individuals with this affliction.
Recent studies have shown that these speech and language-based learning disabilities are seated in defective temporal processing of sensory information by the brain. Moreover, they have shown that temporal processing abilities are subject to strong learning effects in normal individuals. The basic processes underlying this temporal process learning are increasingly better understood.
In addition to L/LDs, brain damaged individuals have shown similar symptoms. In particular, individuals who have suffered strokes or otherwise damaged portions of their language-dominant cerebral hemispheres commonly lose the ability to discriminate between normal consonant sounds and show temporal processing deficits that are very similar to those in L/LD individuals. As with L/LD individuals, these aphasic individuals can also correctly identify speech elements when the elements are presented in a slowed-down form.
Aged individuals also show a progressive deterioration in their temporal processing abilities, as judged by these same tests. This deterioration contributes to a cognitive-based deficit that affects their speech reception and general cognitive abilities.
Receiving or learning a foreign language in an indigenous environment is difficult, and sometimes almost insurmountable, for normal individuals because of the speed at which the language is spoken. Foreign languages are consequently learned by rote memorization and repeated practice exercises, with the speed of talking increased commensurate with the learner's ability to understand the spoken language. Individuals learning a foreign language in the indigenous environment (that is, in the native country of the language) have no set means of doing so except by asking the foreign language speaker to "slow down" or to "repeat". Most of the problems in learning foreign languages in this indigenous environment can be attributed to the brain's failure, in its temporal processing of the incoming speech sounds, to recognize fast events.
While the phonemes of foreign languages differ in construction from those of the English language, the principles behind all spoken languages remain constant. That is, all languages can be broken down into fundamental sound structures known as phonemes. It is these phonemes, such as the consonant-vowel syllables /ba/ and /da/ in the English language, that form the basic building blocks whose recognition must be learned. As with the L/LD individual, the foreign language student does not recognize these phonemes reliably when they are presented at their normal element durations and normal element sequence rates by native language speakers. As with L/LDs, the phonemes can be accurately distinguished from one another and correctly identified when the speech is artificially slowed down.
It is an object of this invention to provide a means for easier recognition of phonemes and connected speech in L/LD individuals.
It is a further object of this invention to provide a training strategy for rapidly and progressively improving the recognition of phonemes and connected speech in L/LD individuals.
It is another object of this invention to employ training signals that are more powerful than normal speech for generating changes in temporal processing by the brain achieved through learning exercises.
It is a further object of this invention to use a modified version of this training strategy as a method for screening human populations to identify those individuals who would benefit from this invention.
It is also an object of this invention to provide phoneme and connected speech recognition and a training strategy for rapidly and progressively improving the recognition of phonemes and connected speech in individuals that have suffered brain damage to their dominant speech-language hemisphere that has resulted in a temporal processing deficit like that recorded in L/LDs.
It is a further object of this invention to provide phoneme and connected speech recognition and a training strategy for rapidly and progressively improving the recognition of phonemes and connected speech in individuals who have undergone age-related or disease-related deterioration of their temporal processing abilities for speech sound reception.
It is still a further object of this invention to provide easier recognition of phonemes and connected speech in the learning of a foreign language.
It is still a further object of this invention to provide improved temporal processing of fast speech sounds in normally fluent individuals, to improve their learning capabilities and their potential cognitive achievements.
In one aspect of the invention, a method of increasing the ability of a human being to process aurally received signals is disclosed. The method consists of recording audio sounds in a computerized system. The method includes a step of modifying the amplitudes and timings of recordings of certain phonetic elements without changing their fundamental frequencies. Finally, the method includes converting the modified digital signals to analog aural signals for presentation to the individual.
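Changing the timing of a recording without changing its frequencies rules out simply replaying it at a lower sample rate, which would lower the pitch. One generic way to lengthen a signal while keeping its local waveform intact is windowed overlap-add, sketched below. This is a swapped-in textbook technique chosen for illustration, not the modification procedure of the invention, and plain overlap-add is known to produce audible artifacts on strongly tonal material. The names `hann` and `ola_stretch` and the frame length are assumptions.

```python
import math

def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def ola_stretch(samples, stretch, frame_len=512):
    """Lengthen `samples` by roughly `stretch` (e.g. 1.5) using plain
    overlap-add: frames are read at a small hop and written at a larger
    hop, so each frame keeps its original local waveform."""
    synth_hop = frame_len // 4                 # spacing of frames in the output
    analysis_hop = synth_hop / stretch         # spacing of frames in the input
    window = hann(frame_len)
    n_frames = max(1, int((len(samples) - frame_len) / analysis_hop) + 1)
    out_len = (n_frames - 1) * synth_hop + frame_len
    out = [0.0] * out_len
    norm = [0.0] * out_len                     # summed window weights per sample
    for k in range(n_frames):
        src = int(round(k * analysis_hop))
        dst = k * synth_hop
        for i in range(frame_len):
            if src + i >= len(samples):
                break
            out[dst + i] += samples[src + i] * window[i]
            norm[dst + i] += window[i]
    # Divide by the summed weights so overlapping frames do not change loudness.
    return [o / w if w > 1e-8 else 0.0 for o, w in zip(out, norm)]

# Usage: a one-second 440 Hz test tone at 16 kHz, lengthened by about 50%.
test_tone = [math.sin(2 * math.pi * 440 * i / 16000) for i in range(16000)]
stretched = ola_stretch(test_tone, 1.5)
```

A stretch factor of 1.0 reads and writes frames at the same spacing, so the output reproduces the input; factors above 1.0 lengthen it.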
In a second aspect of the invention, a method for increasing the ability of a human being to distinguish and separate fast sequential aurally received signals is disclosed. The method consists of controlling the sound output of fast phonetic or non-speech sounds in computer-mounted games, in which the human being works to correctly recognize progressively faster sound presentations, or in which the human subject works to distinguish the time order of presentation or the separate identities of sounds presented at progressively shorter durations, at progressively faster rates, and in progressively longer sound element sequences.
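The idea of progressively faster presentations can be sketched as a simple adaptive rule: shorten the interstimulus interval after a correct response and lengthen it after an error. The rule, the step factors, the limits, and the names `update_isi` and `run_session` are all hypothetical; this summary does not specify how the games adjust difficulty.

```python
def update_isi(isi_ms, correct, shrink=0.8, grow=1.25, min_ms=5.0, max_ms=500.0):
    """Return the interstimulus interval for the next trial.
    A correct response makes the next trial faster; an error makes it
    slower. The result is kept within [min_ms, max_ms]."""
    factor = shrink if correct else grow
    return min(max_ms, max(min_ms, isi_ms * factor))

def run_session(responses, start_ms=300.0):
    """Apply update_isi over a list of True/False responses and return
    the interval used before each trial plus the final interval."""
    isi = start_ms
    history = [isi]
    for correct in responses:
        isi = update_isi(isi, correct)
        history.append(isi)
    return history

# Usage: three correct answers, one error, then one more correct answer.
history = run_session([True, True, True, False, True])
```

The same rule could drive stimulus duration or sequence length instead of the gap; only the quantity being adjusted would change.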
The invention also encompasses a method for increasing the ability of a human being to recognize long, connected speech strings, and to rapidly improve their performances at related cognitive tasks. Speech sounds of training exercises designed for L/LD children are modified in a computerized system, as above. All speech applied in training exercises and in library materials is delivered in this modified form.
The invention also encompasses a computerized system for structuring recorded audio information to enable speech and language impaired or normal individuals to better understand spectro-temporally complex audio sounds. The system consists of a computer having input means for receiving information; processor means for manipulating the received information; storage means for storing unprocessed, received information and manipulated information; and output means responsive to the processor means for presenting the manipulated information to a user in a form that is understandable and that provides an effective learning signal. The computerized system also includes first program means for modifying digitally recorded audio sounds having a frequency range associated therewith to lengthen and to selectively amplify fast (primarily consonant) acoustic elements in speech without modifying the frequency range. Second program means are provided responsive to the received information and the processor means for storing the modified, digitally recorded speech sounds. Third program means responsive to the received information and to the processor are provided to direct the stored, modified, digitally recorded speech sounds to the output means.
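Selective amplification of fast acoustic elements can be illustrated with a rough sketch that applies gain only where the short-time energy changes abruptly from one frame to the next. Using a frame-to-frame energy ratio as the detector is an assumption made for this example, as are the names `frame_rms` and `emphasize_fast`, the frame length, the threshold, and the gain; this summary does not state how fast elements are located.

```python
import math

def frame_rms(samples, frame_len):
    """Root-mean-square level of each complete, non-overlapping frame."""
    return [math.sqrt(sum(x * x for x in samples[i:i + frame_len]) / frame_len)
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def emphasize_fast(samples, frame_len=160, change_ratio=2.0, gain=2.0):
    """Multiply by `gain` every frame whose RMS level differs from the
    previous frame's by more than `change_ratio` (a rise or a fall).
    Steady frames are left untouched, and no sample is moved in time,
    so the frequency content of each frame is not shifted."""
    rms = frame_rms(samples, frame_len)
    out = list(samples)
    for k in range(1, len(rms)):
        lo, hi = sorted((rms[k - 1], rms[k]))
        if hi > change_ratio * max(lo, 1e-6):
            start = k * frame_len
            for i in range(start, start + frame_len):
                out[i] *= gain
    return out

# Usage: a quiet stretch followed by a sudden louder, steady stretch.
signal = [0.01] * 480 + [0.5] * 480
boosted = emphasize_fast(signal)
```

In this toy signal only the frame where the level jumps is amplified; the quiet frames before it and the steady frames after it pass through unchanged. A practical system would also need to smooth the gain at frame boundaries to avoid clicks.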