Technological Field
The disclosed embodiments generally relate to systems and methods for processing audio. More particularly, the disclosed embodiments relate to systems and methods for processing audio to identify speech prosody.
Background Information
Audio sensors, as well as other sensors, are now part of numerous devices, from intelligent personal assistant devices to mobile phones, and the availability of audio data and other information produced by these devices is increasing.
Cluttering, also known as tachyphemia or tachyphrasia, is a communication disorder characterized by a rapid rate of speech, erratic speaking rhythm, loss of fluency, frequent pauses, and so forth. Aprosodia is a neurological condition characterized by difficulty or inability to properly convey or interpret emotional prosody. Dysprosody is a neurological disorder characterized by impairment in one or more of the prosodic functions. Apraxia of speech is a communication disorder characterized by difficulty in speech production, specifically with sequencing and forming sounds, impaired speech prosody, and in particular impaired speech rhythm. Prosody may refer to variation in rhythm, pitch, stress, intonation, accent, vocal quality, intensity, tempo, flatness, melody, pauses, timing, and so forth. Impaired prosodic functions are also a possible symptom of several other neurological and psychiatric conditions, including autism, Asperger syndrome, schizophrenia, clinical depression, aphasia, neurodegenerative conditions, and so forth.