A psychoacoustic principle of hearing and speech production is that each individual has a comfortable rate at which they speak. This rate is mediated by the speaker's own auditory system: a person who is talking hears their own speech both internally and through the sound entering their ears. It is known in speech communication research that a talker establishes a speaking rate, based on hearing their own speech, that conforms to this internal comfortable speaking rate. By adjusting the rate of the speech fed back to the speaker relative to what the speaker is saying, it is possible to psychologically coerce the speaker into changing their speaking rate. In effect, if the speech of a speaker is slowed down and played back to the speaker through headphones or a loudspeaker while the speaker is talking, the speaker will slow down in an attempt to match the speaking rate they are hearing. This is the result of a self-correcting mechanism in the motor language model of speech production, which balances the rate at which speech is spoken against the rate at which that speech is heard internally. The motor language model describes speech production as the coordination of muscular actions in the respiratory, laryngeal, and vocal tract systems. It is a feedback mechanism that attempts to minimize the difference between the rate of the speech being heard and the rate of the speech being spoken. Motor control, in this context, is the planning and coordination, guided by sensory feedback, of the muscle movements that produce the articulatory gestures of speech.
The Lombard effect describes how people change their speech in noisy surroundings, the most obvious change being simply to speak louder. The Lombard effect is one example of auditory self-feedback, which psychologically encourages a talker to speak louder than the surrounding sounds they are hearing. The talker also places emphasis on certain parts of words to improve the discernibility, and hence the intelligibility, of the speech. Consider speaking to someone at a concert: words are “pronounced” differently. Many algorithms have attempted to capture this behavior to improve the intelligibility of reproduced speech in voice communication systems, but none has yet succeeded. Hearing background noise while speaking thus acts as a feedback mechanism that typically compels a speaker to change their articulation.
Similarly, there are speech and hearing devices in which speech is captured through a microphone and played back to the talker while they are talking. These are seen on sports newscasts, where a hearing device lets the talker hear what they are saying. Additionally, this principle has been used intentionally, with a delay in the hearing device playback, for people who stutter. Studies have shown that speech played back to a stuttering talker while they are talking can reduce the number of stutters. The psychological feedback mechanism with the delay allows them to hear themselves just prior to formulation of the articulatory gestures. This additional delay smooths their speaking.
Another area of audio playback is where a user plays back and listens to audio messages. These messages may be recorded on a digital tape recorder, a personal digital assistant, or a voice messaging service. One common complaint about voice message services (e.g., voice recorders, telephone recorders, voice notes) is that the person who left the message talks too fast, too slowly, or a combination of both. In many instances, a person leaves a long voice message with a quickly spoken telephone number at the end. The beginning of the message is spoken slowly, but the number is spoken fast. This usually means the message has to be replayed, and each time the entire long message has to be heard and close attention is needed to catch the number. In another example, a fast-talking teenage daughter leaves a parent an important message about a sale at the mall. In both cases, the problem is that the listener cannot change the playback rate of the message. Accordingly, a need exists to be able to change the playback rate of a voice message.
Recently, many electronic devices, such as digital tape recorders, telephones, and personal digital assistants, permit the user to record memos. The recording options often include fast-forward and rewind features, which allow a user to index forward or backward while playing recorded messages. These features allow the user to skip ahead or jump back to a particular section of the voice note. However, they only reposition the point from which playback of the speech resumes. They do not allow the user to hear the speech while indexing, or to change the playback rate.
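The distinction between repositioning playback and changing the playback rate can be illustrated with a minimal Python sketch. The function names `seek` and `naive_rate_change` are hypothetical and are not taken from any device described here. The rate change shown is plain linear-interpolation resampling, chosen only for brevity; it also shifts the pitch of the speech and is not presented as the method of any particular product.

```python
def seek(samples, start_index):
    """Reposition playback: return the samples from start_index onward, unchanged.

    Hypothetical helper: this is all that fast-forward or rewind accomplishes.
    """
    start = max(0, min(start_index, len(samples)))
    return samples[start:]


def naive_rate_change(samples, rate):
    """Resample by linear interpolation so playback spans about len(samples) / rate samples.

    rate > 1 speeds playback up; rate < 1 slows it down.
    Assumption for illustration only: this simple approach also shifts pitch.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if not samples:
        return []
    out_len = max(1, int(len(samples) / rate))
    out = []
    for n in range(out_len):
        pos = n * rate                      # fractional read position in the input
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out
```

With an eight-sample input, `seek(samples, 4)` returns the last four samples untouched, whereas `naive_rate_change(samples, 2.0)` returns four samples covering the whole message and `naive_rate_change(samples, 0.5)` returns sixteen.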
Further, many existing electronic devices, including voice recorders, telephone handsets, and personal digital assistants, have limited memory available for the audio output buffer. The audio output buffer is typically the buffer from which the digital-to-analog (D/A) converter retrieves speech samples for playback. This buffer is kept small, and the digital signal processor (DSP), or the process controlling the D/A converter, typically runs at a rate just sufficient to play back the digital speech samples. Adding faster DSPs or more memory is not an option, because designers strive to conserve battery power and to avoid additional component costs. Moreover, solutions that are backward compatible with existing hardware platforms are typically more desirable.
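The constraint imposed by a small output buffer can be sketched as a fixed-capacity first-in, first-out queue. The class name `OutputBuffer`, its methods, and the capacity used below are assumptions made for illustration; they do not describe the firmware of any actual device.

```python
from collections import deque


class OutputBuffer:
    """Fixed-capacity FIFO of samples waiting to be played out (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._fifo = deque()

    def free_space(self):
        """Number of samples that can still be written before the buffer is full."""
        return self.capacity - len(self._fifo)

    def write(self, samples):
        """Accept as many samples as fit; return how many were accepted."""
        accepted = samples[: self.free_space()]
        self._fifo.extend(accepted)
        return len(accepted)

    def read(self, count):
        """Hand `count` samples to the converter.

        Returns (samples, underrun), where missing samples are padded with
        silence (zeros) and `underrun` is the number of padded samples.
        """
        available = min(count, len(self._fifo))
        out = [self._fifo.popleft() for _ in range(available)]
        underrun = count - available
        return out + [0] * underrun, underrun
```

For example, a buffer created with `OutputBuffer(4)` accepts only four of six samples offered to `write`, and a later `read` that asks for more samples than remain is padded with silence. Any processing added to the playback path therefore has to work within the existing capacity and keep pace with the converter.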
Therefore, a need exists to overcome the problems with the prior art as discussed above.