1. Technical Field
The present invention relates generally to analysis of human speech and, in particular, to an improved method and apparatus for providing visual feedback relative to speech production.
2. Description of Related Art
Most people take human speech for granted. However, various speech impediments or physical deficiencies may impair an individual's abilities to produce what may be considered "normal" human speech. Speech pathologists are professionals who work with individuals who cannot speak in a normal manner. Typically, a speech pathologist will work with such an individual over a period of time to teach the individual how to more accurately produce desired sounds.
A speech pathologist encourages such an individual to concentrate on the articulators that produce acceptable speech. These articulators include the lips, teeth, the tongue, etc. Conventionally, a videotape player and a mirror are used to allow an individual to compare the individual's externally visible articulators with those of a model. However, a videotape player does not allow for easy replay of short speech production models. Furthermore, people may suffer from left-right confusions due to, for example, neurological damage, learning disabilities, and possible visual processing problems. Therefore, the comparison of a mirror image with a videotape reproduction may create confusion for such an individual.
Computers and computer software provide tools that assist speech professionals in their work. These software tools analyze an incoming speech sample by comparing it with a stored speech sample to determine whether a particular sound, such as a phoneme, has been produced correctly. Once a model is created, an incoming sound may be compared to the model. If the incoming sound does not fit within the range of the model, the user is notified of the discrepancy.
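The range comparison described above may be illustrated by the following minimal sketch. It is not taken from any particular prior art tool. It assumes, purely for illustration, that each speech sample has already been reduced to a few named acoustic measurements (the names `f1` and `f2` are hypothetical) and that the model is simply the range spanned by the accepted samples, widened by a tolerance.

```python
from typing import Dict, List, Tuple

# Assumption: a sample is a mapping of hypothetical feature names to values.
Features = Dict[str, float]
# Assumption: a model is an accepted (low, high) range per feature.
Model = Dict[str, Tuple[float, float]]


def build_model(samples: List[Features], tolerance: float = 0.0) -> Model:
    """Build per-feature ranges from samples judged acceptable."""
    if not samples:
        raise ValueError("at least one acceptable sample is required")
    model: Model = {}
    for name in samples[0]:
        values = [sample[name] for sample in samples]
        model[name] = (min(values) - tolerance, max(values) + tolerance)
    return model


def discrepancies(model: Model, incoming: Features) -> List[str]:
    """Return the names of features that fall outside the model's range."""
    return [
        name
        for name, (low, high) in model.items()
        if not low <= incoming[name] <= high
    ]


# Hypothetical measurements for two acceptable productions of one sound.
accepted = [{"f1": 700.0, "f2": 1200.0}, {"f1": 750.0, "f2": 1100.0}]
model = build_model(accepted, tolerance=10.0)

# An empty result means the attempt fits the model; otherwise the
# listed features would be reported to the user as discrepancies.
out_of_range = discrepancies(model, {"f1": 720.0, "f2": 1300.0})
```

In this sketch, a non-empty result corresponds to the case in which the user is notified of a discrepancy.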
However, the prior art speech and language analysis software tools provide feedback based only on acoustic information. Therefore, it would be advantageous to provide visual feedback of speech production and to associate a speech model with the articulators responsible for speech production.
The present invention collects video and audio samples of acceptable speech production. A camera focuses on a speaker's face and, particularly, articulation visible in the vicinity of the mouth, or other body movements associated with speech production. Video files are used to archive acceptable and unacceptable productions, as well as acceptable facial expressions that enhance communication. These files may then be used to provide feedback about acceptable and unacceptable ways to produce speech. The camera is also used to provide real-time feedback as a person is speaking for comparison with a stored model. A speaker may use video models in conjunction with acoustic models for comparison with a current attempt. Image processing may be used to create a mirror image of a video model or a current attempt or both to avoid left-right confusion.
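The mirror-image operation mentioned above may be sketched as follows. This is an illustrative example only, assuming that each video frame is available as rows of pixel values; the frame representation and the function names `mirror_frame` and `mirror_clip` are hypothetical and do not describe a specific implementation.

```python
from typing import List

# Assumption: a frame is a list of rows, each row a list of pixel values.
Frame = List[List[int]]


def mirror_frame(frame: Frame) -> Frame:
    """Flip a frame left to right by reversing every row of pixels."""
    return [row[::-1] for row in frame]


def mirror_clip(frames: List[Frame]) -> List[Frame]:
    """Mirror every frame of a video model or of a current attempt."""
    return [mirror_frame(frame) for frame in frames]


# A tiny 2 x 3 frame used only to show the effect of the flip.
frame = [[1, 2, 3],
         [4, 5, 6]]
mirrored = mirror_frame(frame)
```

Mirroring the same frame twice restores the original, so either the video model, the current attempt, or both may be flipped so that the two images presented to the speaker share the same left-right orientation.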