The present invention relates to a method and apparatus for producing an image of speech information, particularly a symbol indicating one of a plurality of groups of detected sounds, and projecting that image in a mode indicating a sequence of syllables into the field of view of a hearing impaired wearer of the device.
Innumerable situations exist in which it is desirable to supply information to an individual by superimposing an image onto his normal field of vision. One example in which such a display is needed is the projection of symbols, each indicating one of a plurality of detected groups of sounds, onto the field of vision of a deaf or hearing impaired person.
Communication in any spoken language is made up of sequences of sounds which are called phonemes. By observation of the movements of the lips of a speaking person, a hearing impaired or deaf person can discern that each sound is one of a limited number of possible phonemes. Unfortunately, however, the ambiguities for a totally deaf person are too great for effective communication to take place using only lipreading.
If a person has some aid in resolving ambiguities, for example, understanding of a further 10-20% of phonemes beyond those understood by lipreading alone, then enough of the information in the speech can be understood by a trained lipreader for effective transfer of information. Often a lipreader will have limited hearing sufficient for this purpose. Alternatively, manual cuing, a technique developed by Orin Cornett of Gallaudet College, one of the co-inventors of the present application, utilizes hand cues to remove sufficient ambiguities to make lipreading practical. The difficulty with manually cued speech, of course, is that it can be used only with those individuals who have been trained to use it, thus severely limiting the number of people whom a deaf person can understand.
The different sounds of any language have different waveform characteristics which permit limited differentiation into different groups of sounds. These basic analyzing techniques are old and are described, for example, in pages 139-158, J. L. Flanagan, Speech Analysis, Synthesis and Perception, Academic Press, 1965. Using these analytic techniques, signals can be produced from detected spoken sounds, each signal indicating one of a plurality of different sound groups. The sounds in each group are differentiable on the lips so that, if this information can be effectively communicated to the lipreader, sufficient ambiguities can be removed to permit effective lipreading.
One way to communicate sufficient information to a lipreader to make lipreading truly effective is to superimpose a symbol identifying a sound group upon the viewer's field of vision, which he can see as he watches a speaker's lips. This basic technique is described in two patents to Upton, U.S. Pat. Nos. 3,463,885 and 3,936,605. In both of these patents a display is disclosed which is mounted upon a pair of spectacles intended to be worn by the hearing impaired or deaf person. In the system described in Upton U.S. Pat. No. 3,463,885, three types of sounds are detected--fricative, plosive and voiced. A number of bulbs are mounted on a lens of the spectacles, each associated with one of these types of sounds. The associated bulb is activated when that type of sound is detected. In one embodiment, sounds which are a combination of these different types of sounds activate more than one bulb. In another embodiment, separate bulbs are utilized to denote combinations.
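The first Upton embodiment described above, in which a combined sound activates more than one bulb, can be illustrated by a minimal sketch. The bulb numbering and the function name `active_bulbs` are assumptions made only for illustration; the Upton patents are not the source of these identifiers.

```python
# Hypothetical bulb indices for the three sound types named in the text.
BULB_FOR_TYPE = {"fricative": 0, "plosive": 1, "voiced": 2}


def active_bulbs(detected_types):
    """Return the sorted bulb indices to light for the detected sound types.

    A sound that combines several types lights one bulb per type, as in
    the first embodiment described above.
    """
    return sorted({BULB_FOR_TYPE[t] for t in detected_types})


if __name__ == "__main__":
    # A voiced fricative lights both the fricative and the voiced bulb.
    print(active_bulbs(["voiced", "fricative"]))
```

In the second embodiment, a combination would instead map to its own dedicated bulb; that variant is not sketched here because the text does not say how many combination bulbs are used.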
One of the difficulties with the system of Upton is that each of its indications is that of a single phoneme, and, at normal rates of speaking, the sounds occur so quickly that it is doubtful that they can be effectively used at that rate by the brain. According to the present invention, this problem is reduced by displaying information as syllables, i.e., normally a combination of a consonant sound and a vowel sound, although occasionally a single phoneme can be a syllable. One way that syllable information can be displayed is with a symbol indicating one of a plurality of consonant groups in a mode indicating an associated vowel group. For example, a symbol indicating one of nine consonant groups can be projected to one of four spatial locations, i.e., quadrants, the spatial location indicating the associated vowel group. Another approach is to project the symbol in one of a number of colors, for example, four, each color indicating an associated vowel group.
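The quadrant example above can be illustrated by a minimal sketch, in which a syllable is reduced to a consonant group index and a vowel group index and then encoded as a symbol placed in a quadrant. The names `Quadrant`, `SyllableCue` and `encode_syllable`, the zero-based numbering, and the assignment of vowel groups to particular quadrants are all assumptions made for illustration; the text specifies only the counts of nine consonant groups and four locations.

```python
from dataclasses import dataclass
from enum import Enum

NUM_CONSONANT_GROUPS = 9  # from the example in the text
NUM_VOWEL_GROUPS = 4      # one per quadrant, from the example in the text


class Quadrant(Enum):
    # Hypothetical ordering: the text does not say which vowel group
    # corresponds to which quadrant.
    UPPER_LEFT = 0
    UPPER_RIGHT = 1
    LOWER_LEFT = 2
    LOWER_RIGHT = 3


@dataclass(frozen=True)
class SyllableCue:
    symbol: int         # which of the nine consonant-group symbols to show
    quadrant: Quadrant  # where to show it, indicating the vowel group


def encode_syllable(consonant_group: int, vowel_group: int) -> SyllableCue:
    """Encode one syllable as a consonant symbol placed in a vowel quadrant."""
    if not 0 <= consonant_group < NUM_CONSONANT_GROUPS:
        raise ValueError("consonant_group must be in 0..8")
    if not 0 <= vowel_group < NUM_VOWEL_GROUPS:
        raise ValueError("vowel_group must be in 0..3")
    return SyllableCue(symbol=consonant_group, quadrant=Quadrant(vowel_group))


def distinct_cues() -> int:
    """Number of distinct symbol and quadrant pairs available per syllable."""
    return NUM_CONSONANT_GROUPS * NUM_VOWEL_GROUPS


if __name__ == "__main__":
    print(encode_syllable(2, 3))
    print(distinct_cues())
```

With nine symbols and four locations, a single displayed cue distinguishes 36 consonant and vowel group pairs. The color approach could be sketched the same way by replacing the quadrant field with one of four colors.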
As such, the present invention has the object of automating manually cued speech.
Other objects and purposes of the invention will be clear from the following detailed description of the drawings.