Speech recognition systems and handwriting recognition systems that translate human input into electronic data are currently being developed.
One type of speech recognition system is the command and control type. The command and control type uses a grammar to format the input for more accurate recognition. A grammar constrains the input to the system so as to reduce the number of possible inputs. For example, the input may be constrained to be an “action” followed by a “target” of the action. If the user inputs the spoken words “open a file”, then “open” would be recognized from a group of action words and “file” would be recognized from a group of target words. The constraints placed upon the system by a grammar are language specific and can yield high recognition accuracy.
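The action-followed-by-target constraint described above can be illustrated with a minimal sketch. It operates on already-transcribed text rather than audio, and the word groups, the filler-word list, and the function name parse_command are hypothetical; the description above names only “open” and “file”.

```python
# Hypothetical word groups; only "open" and "file" come from the description.
ACTIONS = {"open", "close", "save"}
TARGETS = {"file", "window", "folder"}
# Assumption: articles are ignored so that "open a file" reduces to two words.
FILLERS = {"a", "an", "the"}


def parse_command(utterance):
    """Return (action, target) if the utterance fits the grammar, else None."""
    words = [w for w in utterance.lower().split() if w not in FILLERS]
    # The grammar admits exactly one action word followed by one target word.
    if len(words) != 2:
        return None
    action, target = words
    if action in ACTIONS and target in TARGETS:
        return action, target
    return None
```

Under these assumptions, parse_command("open a file") yields the pair ("open", "file"), while an input that does not follow the action-then-target order, such as "file open", is rejected. Restricting each position to a small word group is what reduces the number of possible inputs.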
A second type of speech recognition system is the dictation type. The dictation type of speech recognition is not constrained by a specific grammar, but by a less stringent language model. Whatever input is spoken, the system attempts to recognize it word by word using statistical information. This type is more flexible, but yields lower recognition accuracy. The accuracy level of the dictation type of speech recognition system is currently too low to provide a generally practical system.
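As a rough illustration of word-by-word recognition using statistical information, the sketch below picks, at each position, the candidate word most often seen after the previously chosen word. The bigram counting, the greedy selection, and the names train_bigrams and decode are assumptions chosen for illustration; the description above does not specify which statistical model a dictation system uses, and the candidate lists stand in for acoustic hypotheses.

```python
from collections import defaultdict


def train_bigrams(sentences):
    """Count how often each word follows the previous one ("<s>" marks the start)."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split()
        for prev, cur in zip(words, words[1:]):
            counts[prev][cur] += 1
    return counts


def decode(candidates_per_word, counts):
    """Greedily choose, per position, the candidate seen most often after the previous choice."""
    prev = "<s>"
    result = []
    for candidates in candidates_per_word:
        # Ties (including all-zero counts) resolve to the first candidate listed.
        best = max(candidates, key=lambda w: counts[prev][w])
        result.append(best)
        prev = best
    return result
```

Because no grammar restricts which words may appear, any candidate can be chosen at any position. This is the source of the flexibility, and also of the lower accuracy, noted above.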
These two types of speech recognition systems have their counterparts in the handwriting recognition arena. The “pen gesture” type of handwriting recognition system is analogous to the command and control type of speech recognition system in that the input is constrained by a formatted structure known as a template. The less structured type of handwriting recognition system is known simply as handwriting; it attempts to recognize handwritten input at the letter or word level without a required format.
It has been recognized that using both a speech recognition system and a handwriting recognition system in tandem could significantly improve the speed and accuracy of a translation system. The coupling of the constrained type of each system (i.e., the command and control type of speech recognition system and the pen gesture type of handwriting system) has been possible because both use a constraint system. The command and control type of speech recognition system uses a grammar and the pen gesture type of handwriting recognition system uses a template. The grammar and the template function similarly in their respective systems. The coupling of the two modalities has yielded improved system performance.
A more flexible and versatile system would result from coupling the least constrained type of each system (i.e., the dictation type of speech recognition system and the handwriting type of handwriting recognition system).
An accurate, robust, and easy-to-use translation system for human input is especially important for Chinese language users. The Chinese language is made up of tens of thousands of pictographic words known as ideograms that are combined to create other words. One common way to input information to a processing system is through the use of Pinyin. Pinyin is a system for transcribing the pronunciation of Chinese ideograms into Roman alphabet based words. For many people who use Chinese as their native language, it is an arduous task to input information into a processing system. This difficulty discourages many from using the many devices that rely on human input to a processing system. Whereas in the English language many people can type words faster than they can speak them, this is much more difficult when the words must first be translated from Chinese ideograms.