With the rapid advancement in technology, the modes and means of communication have improved by leaps and bounds to meet the ever-increasing demand of the population.
Short messaging service (SMS), chat and e-mail are some of the common communication modes used by people all over the world. These communication modes are cost effective, easy and comfortable.
In recent times, PDAs, palmtops and handheld personal computers (PCs) are more frequently being used for composing short messages (SMS) and e-mails. These messages are generally composed in English using the conventional keyboards of PCs or the regular keypads of mobile handsets.
Word processing in other languages, such as Germanic, Slavic, Romanic and Indic languages, is a vexing experience, given the constraint of using the regular keyboard, which is designed for the English language.
The present day input modes tend to be less user friendly for individuals from places like India, especially because of the many existing Indic (Indian) language scripts. Further, communicating by short messages and e-mails in the scripts of these languages using the conventional keyboard or mobile keypad is both difficult and time consuming.
A solution that has been employed is the transliteration of texts in these languages (Germanic, Slavic, Romanic and Indic languages) into English, which allows the English keyboard to be used to enter them. However, this requires the user to be able to write the non-English text in the English alphabet, which in turn requires English literacy.
For this reason, a feasible and probably the only emerging option for script independent message composition by the population not literate in English is the use of an electronic pen (e-pen) or a stylus touching a pressure sensitive surface, in lieu of the keyboard, to write sub-word units.
Hence, online handwritten character recognition (OHCR) is of prime importance, especially in the context of script independent communication by short messages and e-mails.
Though advantageous, OHCR is available for the English, Chinese and Japanese languages, while surprisingly little work has been reported for the scripts of other languages such as Germanic, Indic and Romanic languages.
Hence there is an urgent need to provide a method and system to enable online script independent recognition of sub-words and words.
Some of the inventions which deal with providing online handwritten script recognition are as follows:
U.S. Pat. No. 5,550,931 titled “Automatic handwriting recognition using both static and dynamic parameters” provides a method and apparatus for recognizing handwritten characters in response to an input signal from a handwriting transducer. Though the '931 patent provides a feature extraction and reduction procedure that relies on static or shape information, it fails to relate the temporal order in which points are captured by an electronic tablet.
U.S. Pat. No. 6,011,865 titled “Hybrid on-line handwriting recognition and optical character recognition system” provides a method and a system for hybrid on-line handwriting recognition and optical character recognition. Though the '865 patent provides a handwriting recognition system and method that employs both online and off-line handwriting recognition to achieve a recognition accuracy that is improved over the use of either technique alone, it fails to provide and perform feature extraction and spatio-temporal analysis for ascertaining whether the recognized sequences of strokes are valid, which would enhance character recognition accuracy.
U.S. Pat. Nos. 4,284,975, 6,389,166 and 4,365,235 disclose, respectively, a pattern recognition system operating in particular on Chinese handwritten characters; an online handwritten Chinese character recognition apparatus based on character shapes; and a Chinese/Kanji online recognition system consisting of a tablet electronics module, a signal filter and segment integration unit, a base stroke classification unit, a symbol element recognition unit and a symbol recognition output table.
U.S. Pat. No. 7,587,087 titled “On-line handwriting recognition” discloses a method and a device for on-line handwriting recognition, wherein at least one auxiliary line is displayed on a touch sensitive panel. Each of the auxiliary lines constitutes a portion of more than one character of a character set. A character of the character set is drawn on the touch sensitive panel by completing one of the auxiliary lines into the character. The drawn character is recognized on the basis of said completion. Though the '087 patent relates to online handwriting recognition, it fails to recognize the strokes leading to recognition of the character. Instead, the character is recognized only on completion of writing the character.
US patent application number 20060126936 titled “System, method, and apparatus for triggering recognition of a handwritten shape” discloses a technique that uses repetitive and reliably recognizable parts of handwriting, during digital handwriting data entry, to trigger recognition of digital ink and to repurpose handwriting task area properties. Though the '936 application discloses a system and method for handwritten shape recognition, it fails to provide and perform feature extraction and spatio-temporal analysis for ascertaining whether the recognized sequences of strokes are valid, which would enhance character recognition accuracy. Further, the recognition technique of the '936 application attempts to find the character that most closely matches the strokes entered on the tablet and returns the results while the user is still writing, instead of showing results when the user finishes writing.
US patent application number 20080159625 titled “System, Method and Apparatus for Automatic Segmentation and Analysis of Ink Stream” discloses a technique that provides for real-time segmentation of handwritten traces during data entry into a computer. Though the '625 application discloses a system and method for automatic segmentation and analysis of an ink stream, it fails to provide feature extraction and lexicon based, domain specific word knowledge for recognition of characters and words.
US patent application number 20090003705 titled “Feature Design for HMM Based Eastern Asian Character Recognition” provides a method for online recognition of East Asian characters that includes acquiring time sequential, online ink data for a handwritten East Asian character; conditioning the ink data to produce conditioned ink data, where the conditioned ink data includes information as to the writing sequence of the handwritten East Asian character; and extracting features from the conditioned ink data, where the features include a tangent feature, a curvature feature, a local length feature, a connection point feature and an imaginary stroke feature. Though the '705 application discloses a system and method for East Asian character recognition, it fails to identify and construct a primitive stroke database which encompasses the handwritten script, together with a recognition engine which recognizes these primitives prior to character and word recognition. It further does not provide for lexicon based, domain specific word knowledge used for identification of characters and words.
PCT application number 2006090404 titled “System, Method, and Apparatus for Accommodating Variability in Chunking the Sub-Word Units of Online Handwriting” provides a technique for automatic real-time segmentation of an ink stream that does not require learning any chunking methodology, style of writing, and/or a predefined symbol set. In one example embodiment, this is achieved by drawing one or more strokes associated with a desired word of a script in one or more boxes provided on a digitizer screen using a pen. Though the '404 application discloses a system and method for online handwriting recognition, it fails to provide for feature extraction and spatio-temporal analysis for ascertaining whether the recognized sequences of strokes are valid, which would enhance character recognition accuracy.
The current state of the art restricts the universal application of the short messaging service and e-mail communication modes to script dependent recognition of online handwritten sub-word units and words. Hence there is an urgent need to provide a method and system to enable communication using the existing short messaging service and e-mail communication means by employing an application for online, script independent recognition of handwritten sub-word units and words.
In light of the above mentioned prior art, it is evident that there is a need for a customizable solution for online script independent recognition of handwritten sub-word units and words.
In order to address the long felt need for such a solution, the present invention provides a method and system for online script independent recognition of handwritten sub-word units and words. More particularly, the present invention relates to a system and method which enables online, script independent recognition of sub-word units and words by recognizing the individual written strokes prior to recognition of the sub-word units and words.
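By way of illustration only, the stroke-first ordering described above may be sketched as follows. This is a minimal sketch under stated assumptions and not the claimed implementation: the four-direction quantization, the toy tables PRIMITIVE_STROKES, CHARACTERS and LEXICON, and all function names are hypothetical and are introduced solely for this example. Each stroke is first matched against a primitive stroke table; only a valid sequence of recognized primitives is mapped to a character; and a composed word is accepted only if it appears in the lexicon.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Stroke = Sequence[Point]

# Hypothetical primitive stroke table: direction-code sequence -> primitive label.
# Codes: 0 = east, 1 = north, 2 = west, 3 = south (y axis pointing up).
PRIMITIVE_STROKES: Dict[Tuple[int, ...], str] = {
    (0,): "hbar",
    (3,): "vbar",
    (3, 0): "corner",
}

# Hypothetical character table: ordered primitive labels -> character.
CHARACTERS: Dict[Tuple[str, ...], str] = {
    ("hbar", "vbar"): "T",
    ("vbar",): "I",
    ("corner",): "L",
}

# Hypothetical domain-specific lexicon.
LEXICON = {"IT", "LIT", "TILT"}


def direction_codes(stroke: Stroke) -> Tuple[int, ...]:
    """Quantize pen movement into four directions, merging consecutive repeats."""
    codes: List[int] = []
    for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
        if (x0, y0) == (x1, y1):
            continue  # pen did not move between these samples
        angle = math.atan2(y1 - y0, x1 - x0)
        code = round(angle / (math.pi / 2)) % 4
        if not codes or codes[-1] != code:
            codes.append(code)
    return tuple(codes)


def recognize_stroke(stroke: Stroke) -> Optional[str]:
    """Step 1: recognize an individual stroke as a primitive, if known."""
    return PRIMITIVE_STROKES.get(direction_codes(stroke))


def recognize_character(strokes: Sequence[Stroke]) -> Optional[str]:
    """Step 2: map a temporally ordered sequence of primitives to a character."""
    labels = tuple(recognize_stroke(s) for s in strokes)
    if None in labels:
        return None  # at least one stroke is not a known primitive
    return CHARACTERS.get(labels)


def recognize_word(characters: Sequence[Sequence[Stroke]]) -> Optional[str]:
    """Step 3: compose characters and accept the word only if the lexicon has it."""
    chars = [recognize_character(c) for c in characters]
    if None in chars:
        return None
    word = "".join(chars)
    return word if word in LEXICON else None
```

In this sketch the temporal order of the strokes matters: the same two strokes written in the reverse order would not match the ("hbar", "vbar") entry. A practical system would replace the exact table lookups with a trained classifier and tolerant matching.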