There is a long-felt need for an automated way of recognizing hieroglyphic-based written languages such as Chinese, Japanese and Korean.
There are several thousand characters in the Chinese language. For example, there are nearly 7000 Chinese characters supported by the GB coding standard used in China, and about 13000 Chinese characters supported by the Big5 coding standard used in Taiwan and Hong Kong. Some words in Chinese have just one character in them, while the majority of words have two or more characters. The average number of characters in a word is about 2.5. The large number of characters makes the design of a Chinese handwriting recognition system a difficult one. Previous research has considered breaking down handwritten Chinese characters into sub-structures. These sub-structures then constitute an alphabet for handwritten Chinese, and all handwritten Chinese characters can be expressed using this alphabet. A known choice for sub-structures in Chinese characters is the set of radicals used in a Chinese dictionary. Radicals are of limited use as sub-structures when it comes to Chinese handwriting recognition, due to the relatively large number of them (there are between 500 and 600 radicals), and the difficulty in machine extraction of radicals from a handwritten input. Similar problems exist in recognizing writing in other languages.
A popular paradigm for character recognition of handwritten Chinese input is to store one or more templates for each character of interest and to use a nearest neighbor classifier to find the identity of the handwritten input. A nearest neighbor classifier is one that simply finds the nearest or best matching template and reports the identity of that template as the identity of the handwritten input. Since the Chinese language has several thousand characters, this paradigm requires a relatively large amount of memory to store the templates for all the characters.
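The template-matching paradigm described above can be illustrated with a minimal sketch. The sketch assumes that each handwritten input and each stored template has already been reduced to a fixed-length numeric feature vector, and that Euclidean distance is the matching measure; neither assumption comes from the text above. The names `distance`, `classify` and `TEMPLATES`, and the feature values, are hypothetical.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(features, templates):
    """Return the label of the stored template nearest to the input features."""
    best_label, best_dist = None, math.inf
    for label, template in templates:
        d = distance(features, template)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

# Hypothetical template store: one or more templates per character.
# The four-element feature vectors are invented for illustration only.
TEMPLATES = [
    ("一", [0.0, 0.5, 1.0, 0.5]),
    ("一", [0.1, 0.5, 0.9, 0.6]),
    ("丨", [0.5, 0.0, 0.5, 1.0]),
]

if __name__ == "__main__":
    print(classify([0.05, 0.5, 0.95, 0.5], TEMPLATES))
```

In this sketch every character of interest needs at least one entry in the template store, so the store grows with the size of the character set, which is the memory cost noted above.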
There is a need for a better alphabet for a handwritten language, with handwriting recognition in mind, and a need for an improved method of handwriting recognition.