Kanji is a Japanese system of writing that utilizes characters borrowed or adapted from Chinese writing. The elements of grammar in Kanji are known as "Kanji characters." The phrase "elements of grammar" refers to units of a given natural language that are capable of comprising parts of speech. For example, the elements of grammar in the English language are words. As such, each Kanji character is a higher-order linguistic symbol that is analogous to a word in the English language. More specifically, natural languages tend to have three levels of linguistic elements. The first and lowest level depends on the specific alphabet used and is associated with the sounds of the spoken language. In the English language, for example, the first level comprises letters. The third and highest level contains linguistic elements that convey full creative expression. In the English language, the third level comprises sentences. It is the second, intermediate level of linguistic elements to which the phrase "elements of grammar" refers. In the English language, the second level comprises words. In Japanese, the second level comprises Kanji characters.
Kanji characters typically comprise radicals. A "radical" is a part of a Kanji character, much as a letter is a part of a word. Often, a radical is itself a Kanji character. For example, FIG. 1 depicts a Kanji character 100 that comprises two radicals 102 and 104. Radical 102 is the "day" radical and radical 104 is the "month" radical. When combined, the resulting Kanji character 100 means "open." There is a well-known, standard set of 214 radicals that are referred to as "traditional radicals." FIGS. 2A and 2B depict the set of traditional radicals 200. Within the set of traditional radicals 200, the radicals are numbered from 1 to 214, and alternative drawings of a radical are indicated with either parentheses or brackets (e.g., "(32)").
Some conventional computer systems for recognizing Kanji handwriting have focused on recognizing traditional radicals in order to recognize a Kanji character. This technique is known as "radical recognition." These conventional systems have attained higher accuracy in recognizing Kanji characters than previous systems, and have reduced the amount of data that must be stored when performing Kanji character recognition. However, the conventional radical recognition approach suffers from three drawbacks. First, it is difficult to determine which of the traditional radicals should be used. Some of the traditional radicals are individual ("atomic") radicals and others are combinations of atomic radicals. Hence, a decision must be made whether to use the atomic radicals, the combination radicals, or both. Second, after the set of radicals is determined, each radical typically must be manually entered into a database and mapped onto the Kanji characters that utilize it. This procedure is time consuming. Third, the conventional approach is nonextensible. That is, the conventional approach cannot be used with non-Kanji-based languages. Also, after the radicals are mapped onto the Kanji characters, if the system is to be extended to recognize new Kanji characters, the set of radicals and the set of recognized Kanji characters usually have to be augmented manually. That is, the additional Kanji characters have to be entered manually into the system and associated with their component radicals, which is a time-consuming task. The set of recognized Kanji characters is likely to need such augmenting, since there are over 50,000 Kanji characters and most Kanji handwriting recognition systems recognize only a few thousand. In view of these drawbacks, it is desirable to improve conventional radical recognition systems.