Computer users are accustomed to using "checking" program modules (e.g., spell checkers, grammar checkers, and consistency checkers) for alerting users to words within a document that are questionable based on some predefined set of rules. For example, if a word is found in a document, but is not found in a spell checker's dictionary, then the word can be marked to indicate that it is questionable. Similarly, if a correctly spelled word is found in the spell checker's dictionary, but its spelling is inconsistent with other variants of the word in the same document (e.g., color and colour), then the lesser-used variant (or all of the variants) might be marked as questionable.
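The consistency rule described above can be illustrated with a minimal sketch. The variant table `VARIANT_GROUPS` and the function `find_questionable_variants` are hypothetical names introduced only for this illustration, and the tie-handling policy (marking all variants when none is lesser-used) is an assumption based on the "or all of the variants" alternative mentioned above. It is not a description of any particular checker.

```python
import re
from collections import Counter

# Hypothetical variant table: each spelling maps to a shared group key.
VARIANT_GROUPS = {
    "color": "color",
    "colour": "color",
}

def find_questionable_variants(text):
    """Return the set of spellings that are inconsistent within `text`.

    For each variant group that occurs with more than one spelling, the
    lesser-used spellings are returned. If every spelling is used equally
    often, all of them are returned (an assumed tie policy).
    """
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w in VARIANT_GROUPS)

    # Collect the per-spelling counts under each group key.
    by_group = {}
    for spelling, n in counts.items():
        by_group.setdefault(VARIANT_GROUPS[spelling], {})[spelling] = n

    questionable = set()
    for spellings in by_group.values():
        if len(spellings) < 2:
            continue  # only one spelling used, so nothing is inconsistent
        top = max(spellings.values())
        lesser = {s for s, n in spellings.items() if n < top}
        # Fall back to marking every spelling when the counts are tied.
        questionable |= lesser or set(spellings)
    return questionable
```

Called on a text that uses "color" twice and "colour" once, the sketch returns only "colour"; on a text that uses each spelling once, it returns both.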
Japanese language consistency checkers are typically more complex than English language consistency checkers because they must accommodate multiple acceptable spelling variants of a particular word. Typically, a document of Japanese text employs more than one writing system, with each system having its own character set. The most commonly used Japanese writing systems are Kanji, Hiragana, and Katakana. Kanji is a writing system composed of logographic characters, most of which are derived from Chinese characters. Hiragana is a writing system that is phonetic in nature and shares no characters with Kanji. Katakana is another phonetic writing system that is primarily used for writing words borrowed from Western languages, and it also shares no characters with Kanji. Kanji characters are analogous to shorthand variants of Hiragana words in that any Kanji word can be written in Hiragana, though the converse is not true. A single Japanese word can include characters from more than one writing system. For example, a correctly spelled word may be written using two Kanji characters, one Kanji character followed by two Hiragana characters, or four Hiragana characters. In short, the challenge that documents containing Japanese text present to consistency checking programs is that a single word can have several acceptable written forms, and a Japanese consistency checker must accommodate all of them.
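The mixing of writing systems within a single word can be illustrated with a short sketch that labels each character by the Unicode block it falls in. The function names `script_of` and `scripts_in` are hypothetical, and the Kanji test is an assumption that covers only the basic CJK Unified Ideographs block (U+4E00 to U+9FFF), ignoring the extension blocks. The word 食べる ("to eat") is used purely as an example of one Kanji character followed by two Hiragana characters.

```python
def script_of(ch):
    """Classify one character by Unicode block (simplified ranges)."""
    code = ord(ch)
    if 0x3040 <= code <= 0x309F:
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:
        # Basic CJK Unified Ideographs block only (an assumption).
        return "kanji"
    return "other"

def scripts_in(word):
    """Return the writing system of each character in `word`, in order."""
    return [script_of(ch) for ch in word]

# The same word written two ways: mixed Kanji and Hiragana, then all Hiragana.
mixed = scripts_in("食べる")
phonetic = scripts_in("たべる")
```

Here `mixed` is `["kanji", "hiragana", "hiragana"]` and `phonetic` is `["hiragana", "hiragana", "hiragana"]`, so the two strings differ in their character sequences even though they spell the same word. This is why a consistency checker cannot rely on simple string comparison to recognize variants.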
A problem with currently available Japanese consistency checkers is that they do not provide a sufficient means for generating all of the common Japanese spelling variants. Because a document employing more than one Japanese writing system may include many acceptable word variants, the user may desire to be prompted when a word has been spelled inconsistently with other occurrences of the same word, that is, when one variant differs from the others used in the same document. Currently available Japanese consistency checkers rely on manual variant generation, thereby incurring the risk of overlooking common spelling variants.
Accordingly, there is a need for a Japanese language consistency checker that is capable of identifying and generating substantially all acceptable spelling variants of a particular Japanese word. The Japanese language consistency checker should also be capable of identifying spelling variants that are used inconsistently with other spelling variants in the same document. The consistency checker should further be capable of maintaining statistics on the use of each spelling variant within a particular document, thereby enabling the consistency checker to identify lesser-used variants.