The prior art is replete with word processing programs, including a couple of contemporary favorites, Microsoft WORD and Novell's WordPerfect, that are used by a substantial portion of IBM-compatible computer users. These programs are used in known ways for permitting authors to create electronic text (and graphics) documents. As a part of such word processing programs, a spell-checking routine is almost always included to help authors reduce the number of unintentional text errors in such documents. A number of prior art patents are directed to this feature, and a reasonable background of the same is described in U.S. Pat. No. 5,604,897 to Travis and U.S. Pat. No. 5,649,222 to Mogilevsky, both of which are hereby incorporated by reference.
It is apparent, however, that spell-checking routines associated with such word processing programs have a number of limitations. Key among these is the fact that they cannot determine whether a particular word choice, while accurately spelled, is nevertheless inappropriate for the particular context within a particular document. As an example, many words that may be intended by a drafter (such as the words “ask,” “suit,” “public,” etc.) can be transformed into potentially offensive words merely by changing a single letter in such words, transposing a few letters, or mistakenly adding or dropping a letter. These transformed words, however, will still pass the spell-checking facility, because many such facilities include even a number of offensive words as part of their standard dictionaries. For example, the word “ask” may be inadvertently written as “ass,” and unless the message is intended to discuss issues pertaining to certain members of the animal kingdom, it is likely to be an inappropriate word choice. If these inadvertent mistakes are not caught by the drafter during a later review, they will be included in such document and potentially communicated to one or more third parties.
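The failure mode described above can be illustrated with a minimal sketch. The tiny dictionary and the list of sensitive words are hypothetical stand-ins chosen for illustration (only the “ask”/“ass” pair comes from the text), and the function names are likewise assumptions. The sketch generates every string that is one substitution, adjacent transposition, insertion, or deletion away from an intended word, and reports those that a spell-checker would accept even though they are potentially inappropriate.

```python
import string

# Hypothetical stand-ins: a tiny spell-check dictionary and a list of
# words a drafter might consider inappropriate. Only "ask" -> "ass"
# is taken from the text; everything else is illustrative.
SPELL_DICTIONARY = {"ask", "ass", "as", "task", "suit", "public"}
SENSITIVE_WORDS = {"ass"}


def single_edit_variants(word):
    """Return all strings one deletion, adjacent transposition,
    substitution, or insertion away from `word` (excluding `word`)."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    transposes = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    substitutions = {a + c + b[1:] for a, b in splits if b for c in letters}
    inserts = {a + c + b for a, b in splits for c in letters}
    return (deletes | transposes | substitutions | inserts) - {word}


def risky_slips(intended):
    """Variants of `intended` that pass the spell-check dictionary
    yet appear on the sensitive-word list."""
    return sorted(
        v for v in single_edit_variants(intended)
        if v in SPELL_DICTIONARY and v in SENSITIVE_WORDS
    )


print(risky_slips("ask"))
```

Because the slip is itself a dictionary word, a conventional spell-checker has no basis for flagging it; only a separate notion of word sensitivity exposes the problem.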
The possibility of such errors is increasing each day because of a number of driving factors, including the fact that standard dictionaries for word processors are growing in size to accommodate the largest possible number of words in a particular language. While one solution may be to not include such words in an electronic dictionary in the first place, this approach makes the creation of such dictionaries more complicated because an initial censoring must be done before the words are even translated into electronic form. Moreover, this solution does not help the user to identify inappropriate words that may be skipped over during a spell-checking routine.
Another factor leading to an increase in electronic word choice errors is the fact that many electronic documents are never reduced to a physical form before being disseminated. In many instances a glaring error is caught by a human inspection of a printed page before it is sent out. The so-called “paperless office,” while improving efficiency and reducing waste, also naturally causes a larger number of inadvertent message errors in text documents. Additional errors can even be induced by spell-checkers, because when they detect a mis-spelled word they will often provide a menu of potential word choices as replacements, and it is remarkably easy to select an inappropriate word choice from such menu, again merely by accident. Such errors will not be detected, because many users erroneously consider the document to be “safe” after spell-checking has completed and will not check it again. In other words, some facility for checking the spell-checker dynamically is also desirable, but does not exist at this time.
There is some facility in the prior art for permitting users to create so-called “exclusion” dictionaries for analyzing text documents. An example of such kind of system is illustrated in U.S. Pat. No. 5,437,036 to Stamps et al., which is incorporated by reference herein. A drawback of this approach, however, lies in the fact that it requires the user to both divine and manually input all the potential mis-spellings that could occur, and even if the user had the time, there is obviously an endless variety that might never be considered by such user. For example, a user may not have the foresight to notice that a simple transposing of two characters (a common error) may generate a word that is extremely offensive. Furthermore, Stamps et al. do not appear to contemplate the possibility that the act of rendering a document “spelling” error free may itself generate unintended word selection errors. As such, therefore, Stamps et al. is not truly a “word” checker, but, rather, an enhanced spell checker that has been sensitized to a particular user's poor spelling habits. While it incidentally determines whether a word is perhaps not the intended choice of the author (i.e., that the word does not have a particular meaning), it does not perform the important step of determining the precise meaning of the word, and in particular whether the word also has a potentially inappropriate meaning as well.
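The drawback of a user-maintained exclusion list can be seen in a minimal sketch. This is not the Stamps et al. implementation; it only illustrates the general idea of flagging words that a user has entered by hand, and the function name, the sample sentence, and the list contents are assumptions made for illustration.

```python
import re


def flag_excluded(text, exclusion_list):
    """Return, in document order, each word of `text` that the user
    has manually placed on `exclusion_list`."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w in exclusion_list]


# Hypothetical list: the user anticipated exactly one problem word.
user_exclusions = {"ass"}

print(flag_excluded("Please ass the dame at the front desk.", user_exclusions))
```

Only the word the user thought to enter is reported. Any other potentially inappropriate word in the sentence passes silently, which is the coverage gap the paragraph above describes.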
A few methods for proof-reading electronic documents are also known in the art. U.S. Pat. No. 4,674,065 to Lange et al., also incorporated by reference herein, describes a technique for detecting word context errors in a document. This technique seems limited to homophones, however (for example, it knows to see if a user intended to use the word “course” instead of “coarse”), and is not generally applicable to the problem of determining inappropriate use of language in documents. For example, unless a particularly offensive word has a homonym, Lange et al. would not even detect such word as being a problem. The approach of Lange et al. further requires a fair amount of computational complexity, since it must analyze the text preceding and following a word and use a complicated set of syntax rules to determine whether the word is being used correctly in context. This fact alone makes it essentially unusable for most contemporary word processing programs, which utilize background spell-checking, dynamic spell-checking, etc.
Finally, U.S. Pat. No. 4,456,973 to Carlgren et al., also incorporated by reference herein, discusses the use of an electronic word dictionary that has an associated code field for indicating the level of comprehensibility of such word. For example, the word “abandon” is coded with a numerical designation 6, indicating that the word is probably understandable by children at the 6th grade level. Carlgren et al., however, do not appear to address the more general problem of identifying text that has been inadvertently mis-spelled by an author, and which is likely to be inappropriate. In other words, the Carlgren et al. approach presumes that the user has correctly input the word in question, and unless the word is coded with a rating above that of the intended grade group of children, it is not flagged in any way. It is apparent that this method of encoding is fairly impractical for use in an electronic dictionary intended to be used by an adult population, because adults are not classified in this way. In fact, if a target audience of a document is intended to be primarily adults, then the Carlgren et al. approach would not flag any words at all, because the audience would probably be presumed to be operating at the highest level of education (12), thus rendering this type of filtering essentially useless. In addition, there is no facility mentioned by Carlgren et al. for detecting words that are likely to be offensive, even if consciously selected by the author. For example, the use of the word “dame” may be consciously selected but nevertheless undesirable in communications in which the intended audience is primarily adult women. A drafter of an electronic document may desire to be notified of such potentially offensive words if they are known to be sensitive.
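The limitation of grade-level coding for an adult audience can be sketched briefly. The sketch assumes that a word is flagged when its code exceeds the grade of the intended audience, which is consistent with the observation above that an audience presumed to be at level 12 triggers no flags. Only the code 6 for “abandon” comes from the text; the other codes, the word list, and the function name are hypothetical.

```python
# Hypothetical grade-level codes. Only "abandon" -> 6 is taken from
# the text; the remaining entries are illustrative assumptions.
GRADE_LEVEL = {"abandon": 6, "dame": 4, "ubiquitous": 11}


def flag_above_grade(words, audience_grade):
    """Return the words whose comprehensibility code exceeds the
    audience's grade level; uncoded words are never flagged."""
    return [w for w in words if GRADE_LEVEL.get(w, 0) > audience_grade]


document_words = ["abandon", "dame", "ubiquitous"]

# A 4th-grade audience triggers flags for the harder words.
print(flag_above_grade(document_words, audience_grade=4))

# An adult audience presumed to be at grade 12 triggers no flags at all,
# and "dame" is never flagged since its code says nothing about offensiveness.
print(flag_above_grade(document_words, audience_grade=12))
```

A single comprehensibility code per word carries no information about whether the word is sensitive for a given audience, so a filter of this kind cannot notify the drafter of a word such as “dame” regardless of the threshold chosen.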