The present invention relates to natural language processing. In particular, the present invention relates to grammar checker processing of natural language text.
A computer program that checks a user's grammar for correctness is called a grammar checker. Upon finding a mistake, a grammar checker usually flags the error to the user and suggests a correction. Users find grammar checkers to be very helpful. However, the usefulness of a grammar checker depends on the quality of the corrections it suggests. The more accurate the suggested corrections, the more satisfied the user.
Grammar checkers are often evaluated by independent sources who assess the effectiveness of the grammar checker. During these evaluations, the testers or press reviewers often introduce unnatural errors into a textual input (in addition to natural ones) to see whether the grammar checker is able to identify these unnatural errors and provide a correction for them. For example, in French, pronoun word order is extremely important, and is dependent upon the specific inflection of the verb in the sentence. Often in a test text the tester will provide a textual input such as “Il en leur parle.” (He talks to them about it.) to test the effectiveness of the grammar checker. This input has a pronoun word ordering error. The grammatically correct version of this phrase is “Il leur en parle.” The tester is looking for the grammar checker to identify the ordering error and to suggest the correct replacement to the user.
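The ordering check described in the example above can be illustrated with a minimal sketch. It assumes the standard order of French preverbal object pronouns in declarative sentences (me/te/se/nous/vous, then le/la/les, then lui/leur, then y, then en). The names `CLITIC_RANK` and `suggest_clitic_order` are hypothetical and are introduced only for illustration. The sketch does not disambiguate words such as "le" (article) or "leur" (possessive), and it does not handle imperatives.

```python
# Hypothetical rank table: lower rank means the pronoun comes earlier.
# Assumes declarative word order only.
CLITIC_RANK = {
    "me": 0, "te": 0, "se": 0, "nous": 0, "vous": 0,
    "le": 1, "la": 1, "les": 1,
    "lui": 2, "leur": 2,
    "y": 3,
    "en": 4,
}


def suggest_clitic_order(sentence):
    """Return a reordered sentence, or None if no reordering is needed."""
    words = sentence.split()
    fixed = []
    run = []  # current run of adjacent pronoun candidates
    # The trailing None is a sentinel that flushes the last run.
    for word in words + [None]:
        if word is not None and word.lower() in CLITIC_RANK:
            run.append(word)
            continue
        # sorted() is stable, so pronouns of equal rank keep their order.
        fixed.extend(sorted(run, key=lambda w: CLITIC_RANK[w.lower()]))
        run = []
        if word is not None:
            fixed.append(word)
    return " ".join(fixed) if fixed != words else None


print(suggest_clitic_order("Il en leur parle."))
```

For the test input discussed above, the sketch returns "Il leur en parle." and returns `None` when the pronouns are already in the expected order.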
However, traditional French grammar checkers have ignored this type of error, because it is assumed that a native speaker would not make this ordering error. Thus, this error is referred to as an unnatural error. As current generation grammar checkers are not programmed to recognize this error, no indication is made to the user. Even if an indication is made to the user, no viable correction is provided. This results in a lower rating of the grammar checker by the tester.
Further, with the advent and increased use of machine translators, these types of ordering errors are becoming more frequent in translated texts, because the machine translator by design does not understand all of the grammar rules of the language into which a text is being translated. Machine translators often return results that include word ordering errors caused by the different sentence structures of the two languages, ignored or omitted elision (contraction) (e.g., "le en" in French should be "l'en"), and ignored or omitted compounding (e.g., in Hebrew and Arabic, a preposition followed by a pronoun is compounded into a single word, as in the Hebrew for "to you" and the Arabic for "for him"). Often these errors are missed by a reader reviewing the translated documents, because the reader mentally corrects the errors without realizing it. Therefore, it is desirable to have a grammar checker that is able to identify these errors and provide corrections to improve the finished machine-translated documents.
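The elision case mentioned above ("le en" becoming "l'en") can be illustrated with a minimal sketch. It assumes a small, hypothetical list of elidable words (`ELIDABLE`) and treats any following word that begins with a vowel letter as triggering elision. Words beginning with "h" or "y" are deliberately left out, since handling them correctly requires a lexicon. The function name `apply_elision` is likewise an assumption made for illustration.

```python
import re

# Hypothetical list of French words that elide before a vowel.
ELIDABLE = ("je", "me", "te", "se", "le", "la", "ne", "de", "que")

# Match an elidable word followed by whitespace and a vowel-initial word.
# The lookahead leaves the following word untouched.
_PATTERN = re.compile(
    r"\b(" + "|".join(ELIDABLE) + r")\s+(?=[aeiouàâéèêëîïôû])",
    re.IGNORECASE,
)


def apply_elision(text):
    """Replace the final vowel of an elidable word with an apostrophe."""
    return _PATTERN.sub(lambda m: m.group(1)[:-1] + "'", text)


print(apply_elision("Il le en informe."))
```

For this input the sketch produces "Il l'en informe." A production grammar checker would also need to exclude cases in which elision does not apply, which this pattern does not attempt.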