Automated proofing tools are poorly suited to texts written by persons who are not native speakers of the language in which they are writing. A native language is generally understood to be the language that a particular individual learns first, although in some instances that may not be the case. Increasingly, people around the world create texts in languages other than their native languages. Most notably, many people who are not native English speakers create texts in English. These texts can be created in word processors, e-mail applications, or web page development software, to name a few examples. Despite the large and growing number of people who prepare such documents outside of their native language, useful editorial assistance in the form of proofing tools geared to their needs is surprisingly hard to obtain.
Proofing tools such as grammar checkers available in word processors and other text generation tools have been designed primarily with native language speakers in mind. As a result, such tools do not address the challenges of proofing texts written by persons who are not native language speakers. For example, a major difficulty associated with using native-language-centric proofing tools to proof text written by a non-native language speaker is that errors of grammar, lexical choice, idiomaticity, and style rarely occur in isolation. Instead, any given sentence produced by a non-native language writer may involve a complex combination of all these error types. Consider the following example, found on the World Wide Web and written by someone whose native language is Korean, which involves the misapplication of countability to a mass noun:
    And I knew many informations about Christmas while I was preparing this article.
When proofing tools designed for text written by native language writers are used to examine this sentence, they correctly (in the context of examining a native language writer's text) suggest that “much” should be substituted for “many” and “information” should be substituted for “informations”. Despite these changes, the resultant sentence, “And I knew much information about Christmas while I was preparing this article”, does not read as if it were written by an experienced, native language writer. Substituting the word “much” for “many” leaves the sentence stilted in a way that is probably undetectable to an inexperienced non-native speaker. In addition, the use of the word “knew” represents a lexical selection error that falls well outside the scope of conventional proofing tools. A better rewrite of the original sentence might be:
    And I learned a lot of information about Christmas while I was preparing this article.
or, even more colloquially:
    And I learned a lot about Christmas while I was preparing this article.
Repairing the error in the original sentence, then, is not a simple matter of fixing an agreement marker or substituting one determiner for another. Instead, wholesale replacement of the phrase “knew many informations” with the phrase “learned a lot” is needed to produce idiomatic-sounding output. It is difficult enough to design a proofing tool that can reliably correct individual errors; the simultaneous combination of multiple errors is beyond the capabilities of current proofing tools designed for native speakers.
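The contrast between word-by-word correction and phrase-level replacement can be illustrated with a minimal Python sketch. The rule tables and function names below are hypothetical and exist only for illustration; they do not represent how any actual proofing tool is implemented.

```python
# Hypothetical rule tables, invented for this illustration only.
WORD_FIXES = {"many": "much", "informations": "information"}
PHRASE_FIXES = {"knew many informations": "learned a lot"}


def fix_words(sentence: str) -> str:
    """Apply each single-word correction independently of its context."""
    return " ".join(WORD_FIXES.get(word, word) for word in sentence.split())


def fix_phrases(sentence: str) -> str:
    """Replace whole erroneous phrases with idiomatic alternatives."""
    for bad, good in PHRASE_FIXES.items():
        sentence = sentence.replace(bad, good)
    return sentence


original = ("And I knew many informations about Christmas "
            "while I was preparing this article.")

word_level = fix_words(original)      # grammatical but stilted
phrase_level = fix_phrases(original)  # idiomatic

print(word_level)
print(phrase_level)
```

The word-level pass reproduces the stilted “knew much information” result discussed above, while the phrase-level pass yields the colloquial rewrite. A hand-written phrase table like this one obviously does not scale, which is the difficulty the passage describes.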
Moreover, despite growing demand for proofing tools that address the needs of non-native language writers, there has been remarkably little progress in this area. Research into computer feedback for non-native language writers remains largely focused on small-scale pedagogical systems implemented within the framework of Computer-Aided Language Learning (CALL). In addition, commercial grammar checkers for non-native language writers remain brittle and difficult to customize for writers of different native language backgrounds and skill levels.
Some researchers have begun to apply statistical techniques to identify learner errors in the context of essay evaluation, to detect non-native text, and to support lexical selection by non-native language writers through first-language translation. However, none of this work appears to directly address the more general problem of how to robustly provide feedback to non-native writers in a way that is easily tailored to different native language backgrounds and to different skill levels in the non-native language in which they are writing.
The discussion above is merely provided for general background information and is not intended to be used as an aid in determining the scope of the claimed subject matter.