In recent years, text processing systems have been used to process a body of text (a corpus) and derive statistics about it. Such statistics can be used for developing language models, creating classification models, spellchecking, plagiarism detection, etc. One example statistic is a count of the n-grams (contiguous sequences of n items, such as words or characters) that appear in the text.
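For illustration only, the following minimal sketch shows one way an n-gram count could be computed over a small corpus. It is not taken from the disclosed examples: the function name `count_ngrams`, the sample sentence, and the whitespace tokenization are all assumptions made for the sake of the example.

```python
from collections import Counter


def count_ngrams(tokens, n):
    """Count every contiguous run of n tokens (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Slide a window of width n across the tokens; tuples serve as hashable keys.
    # If there are fewer than n tokens, the range is empty and so is the result.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Assumed sample corpus, split on whitespace (a deliberately naive tokenizer).
corpus = "the cat sat on the mat and the cat slept"
tokens = corpus.split()

bigrams = count_ngrams(tokens, 2)
print(bigrams[("the", "cat")])  # the bigram "the cat" occurs twice
```

A practical system processing a large corpus would likely need a more careful tokenizer and a counting strategy that scales beyond a single in-memory table; this sketch only shows what the statistic itself is.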
The figures are not to scale. Wherever possible, the same reference numbers will be used throughout the drawing(s) and accompanying written description to refer to the same or like parts.