There are many providers of systems for content filtering of text messages such as email. Typically, in these types of filtering systems, emails are organized according to a specified criterion. Most often, the filtering process is performed automatically. However, human intervention is sometimes used in lieu of, or in addition to, the automated filtering.
In theory, given enough time, the best and most accurate way to judge the suitability of the content of a text message is by human filtering. However, in practice, this is a slow, error-prone and expensive endeavor.
In automated systems typically used for emails, software filtering is used wherein each email either passes through the filter unchanged, is redirected elsewhere, or is marked as junk. In some cases, the filtering software may edit the email message during processing to change or delete any objectionable content.
Today, these software filters use various criteria for sorting emails. In some instances, filtering decisions are based on regular expression matching. In other instances, filtering decisions are based on keywords found within the message. Additionally, some systems use historical training data from previous emails as a guide in the classification process. Some keyword-based systems maintain lists of suitable and unsuitable keywords, wherein the presence of a listed keyword in a message is used to determine whether the message is acceptable or unacceptable.
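By way of illustration only, a conventional filter of the kind described above might be sketched as follows. The pattern, the keyword lists, the `Verdict` names and the rule that maps keyword counts to an outcome are all hypothetical and are chosen solely to show the general approach; they are not taken from any particular prior system.

```python
import re
from enum import Enum


class Verdict(Enum):
    PASS = "pass"          # message goes through unchanged
    REDIRECT = "redirect"  # message is sent elsewhere, e.g. for review
    JUNK = "junk"          # message is marked as junk


# Hypothetical rule sets, for illustration only.
JUNK_PATTERNS = [re.compile(r"\bfree\s+money\b", re.IGNORECASE)]
UNSUITABLE_KEYWORDS = {"lottery", "winner"}
SUITABLE_KEYWORDS = {"invoice", "meeting"}


def classify(message: str) -> Verdict:
    # Regular expression matching: any hit marks the message as junk.
    if any(pattern.search(message) for pattern in JUNK_PATTERNS):
        return Verdict.JUNK

    # Keyword matching: compare the distinct words of the message
    # against the suitable and unsuitable lists.
    words = set(re.findall(r"[a-z']+", message.lower()))
    unsuitable_hits = len(words & UNSUITABLE_KEYWORDS)
    suitable_hits = len(words & SUITABLE_KEYWORDS)

    # Assumed rule: redirect when unsuitable keywords outnumber suitable ones.
    if unsuitable_hits > suitable_hits:
        return Verdict.REDIRECT
    return Verdict.PASS
```

In this sketch, a call such as `classify("You are a lottery winner")` yields `Verdict.REDIRECT`, while a message containing only listed suitable keywords passes unchanged.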
As will be appreciated by those having skill in the art, prior email filters are limited in their ability to effectively filter short text messages or other messages wherein various abbreviations and creative misspellings are used as substitutes for correctly spelled words or phrases. Additionally, in some of today's email systems, “junk email” is generated by organized groups and not individuals. This manifests as regularities and detectable patterns in the junk mail. However, today's email filtering techniques are not capable of handling user-generated content that is produced by millions of independent users and that has fewer discernible regularities and patterns. As a result, email filters cannot effectively discern abbreviations and misspellings that are substitutes for content that should not pass through the filtering process.
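The limitation can be seen in a minimal sketch of an exact-match keyword check. The blocked word and the function name `contains_blocked` are hypothetical and serve only to illustrate how a substituted spelling escapes a filter that compares whole tokens against a fixed list.

```python
import re

# Hypothetical list of unsuitable keywords, for illustration only.
BLOCKED = {"lottery"}


def contains_blocked(message: str) -> bool:
    # Split on whitespace, trim common punctuation, and compare each
    # token against the list using exact matching only.
    tokens = re.findall(r"\S+", message.lower())
    return any(token.strip(".,!?") in BLOCKED for token in tokens)


# The correctly spelled word is caught; the substitutes are not.
caught = contains_blocked("Win the lottery!")
missed_digit = contains_blocked("Win the l0ttery!")
missed_abbrev = contains_blocked("Win the lotto")
```

Here only the first message is flagged. The digit substitution and the abbreviation convey the same content to a human reader, yet neither matches the listed keyword.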
Accordingly, the unique invention disclosed herein provides a solution for the above-discussed problems in filtering short text messages and other text messages.