Technical Field
Embodiments of the present invention relate to classifying content, and more specifically to on-the-fly pattern recognition with configurable bounds.
Background
Today, many entities are increasingly concerned with the use of their computing and networking resources to access the Internet. Various content filtering mechanisms are available to manage and/or control user access to contents (e.g., web pages and/or emails) from the Internet via facilities provided by the entities. Contents, as used herein, broadly refer to expressive work, which may include one or more of literary, graphic, audio, and video data. For example, a company typically implements some form of content filtering mechanism to control the use of the company's computers and/or servers to access the Internet. Access to contents within certain predetermined categories using the company's computers and/or servers may not be allowed during some predetermined periods of time.
Conventionally, a content rating engine or a content classification engine may be installed in a firewall to screen contents coming into a system from an external network, such as email received and web pages retrieved from the Internet. The content rating engine may retrieve a rating of the incoming contents from a rating database, if one exists, and/or attempt to rate the contents in real time. To rate the contents in real time, the content rating engine may parse the contents and use a pattern matching engine to identify some predetermined keywords and/or tokens. The content rating engine may then determine a rating for the contents based on the presence and/or absence of the keywords and/or tokens.
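The conventional flow described above (rating database lookup first, then real-time keyword matching) may be sketched as follows. This is only an illustrative sketch under stated assumptions, not an implementation of any particular engine: the names RATING_DB, KEYWORD_RULES, tokenize, and rate_content, the example categories and keywords, and the min_hits threshold are all hypothetical and do not appear in the description.

```python
import re

# Hypothetical rating database: maps a content source to a stored rating.
RATING_DB = {"http://example.com/news": "news"}

# Hypothetical static rules: each category lists its predetermined keywords.
KEYWORD_RULES = {
    "gambling": {"casino", "poker", "jackpot"},
    "shopping": {"cart", "checkout", "discount"},
}


def tokenize(text):
    """Parse the contents into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rate_content(source, text, min_hits=2):
    """Return a stored rating if one exists; otherwise rate in real time."""
    stored = RATING_DB.get(source)
    if stored is not None:
        return stored

    # Real-time path: rate by the presence of predetermined keywords.
    tokens = set(tokenize(text))
    best_category, best_hits = "unrated", 0
    for category, keywords in KEYWORD_RULES.items():
        hits = len(tokens & keywords)  # number of distinct keywords present
        if hits >= min_hits and hits > best_hits:
            best_category, best_hits = category, hits
    return best_category
```

In this sketch the keyword sets and the threshold are fixed in the code, so the rating behavior can only be changed by modifying the engine itself.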
A conventional pattern matching engine typically adopts a specific pattern matching mechanism including some static rules, which may be suitable for one application but not for other applications. Because the rules are static, users cannot change or update these rules to make the pattern matching engine more suitable for a different application. Thus, the pattern matching engine cannot adapt to changes in the application and/or the circumstances.