1. Field of the Invention
The present invention generally relates to data classification systems and processing and, more particularly, to a string matching method and system for uniform data classification.
2. Description of the Related Art
String matching methods are widely used in systems such as intrusion detection systems, virus detection systems, and data mining systems. To detect an intrusion or a virus, a local system utilizes a matching method to search a received message for any of a predetermined set of strings, and treats the message according to the matching results. The message, and each string in the predetermined set, may include a number of characters or symbols. The received message may be referred to as the text, the predetermined set of strings may be referred to as a patterns set, and each member of the patterns set may be referred to as a pattern. A match is said to occur if a pattern is identical to a substring of the text.
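The terminology above (text, patterns set, pattern, match) can be illustrated with a minimal sketch. The function name `find_matches` and the sample strings are hypothetical and chosen only for illustration; the sketch uses a plain substring scan and is not the matching method of any particular system.

```python
def find_matches(text, patterns):
    """Return (pattern, start_index) pairs for every occurrence of each
    pattern in the text. A match occurs when a pattern is identical to
    a substring of the text."""
    matches = []
    for pattern in patterns:
        if not pattern:
            continue  # skip empty patterns, which would match everywhere
        start = text.find(pattern)
        while start != -1:
            matches.append((pattern, start))
            # Resume one position later so overlapping occurrences are found.
            start = text.find(pattern, start + 1)
    return matches


# Hypothetical received message (the text) and patterns set.
text = "GET /index.html?cmd=rm -rf"
patterns = ["cmd=", "rm -rf", "DROP TABLE"]
print(find_matches(text, patterns))
```

A non-empty result indicates that at least one pattern occurs in the text, after which the message may be treated according to the matching results.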
For instance, virus detection systems can provide detection of potentially harmful data being input into data systems. A signature database is provided with a plurality of character strings that are considered harmful to data. A packet of input data is directed to the signature database for comparison with the stored character strings. After string matching, the virus detection system determines whether the input data packet is considered harmful. Remedial actions are accordingly taken if the input data packet is determined to be harmful.
Conventional string matching requires a one-to-one comparison between each string stored in the signature database and each input data packet. Delay is likely to result when a large number of data packets are input into the system, or when a large number of potentially harmful strings are stored in the signature database. The system and processor resources required to implement the string matching become unduly large when a large number of strings must be compared.
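The cost of such one-to-one comparison can be made concrete with a rough sketch that counts character comparisons. The function name `count_comparisons` and the sample data are assumptions made for illustration; the brute-force scan shown here stands in for conventional matching in general and does not reproduce any specific prior-art system.

```python
def count_comparisons(packets, signatures):
    """Compare every signature against every packet by brute force.

    Returns (comparisons, hits), where comparisons is the number of
    character comparisons performed and hits lists
    (packet_index, signature, start_index) for each match found.
    """
    comparisons = 0
    hits = []
    for packet_index, packet in enumerate(packets):
        for signature in signatures:
            # Try every alignment of the signature within the packet.
            for start in range(len(packet) - len(signature) + 1):
                matched = True
                for offset, ch in enumerate(signature):
                    comparisons += 1
                    if packet[start + offset] != ch:
                        matched = False
                        break
                if matched:
                    hits.append((packet_index, signature, start))
    return comparisons, hits


# Hypothetical data: the work grows with both the number of packets
# and the number of signatures.
one, _ = count_comparisons(["abcabc"], ["abc"])
two, _ = count_comparisons(["abcabc"], ["abc", "zzz"])
print(one, two)
```

Because each signature is checked separately against each packet, adding signatures or packets increases the comparison count roughly in proportion, which is the source of the delay and resource demands noted above.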
There is thus a general need for a system and method overcoming at least the aforementioned shortcomings in the art. A particular need exists in the art for a system and method overcoming disadvantages with respect to inefficiencies and delay in one-to-one string matching.