Keyword searching techniques are used in a wide variety of applications. For example, in some applications, communication traffic is analyzed in an attempt to detect keywords that indicate traffic of interest. Some data security systems attempt to detect information that leaks from an organization network by detecting keywords in outgoing traffic. Intrusion detection systems sometimes identify illegitimate intrusion attempts by detecting keywords in traffic.
Various keyword searching techniques are known in the art. For example, Aho and Corasick describe an algorithm for locating occurrences of a finite number of keywords in a string of text, in “Efficient String Matching: An Aid to Bibliographic Search,” Communications of the ACM, volume 18, no. 6, June, 1975, pages 333-340, which is incorporated herein by reference. This technique is commonly known as the Aho-Corasick algorithm. As another example, Yu et al. describe a multiple-pattern matching scheme, which uses Ternary Content-Addressable Memory (TCAM), in “Gigabit Rate Packet Pattern-Matching using TCAM,” Proceedings of the 12th IEEE International Conference on Network Protocols (ICNP), Berlin, Germany, Oct. 5-8, 2004, pages 174-183, which is incorporated herein by reference.
Other string matching algorithms are described, for example, by Navarro and Raffinot, in “Flexible Pattern Matching in Strings—Practical On-Line Search Algorithms for Texts and Biological Sequences,” Cambridge University Press, 2002, which is incorporated herein by reference. Chapter 3 of this book reviews multiple string matching algorithms such as the Wu-Manber (WM) and the Set Backward Oracle Matching (SBOM) algorithms.