Although applicable to any analysis method, the present invention will be described with regard to deep packet inspection and deep flow inspection.
In general, deep packet inspection enables analysis of the application layer content of a data packet, for example a packet transmitted via TCP/IP, to detect whether it contains patterns taken from a signature database, such as content strings, regular expressions, Snort-type modifiers or the like.
However, when a pattern spans multiple packets within the same flow, an example of which is shown in FIG. 1, an analysis of the content of a single packet may fail to match the full expression.
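The problem can be illustrated with a minimal sketch. The signature `GET /admin` and the two payloads are hypothetical values chosen only for illustration; they are not taken from FIG. 1.

```python
import re

# Hypothetical signature, standing in for an entry of a signature database.
PATTERN = re.compile(rb"GET /admin")

def inspect_per_packet(payloads):
    """Return True if any single payload, inspected in isolation, matches."""
    return any(PATTERN.search(p) is not None for p in payloads)

# The pattern is split across the boundary between two packets of one flow.
packets = [b"....GET /ad", b"min HTTP/1.1"]

per_packet_hit = inspect_per_packet(packets)
stream_hit = PATTERN.search(b"".join(packets)) is not None

print(per_packet_hit, stream_hit)
```

Inspecting each packet on its own misses the pattern, whereas inspecting the concatenated payloads finds it.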
To overcome this problem, conventionally the entire flow is reconstructed by reassembling consecutive packets so that they are in order, and a deep packet inspection is then applied to the reconstructed stream in order to look for matches in the entire flow. This approach is also known under the term “deep flow inspection” (DFI).
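The conventional approach may be sketched as follows. This is a simplified illustration under stated assumptions, not an implementation of any particular system: the class `FlowReassembler`, the function `deep_flow_inspect`, the signature and the flow key are all hypothetical, and segments are assumed to arrive without gaps, overlaps or conflicting retransmissions.

```python
import re
from collections import defaultdict

# Hypothetical signature, standing in for an entry of a signature database.
SIGNATURE = re.compile(rb"attack-string")

class FlowReassembler:
    """Buffers segments per flow and returns the in-order byte stream."""

    def __init__(self):
        # flow key -> {sequence number: payload}; this is the per-flow state
        self.segments = defaultdict(dict)

    def add(self, flow_key, seq, payload):
        self.segments[flow_key][seq] = payload

    def stream(self, flow_key):
        # Assumption: no gaps, overlaps or conflicting retransmissions.
        parts = self.segments[flow_key]
        return b"".join(parts[seq] for seq in sorted(parts))

def deep_flow_inspect(reassembler, flow_key):
    """Apply the signature to the whole reconstructed flow."""
    return SIGNATURE.search(reassembler.stream(flow_key)) is not None

flow = ("10.0.0.1", 40000, "10.0.0.2", 80, "tcp")
arrivals = [(107, b"k-string tail"), (100, b"xxattac")]  # out of order

reassembler = FlowReassembler()
for seq, payload in arrivals:
    reassembler.add(flow, seq, payload)

arrival_order_hit = SIGNATURE.search(b"".join(p for _, p in arrivals)) is not None
dfi_hit = deep_flow_inspect(reassembler, flow)

print(arrival_order_hit, dfi_hit)
```

The match is found only after the segments have been put in order, which requires the dictionary of buffered segments to be kept for every flow under inspection.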
However, one of the disadvantages is that an application to large traffic volumes is infeasible, since each flow has to be reconstructed in full prior to inspection: a flow reconstruction chain, for example a thread in software implementations, is required for every flow crossing the link, thus draining computational resources.
Another disadvantage is that, when reassembling the flow, a state has to be explicitly maintained for each flow being reconstructed, thus draining memory resources. These per-flow resources are reserved for the entire lifetime of the flow, even if the flow experiences periods of inactivity within its lifetime. For example, in the non-patent literature of A. Kortebi, L. Muscariello, S. Oueslati, J. Roberts, “Evaluating the number of active flows in a scheduler realizing fair statistical bandwidth sharing,” ACM SIGMETRICS 2005, or in the non-patent literature of C. Hu, Y. Tang, X. Chen, and B. Liu, “Per-flow Queueing by Dynamic Queue Sharing,” Proceedings of IEEE INFOCOM, Anchorage, AK, 2007, it is shown that the number of flows exhibiting packet-level activity during a given time window, from hundreds of milliseconds up to some seconds, is significantly smaller than the number of flows which are in progress. This means that if a high-speed link carries a total of one million flows, the number of flows active at a given time is in practice on the order of some thousands, so that a large amount of memory is used for flows which are inactive.
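The order of magnitude of this overhead can be estimated with a short calculation. The per-flow state size of 2048 bytes and the figure of 5,000 active flows are assumptions chosen purely for illustration (the text above states only “some thousands”); the total of one million flows is taken from the example above.

```python
# Assumed size of the reassembly state kept per flow (illustrative only).
STATE_BYTES_PER_FLOW = 2048
# Total flows in progress on the link, as in the example above.
TOTAL_FLOWS = 1_000_000
# Assumed number of flows active in a given time window ("some thousands").
ACTIVE_FLOWS = 5_000

def state_memory_bytes(flows, bytes_per_flow=STATE_BYTES_PER_FLOW):
    """Memory needed to hold per-flow state for the given number of flows."""
    return flows * bytes_per_flow

reserved = state_memory_bytes(TOTAL_FLOWS)   # state kept for every flow
needed = state_memory_bytes(ACTIVE_FLOWS)    # state for active flows only
idle_fraction = 1 - needed / reserved

print(f"reserved: {reserved / 1e6:.0f} MB, needed: {needed / 1e6:.2f} MB")
print(f"share held by inactive flows: {idle_fraction:.1%}")
```

Under these assumptions, roughly 2 GB of state is reserved while only about 10 MB belongs to flows that are currently active, so that about 99.5% of the reserved memory is held by inactive flows.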
Further, in the non-patent literature of “Beyond bloom filters: from approximate membership checks to approximate state machines (SIGCOMM06)”, George Varghese et al., http://cseweb.ucsd.edu/~varghese/PAPERS/sigcomm06a.pdf, D-left tables for traffic analysis are shown; however, flow-state information is saved, and insert/delete operations are used to update the status of a flow.
In the non-patent literature of “Bouma2 - A Quasi-Stateless, Tunable Multiple String-Match Algorithm”, http://arxiv.org/abs/1209.4554, a quasi-stateless string matching algorithm is shown.