Networking products can use pattern matching to identify types of data flows, where each data flow is a group of packets with similar characteristics. Once a data flow is identified, the networking product can apply a traffic policy to that data flow. A traffic policy determines how the data flow is to be communicated by the networking product. For example, the networking product could drop the packets in the data flow, restrict the bandwidth for that data flow, guarantee the bandwidth for that data flow, and/or apply some other known Quality of Service (QoS) policy.
To further complicate the problem, a data flow can exhibit more than one pattern, and thus multiple patterns have to be matched in order to successfully screen out these attacks. Such a collection of patterns is called a signature. For example, a data flow signature may contain a recognizable header and a particular phrase in the body. To detect such a data flow, the detection mechanism has to match all the patterns in the signature. If a data flow is reported when only part of the signature is matched, false positives may occur. As such, the term “pattern of interest” is used to refer to a single pattern or a signature.
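The all-patterns requirement can be illustrated with a minimal Python sketch. The function name `matches_signature`, the header string, and the body phrase are hypothetical and chosen only for illustration; a real detection mechanism would use a multi-pattern matching algorithm rather than repeated substring searches.

```python
from typing import Sequence


def matches_signature(data: bytes, signature: Sequence[bytes]) -> bool:
    """Return True only if every pattern in the signature occurs in data."""
    return all(pattern in data for pattern in signature)


# Hypothetical signature: a recognizable header plus a phrase in the body.
signature = [b"X-Example-Header:", b"example phrase"]

# Contains both patterns, so the whole signature is matched.
full = b"X-Example-Header: 1\r\n\r\nthis body has the example phrase in it"

# Contains only the header; reporting this flow would be a false positive.
partial = b"X-Example-Header: 1\r\n\r\nan unrelated body"

print(matches_signature(full, signature))
print(matches_signature(partial, signature))
```

In this sketch, the second data flow matches only part of the signature and is therefore not reported.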
When such data flows are transported over multiple packets, the contents, and therefore the recognizable patterns, may exist in payloads of different packets. In addition, a single pattern may be split over several packet payloads. Therefore, two problems have to be solved at the same time. On one hand, the detection mechanism has to scan for each pattern across multiple packet payloads, and on the other hand, the detection mechanism also has to scan across patterns. One existing approach is to reassemble all packets and scan for each pattern in sequence. This approach is inefficient in terms of processing time and memory usage because scanning cannot start until all packets are received and reassembled, and extra memory is needed to store the packets received.
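A short Python sketch may make the split-pattern problem concrete. It contrasts scanning each payload in isolation with the reassemble-then-scan approach described above. The function names and the sample payloads are assumptions made for illustration only.

```python
def scan_per_payload(payloads, pattern):
    """Scan each payload in isolation; misses a pattern split across payloads."""
    return any(pattern in payload for payload in payloads)


def scan_after_reassembly(payloads, pattern):
    """Buffer every payload, join them in order, then scan once.

    Scanning cannot start until all payloads are present, and all of them
    must be held in memory at the same time.
    """
    return pattern in b"".join(payloads)


# Hypothetical flow: the pattern b"example" is split over the first two payloads.
payloads = [b"GET /exa", b"mple HT", b"TP/1.1"]
pattern = b"example"

print(scan_per_payload(payloads, pattern))
print(scan_after_reassembly(payloads, pattern))
```

The per-payload scan fails to find the pattern, while the reassembly scan finds it at the cost of buffering the whole flow first.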
Another problem in pattern matching is that the packets may arrive out of order. Using Transmission Control Protocol (TCP) packets as an example, the application data for these packets is broken into what TCP considers the best sized chunks to send, each called a TCP segment. When TCP sends a segment, it maintains a timer and waits for the other end to acknowledge the receipt of the segment. The acknowledgement is commonly called an ACK. If an ACK is not received for a particular segment within a predetermined period of time, the segment is retransmitted. Since the Internet Protocol (IP) layer transmits the TCP segments as IP datagrams and the IP datagrams can arrive out of order, the TCP segments can arrive out of order as well. Currently, the receiver of the TCP segments reassembles the data if necessary, and therefore, the application layer receives data in the correct order.
An existing Traffic Policy System (TPS) that identifies data flows and enforces traffic policies on those data flows typically resides between the two ends of packet communication, inspecting the packets as they arrive at the TPS and applying traffic policies to those packets. The TPS looks for predetermined patterns in the payloads of the packets. These patterns are typically application layer patterns. For example, the pattern might be the word “windows”. In this example, using TCP communication, the word may be broken across two TCP segments, e.g., “win” in one segment and “dows” in another segment. If these two segments arrive in the correct order, then the TPS can detect the word. However, if the segments arrive out of order, then the TPS may first receive the segment containing “dows”, and have to hold this segment and wait for the other segment. A typical approach is for the TPS to force the sender to re-transmit all the segments from the last missing one, hoping that the segments may arrive in order the second time. One disadvantage of this approach is the additional traffic between the two ends of the TCP communication and the additional processing on both ends.
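The hold-and-wait behavior in the “windows” example can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the behavior of any particular TPS: the class name `HoldingScanner` is hypothetical, sequence numbers are treated as plain byte offsets starting at zero, retransmissions and overlapping segments are ignored, and the whole in-order stream is kept in memory.

```python
class HoldingScanner:
    """Holds out-of-order segments until the missing ones arrive, then scans."""

    def __init__(self, pattern: bytes, first_seq: int = 0):
        self.pattern = pattern
        self.next_seq = first_seq  # sequence number of the next in-order byte
        self.held = {}             # seq -> payload for segments that arrived early
        self.stream = b""          # in-order bytes released so far
        self.found = False

    def receive(self, seq: int, payload: bytes) -> bool:
        """Accept one segment and report whether the pattern has been seen."""
        self.held[seq] = payload
        # Release every held segment that is now contiguous with the stream.
        while self.next_seq in self.held:
            chunk = self.held.pop(self.next_seq)
            self.stream += chunk
            self.next_seq += len(chunk)
        if self.pattern in self.stream:
            self.found = True
        return self.found


scanner = HoldingScanner(b"windows")
early = scanner.receive(3, b"dows")  # arrives first and must be held
late = scanner.receive(0, b"win")    # fills the gap, so both segments are scanned
print(early, late)
```

In this sketch, the segment containing “dows” cannot be scanned usefully until the segment containing “win” arrives, which is the waiting cost described above.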
An additional challenge is that a client application may communicate in multiple data flows, such as using a control data flow to control the communication of data and a “data” data flow to communicate that data for the client application. A TPS will identify each data flow separately based on the data packets of that data flow and separately apply a traffic policy for that data flow.