It is known that computer networks can use data packets to deliver data by means of communication protocols. Data or information sent using a communication protocol is divided into packets (packetized), and the packets are then sent one by one through the network. The data packets are then received and reassembled (depacketized) by the intended recipient. The message boundaries between the data packets can either be set automatically or requested by the software application. Currently, the two most common data communication protocols are User Datagram Protocol (UDP), which defines and recognizes explicit message boundaries, and Transmission Control Protocol (TCP), which does not use message boundaries and is therefore called a stream-oriented protocol.
For example, when utilizing TCP, a payroll application does not mark the boundaries between employee records or identify the contents of a data stream as being payroll data. TCP views the data stream as a sequence of octets or bytes that it divides into data packets for transmission between TCP computer systems, each having at least one computer. The TCP data packet (segment) is the unit of transfer between two computers or computer systems. The TCP system packetizes and sends data one packet (segment) at a time, and the packets can vary in size. Consequently, a data stream can be broken up and delivered over a computer network in a manner that makes it difficult, if not impossible, for a software application to process the data without first reassembling several consecutive data packets, or all of the data packets, that belong to a single communication protocol connection. When data packets are separated and independently transmitted, additional problems are created, since the data packets can be lost or dropped, or can arrive out of order.
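The reassembly step described above can be illustrated with a minimal sketch. It assumes each segment is represented as a hypothetical `(sequence_number, payload)` pair, where the sequence number is the byte offset of the payload in the stream, and it ignores partially overlapping segments, retransmission timers, and sequence-number wraparound, all of which a practical system would need to handle. The function name `reassemble` is illustrative only.

```python
from typing import Dict, Iterable, Tuple


def reassemble(segments: Iterable[Tuple[int, bytes]], initial_seq: int = 0) -> bytes:
    """Return the contiguous byte stream that starts at initial_seq.

    Segments may arrive out of order or be duplicated. Reassembly stops at
    the first gap, because bytes after a missing segment cannot yet be
    delivered in order.
    """
    pending: Dict[int, bytes] = {}
    for seq, payload in segments:
        # Keep the first copy of a duplicated segment (simplifying assumption).
        pending.setdefault(seq, payload)

    stream = bytearray()
    next_seq = initial_seq
    while next_seq in pending:
        payload = pending.pop(next_seq)
        stream += payload
        next_seq += len(payload)  # the next expected byte offset
    return bytes(stream)


# Segments arrive out of order, one is duplicated, and the segment that
# would start at offset 10 is missing, so the one at offset 15 is held back.
arrived = [(5, b"world"), (0, b"hello"), (0, b"hello"), (15, b"late")]
in_order = reassemble(arrived)
```

In this sketch the segment at offset 15 stays undelivered until the missing bytes arrive, which is the situation that forces an inspecting system to buffer data.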
There are content search engines, content filters, virus detectors, and worm detectors that look for a particular pattern or signature in a data stream by examining each individual data packet one at a time. With these search methods, a pattern or signature can go undetected if it is broken up into several pieces that are distributed across more than one data packet. This can occur either by coincidence or through the deliberate act of someone who wants to evade detection.
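The weakness of packet-by-packet searching can be shown with a short sketch. The payloads and the signature below are made-up values, and both function names are hypothetical. The second function is one possible remedy, offered as an assumption rather than as the method of any particular product: it carries the last `len(signature) - 1` bytes of the data seen so far into the next search, and it presumes the payloads are already in stream order.

```python
from typing import Iterable


def per_packet_match(packets: Iterable[bytes], signature: bytes) -> bool:
    """Search each packet payload in isolation, as described above."""
    return any(signature in payload for payload in packets)


def streaming_match(packets: Iterable[bytes], signature: bytes) -> bool:
    """Search in-order payloads while remembering a short tail of earlier
    data, so a signature that straddles packet boundaries is still found."""
    keep = len(signature) - 1  # longest prefix that could span a boundary
    tail = b""
    for payload in packets:
        window = tail + payload
        if signature in window:
            return True
        tail = window[-keep:] if keep > 0 else b""
    return False


# The signature is split across two packets.
packets = [b"GET /mal", b"ware.exe"]
signature = b"malware"
missed = per_packet_match(packets, signature)
found = streaming_match(packets, signature)
```

Here the per-packet search misses the signature while the search over the carried-over window finds it. If the payloads arrive out of order, they must first be put back in sequence before the second function gives a meaningful answer.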
The firewalls currently used to filter data packets do so based on specific header entries that are statically configured with well-known port numbers and/or addresses, so that specific software applications are allowed to send and receive data. However, if the same software application dynamically negotiates the use of other port numbers and/or communication protocol addresses, rather than relying on statically preconfigured ones, then the firewall needs to deeply inspect each data packet so that the negotiated port numbers and addresses are recognized properly. This deep packet inspection imposes two requirements. First, the system must logically track the connection state by recognizing specific header entries at multiple levels, e.g., L3-L7, as well as the data payloads. Second, the system must selectively search for data or patterns, whose eligibility is determined by the particular communication protocol connection state. Consequently, deep packet inspection requires packet reassembly and total control of the reassembly process. Regardless of the stream of data or pattern being searched, computer systems that depend on data packets for delivery need to reassemble corresponding data packets in sequential order to fully comprehend the logical nature and structure of the traffic.
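The dependence of filtering on connection state can be sketched as follows. Everything protocol-specific here is an assumption made for illustration: the control port `2100`, the `USE-PORT <n>` message, the `MAX_LINE` cap, and the `StatefulFilter` class are hypothetical and do not describe any real protocol or firewall. The sketch also assumes that control payloads are handed to the filter already in sequence order, which is the reassembly requirement stated above.

```python
import re

CONTROL_PORT = 2100  # hypothetical statically configured control port
MAX_LINE = 64        # hypothetical cap on buffered, unterminated control data
# Hypothetical negotiation message carried in the control stream.
NEGOTIATION = re.compile(rb"USE-PORT (\d{1,5})\r?\n")


class StatefulFilter:
    """Permit the static control port plus any port negotiated over it."""

    def __init__(self) -> None:
        self.allowed_ports = {CONTROL_PORT}
        self._control_buffer = b""

    def on_control_data(self, payload: bytes) -> None:
        """Consume the next in-order payload of the control connection."""
        self._control_buffer += payload
        for match in NEGOTIATION.finditer(self._control_buffer):
            port = int(match.group(1))
            if 1024 <= port <= 65535:  # refuse privileged or invalid ports
                self.allowed_ports.add(port)
        # Keep only the unterminated remainder, bounded in size, so a
        # message split across packets can be completed by the next one.
        last_newline = self._control_buffer.rfind(b"\n")
        remainder = self._control_buffer[last_newline + 1:]
        self._control_buffer = remainder[-MAX_LINE:]

    def permits(self, dst_port: int) -> bool:
        return dst_port in self.allowed_ports


firewall = StatefulFilter()
before = firewall.permits(5001)
# The negotiation message is split across two packets.
firewall.on_control_data(b"USE-PO")
firewall.on_control_data(b"RT 5001\r\n")
after = firewall.permits(5001)
```

A filter that examined either packet alone would not recognize the negotiated port. The size cap on the buffer is a design choice in this sketch that limits how much memory an unterminated control message can consume.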
The packet reassembly process is therefore resource-intensive, although not necessarily complicated. The process is time-consuming, requires computing power, and is potentially vulnerable to Denial of Service (DoS) attacks. Accordingly, it is important to find an effective and secure method to reassemble data packets. The present invention is directed to overcoming one or more of the problems set forth above.