Communication over a network often requires that the information to be transported from one computer to another be divided into network communication packets. These network communication packets, referred to simply as “packets”, are transported across the physical communication network.
Information originating from an application program is packetized by passing through various software components before arriving at the network interface card for transmission on the physical communication network. These software components are typically layered to form what is known as the network protocol stack. Each layer is responsible for a different facet of communication. For example, the TCP/IP protocol stack is normally split into four layers: link, network, transport and application. FIG. 1 shows the relationship between the protocol layers and the TCP/IP protocol stack. The link layer 101 is responsible for placing data on the physical network. The network layer 102 is responsible for routing. The transport layer 103 is responsible for the communication between two hosts. The application layer 104 is responsible for processing the application-specific data.
For example, FIG. 2 illustrates the stages of an HTTP request being encapsulated before being sent to a web server. As the request descends the protocol stack, each layer 201–204 encapsulates the packet, adding its own header. When the HTTP packet arrives at the destination address, each protocol layer uses information within its header to classify the incoming packet among all the protocols in the layer above it. This process is commonly referred to as demultiplexing.
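The encapsulation described above can be sketched as follows. This is a minimal illustration, not a working protocol stack: the function name `encapsulate`, the addresses, the MAC values and the source port are hypothetical, and the checksums, sequence numbers and flags are left at zero for brevity.

```python
import socket
import struct

def encapsulate(payload, src_ip="192.0.2.1", dst_ip="192.0.2.2",
                src_port=49152, dst_port=80):
    """Wrap application data in simplified TCP, IPv4 and Ethernet headers."""
    # Transport layer: 20-byte TCP header (ports, seq, ack, offset, window,
    # checksum, urgent pointer). Data offset 5 means five 32-bit words.
    tcp_header = struct.pack("!HHIIHHHH",
                             src_port, dst_port, 0, 0, 5 << 12, 65535, 0, 0)
    segment = tcp_header + payload

    # Network layer: 20-byte IPv4 header. Protocol 6 identifies TCP.
    # The checksum is left at zero here (assumption: illustration only).
    ip_header = struct.pack("!BBHHHBBH4s4s",
                            0x45, 0, 20 + len(segment), 0, 0, 64, 6, 0,
                            socket.inet_aton(src_ip),
                            socket.inet_aton(dst_ip))
    packet = ip_header + segment

    # Link layer: 14-byte Ethernet header. Frame type 0x0800 identifies IPv4.
    eth_header = struct.pack("!6s6sH",
                             b"\x02\x00\x00\x00\x00\x02",  # hypothetical destination MAC
                             b"\x02\x00\x00\x00\x00\x01",  # hypothetical source MAC
                             0x0800)
    return eth_header + packet

request = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
frame = encapsulate(request)
```

Each layer prepends a header whose type field (destination port, IP protocol value, Ethernet frame type) is what the receiving stack later uses to demultiplex the packet.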
At each layer in the network protocol stack, the packet is demultiplexed or “classified” based on information contained in the packet headers or in the data portion of the packet itself. The packet is processed differently based on its classification.
For example, FIG. 3 illustrates how this classification is done for an incoming HTTP request 301. The Ethernet driver 302, in the link layer 300, classifies the packet based on the frame type in the Ethernet header and passes it to IPv4 312 in the network layer 310. IPv4 312 classifies the packet based on the protocol value in the IP header and passes it to TCP 323 in the transport layer 320. TCP 323 classifies the packet based on the destination port number in the TCP header and passes it to the HTTP server 332 in the application layer 330.
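The layer-by-layer classification of an incoming HTTP request might be sketched as follows. The function names and the lookup tables are hypothetical and hold only the entries needed for this example; a real stack would register many more protocols at each layer.

```python
import struct

ETHERTYPE_IPV4 = 0x0800   # Ethernet frame type for IPv4
IPPROTO_TCP = 6           # IP protocol value for TCP
HTTP_PORT = 80            # well-known HTTP port

def classify_link(frame):
    """Link layer: choose the network protocol from the Ethernet frame type."""
    (frame_type,) = struct.unpack_from("!H", frame, 12)
    return {ETHERTYPE_IPV4: "IPv4"}.get(frame_type), frame[14:]

def classify_network(packet):
    """Network layer: choose the transport protocol from the IP protocol value."""
    header_len = (packet[0] & 0x0F) * 4   # IHL is counted in 32-bit words
    return {IPPROTO_TCP: "TCP"}.get(packet[9]), packet[header_len:]

def classify_transport(segment):
    """Transport layer: choose the application from the destination port."""
    (dst_port,) = struct.unpack_from("!H", segment, 2)
    header_len = (segment[12] >> 4) * 4   # data offset is in 32-bit words
    return {HTTP_PORT: "HTTP server"}.get(dst_port), segment[header_len:]

def demultiplex(frame):
    """Return the list of handlers chosen at each layer and the remaining data."""
    path, data = [], frame
    for classify in (classify_link, classify_network, classify_transport):
        handler, data = classify(data)
        if handler is None:
            break          # no protocol registered for this value
        path.append(handler)
    return path, data
```

For a well-formed Ethernet frame carrying a TCP segment addressed to port 80, `demultiplex` returns the path `["IPv4", "TCP", "HTTP server"]` together with the HTTP request bytes.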
Traditional packet classification systems, as found in BPF, DPF, Pathfinder, Router Plugins, operating systems and many firewalls, are limited to a set of fixed pattern matching rules. Such rules allow a user to intercept and process any packet that matches the desired set of values in the appropriate byte ranges (usually a combination of the IP and the protocol header fields, such as source/destination address, protocol or source/destination ports). These packets are then passed to a software module that processes the packets and can modify, forward, drop or delay them. Stateful packet filtering systems generally have the ability to generate and add rules dynamically based on application traffic. However, such systems do not provide simple methods to extend packet processing to understand new application protocols.
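A fixed pattern matching rule of the kind described above can be modeled as a list of (offset, mask, value) comparisons against byte ranges of the packet. The sketch below is a generic illustration and is not the rule format of BPF, DPF or any other named system; `FieldMatch`, `matches`, `classify` and the action names are hypothetical, and the offsets assume an IPv4 header without options.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FieldMatch:
    offset: int    # byte offset from the start of the IP header
    mask: bytes    # bits of the field that take part in the comparison
    value: bytes   # expected value after masking (same length as mask)

def matches(packet, rule):
    """True if every masked byte range in the rule equals its expected value."""
    for field in rule:
        window = packet[field.offset:field.offset + len(field.value)]
        if len(window) != len(field.value):
            return False   # packet too short to contain this field
        if bytes(b & m for b, m in zip(window, field.mask)) != field.value:
            return False
    return True

def classify(packet, rules, default="forward"):
    """Return the action of the first matching rule, or the default action."""
    for rule, action in rules:
        if matches(packet, rule):
            return action
    return default

# Rule: IP version 4, protocol TCP, destination port 80.
HTTP_RULE = [
    FieldMatch(0, b"\xf0", b"\x40"),            # version nibble == 4
    FieldMatch(9, b"\xff", b"\x06"),            # protocol == TCP
    FieldMatch(22, b"\xff\xff", b"\x00\x50"),   # TCP destination port == 80
]
RULES = [(HTTP_RULE, "to_http_module")]
```

Because every comparison is tied to a fixed byte offset, a rule of this form cannot follow a field whose position or length varies from packet to packet.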
These traditional systems may work well for applications that use a single connection to a well-known destination address and port. However, many modern applications initially use a well-known service port for the control session and then use additional connections on ephemeral port numbers for each data stream. Examples of such applications are FTP, RealAudio and H.323. To support these applications efficiently, the traditional systems must allow packet matching filter rules to be updated dynamically and quickly. In addition, some modern protocols have abandoned fixed-format headers and fixed-size fields. For example, HTTP makes its headers human readable by encoding them as strings.
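The need for dynamic rule updates can be illustrated with FTP, whose control session announces the data endpoint in a text command of the form `PORT h1,h2,h3,h4,p1,p2`, where the port number is `p1 * 256 + p2`. The sketch below is a simplified assumption of how a filter might learn such endpoints at run time; the class `DynamicPortFilter` and its method names are hypothetical, and passive mode, rule expiry and TCP stream reassembly are not handled.

```python
class DynamicPortFilter:
    """Tracks data endpoints announced on an FTP-style control session."""

    def __init__(self, control_port=21):
        self.control_port = control_port
        self.data_endpoints = set()   # (ip, port) pairs learned at run time

    def observe_control(self, payload):
        """Parse one control-session line; add a rule if it is a PORT command."""
        line = payload.decode("ascii", errors="replace").strip()
        if not line.upper().startswith("PORT "):
            return None
        parts = [p.strip() for p in line[5:].split(",")]
        if len(parts) != 6 or not all(p.isdigit() for p in parts):
            return None
        numbers = [int(p) for p in parts]
        if any(n > 255 for n in numbers):
            return None
        ip = ".".join(str(n) for n in numbers[:4])
        port = numbers[4] * 256 + numbers[5]
        self.data_endpoints.add((ip, port))   # the dynamically added rule
        return ip, port

    def classify(self, dst_ip, dst_port):
        """Classify a packet by destination address and port."""
        if dst_port == self.control_port:
            return "control"
        if (dst_ip, dst_port) in self.data_endpoints:
            return "data"
        return "unmatched"
```

Note that the endpoint is carried as a variable-length text string in the packet data rather than as a fixed-size binary field at a known offset, so the filter has to parse the application protocol before it can install the new rule.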