The IP security protocol (IPSEC) is being standardized by the IETF (Internet Engineering Task Force) to add security to the well-known and widely used IP protocol. It provides cryptographic authentication and confidentiality of traffic between two communicating network nodes. It can be used either in end-to-end mode, i.e. directly between the communicating nodes or hosts, or in tunnel mode between firewalls or VPN (Virtual Private Network) devices.
Asymmetric connections, where one end is a host and the other end is a firewall or VPN device, are also possible.
IPSEC defines a set of operations for performing authentication and encryption at the packet level by adding new protocol headers to each packet. IPSEC authentication of a data packet is performed by computing an authentication code over all data and most of the header of the data packet. The authentication code further depends on a secret key, known only to the communicating parties. The authentication code is then stored in the packet, appropriately wrapped in a well-defined header or trailer.
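The keyed authentication code described above can be illustrated with a minimal sketch. It assumes HMAC-SHA-256 as the authentication algorithm and simply appends the code as a trailer; the function names `protect` and `verify` are hypothetical, and the sketch does not model the actual IPSEC header layout or the exclusion of some header fields from the computation.

```python
import hashlib
import hmac

CODE_LEN = 32  # length in bytes of an HMAC-SHA-256 digest


def protect(packet: bytes, key: bytes) -> bytes:
    """Return the packet with a keyed authentication code appended as a trailer.

    Assumption: the code covers every byte of the packet. Real IPSEC skips
    header fields that may change in transit.
    """
    code = hmac.new(key, packet, hashlib.sha256).digest()
    return packet + code


def verify(protected: bytes, key: bytes) -> bool:
    """Recompute the code with the shared key and compare it to the trailer."""
    packet, code = protected[:-CODE_LEN], protected[-CODE_LEN:]
    expected = hmac.new(key, packet, hashlib.sha256).digest()
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(code, expected)
```

A receiver holding the same key accepts the packet only if `verify` returns `True`; a modified packet or a different key makes the recomputed code differ from the stored one.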
The operations to be performed on each packet are controlled by a policy that specifies which authentication and encryption methods, jointly called transforms, are to be applied between each pair of communicating hosts. The parameters specifying the cryptographic algorithms, cryptographic keys, and other data related to securely processing packets between two hosts or peers are collectively called a security association.
A policy is typically expressed as a set of policy rules, each rule specifying a set of selectors (such as source IP address, destination IP address, subnet, protocol, source port number, destination port number) constraining the communication or set of communications to which the rule applies. Several rules may apply to a particular packet, and the rules are normally ordered relative to one another so that a single rule can be unambiguously chosen for each incoming and outgoing packet.
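As an illustration of ordered policy rules with selectors, the following sketch picks the first rule whose selectors all match a packet. The `Rule` class, the `select_rule` function, the action names, and the example addresses are hypothetical and are not taken from the IPSEC standard; first-match ordering is assumed as one way to make the choice unambiguous.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    src_net: str              # source subnet selector, e.g. "10.0.0.0/8"
    dst_net: str              # destination subnet selector
    protocol: Optional[int]   # IP protocol number, or None for "any"
    dst_port: Optional[int]   # destination port, or None for "any"
    action: str               # hypothetical action label

    def matches(self, src: str, dst: str, protocol: int, dst_port: int) -> bool:
        return (ip_address(src) in ip_network(self.src_net)
                and ip_address(dst) in ip_network(self.dst_net)
                and self.protocol in (None, protocol)
                and self.dst_port in (None, dst_port))


def select_rule(rules: List[Rule], src: str, dst: str,
                protocol: int, dst_port: int) -> Optional[Rule]:
    """Return the first matching rule; the list order resolves overlaps."""
    for rule in rules:
        if rule.matches(src, dst, protocol, dst_port):
            return rule
    return None


RULES = [
    Rule("10.0.0.0/8", "192.0.2.0/24", 6, 443, "apply-transforms"),
    Rule("10.0.0.0/8", "0.0.0.0/0", None, None, "bypass"),
]
```

Both rules match a TCP packet from 10.1.2.3 to 192.0.2.7 port 443, but the ordering means only the first is chosen.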
The IPSEC standard and published implementations present a data structure called the policy database, which is an array or table in memory that contains the rules. The rules in the policy database are consulted for each packet to be processed. FIG. 1 is a simplified graphical illustration of a known IPSEC implementation 100, which contains a policy database 101 and separates a secure internal network 102 from the Internet network 103. For the sake of simplicity, only packets flowing in one direction (outgoing packets) are considered. The input packets 104 in FIG. 1 contain data that a user in the internal network 102 wants to send to another user through the Internet and that need to be processed for authentication and encryption. In the IPSEC implementation an input packet under consideration 105 is transformed into an output packet under consideration 106 by consulting the rules in the policy database 101. The transformed output packets 107 are then sent into the Internet, where they will be properly routed to the correct receiving user.
On the other hand, mechanisms for filtering IP packets have been available and well-known in the literature for a long time. Suitable mechanisms are presented for example in J. Mogul, R. Rashid, M. Accetta: The Packet Filter: An Efficient Mechanism for User-Level Network Code, in Proc. 11th Symposium on Operating Systems Principles, pp. 39-51, 1987, and in Jeffrey Mogul: Using screens to implement IP/TCP security policies, Digital Network Systems Laboratory, NSL Network, Note NN-16, July 1991. Packet filters also have a set of rules, typically combined by a set of implicit or explicit logical operators.
The task of a packet filter is to either accept or reject a packet for further processing.
The logical rules used in packet filters take the form of simple comparisons on individual fields of data packets. In effect, performing such a comparison amounts to evaluating a boolean (logical, truth value) expression. Methods for evaluating such expressions have been well-known in the mathematical literature for centuries. The set of machine-readable instructions implementing the evaluations is traditionally called the filter code. FIG. 2 illustrates a packet filter 200 with a stored filter code 201. Input packets 202 are examined one packet at a time in the packet filter 200, and only those packets that produce the correct boolean values when the logical rules of the filter code are applied are passed on as output packets 203.
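The accept-or-reject behaviour of such a filter can be sketched as a boolean expression built from field comparisons. Packets are modelled here as plain dictionaries of already-extracted fields, and the helper names (`field_eq`, `all_of`, `any_of`, `negate`, `run_filter`) are hypothetical; this is an illustration of the general idea, not any particular published filter.

```python
from typing import Callable, Dict, Iterable, List

Packet = Dict[str, int]
Predicate = Callable[[Packet], bool]


def field_eq(name: str, value: int) -> Predicate:
    """Comparison on a single packet field."""
    return lambda packet: packet.get(name) == value


def all_of(*preds: Predicate) -> Predicate:
    """Logical AND of several predicates."""
    return lambda packet: all(p(packet) for p in preds)


def any_of(*preds: Predicate) -> Predicate:
    """Logical OR of several predicates."""
    return lambda packet: any(p(packet) for p in preds)


def negate(pred: Predicate) -> Predicate:
    """Logical NOT of a predicate."""
    return lambda packet: not pred(packet)


def run_filter(filter_code: Predicate, packets: Iterable[Packet]) -> List[Packet]:
    """Examine packets one at a time; pass on only those that evaluate to True."""
    return [packet for packet in packets if filter_code(packet)]


# Example rule: accept TCP (protocol 6) to port 80 or 443, but never port 23.
FILTER_CODE = all_of(
    field_eq("protocol", 6),
    any_of(field_eq("dst_port", 80), field_eq("dst_port", 443)),
    negate(field_eq("dst_port", 23)),
)
```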
The individual predicates (comparisons) of packet filter expressions typically involve operands that access individual fields of the data packet, either in the original data packet format or in an expanded format where access to individual fields of the packet is easier. Methods for accessing data structure fields in fixed-layout and variable-layout data structures and for packing and unpacking data into structures have been well-known in standard programming languages like Fortran, COBOL and Pascal, and have been commonly used as programming techniques since the 1960s.
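Unpacking a packet into an expanded format can be sketched as follows for the fixed 20-byte part of an IPv4 header. The function name `unpack_ipv4` and the choice of dictionary keys are assumptions made for illustration; header options, which make the layout variable, are not parsed and are only signalled through the `header_len` field.

```python
import socket
import struct

# Fixed part of the IPv4 header in network byte order (20 bytes):
# version/IHL, TOS, total length, identification, flags/fragment offset,
# TTL, protocol, header checksum, source address, destination address.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")


def unpack_ipv4(raw: bytes) -> dict:
    """Expand the fixed IPv4 header into named fields for easy comparison."""
    (ver_ihl, _tos, total_len, _ident, _flags_frag,
     ttl, protocol, _checksum, src, dst) = IPV4_HEADER.unpack_from(raw)
    return {
        "version": ver_ihl >> 4,
        "header_len": (ver_ihl & 0x0F) * 4,  # in bytes; above 20 means options follow
        "total_len": total_len,
        "ttl": ttl,
        "protocol": protocol,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }
```

A filter predicate can then compare, for example, the `protocol` or `src` entry instead of indexing into raw bytes.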
Using boolean expressions to control execution and using them as tests are both part of the very basis of all modern programming languages, and the technique has been a standard method in programming since the 1950s or earlier.
Expressing queries and search specifications as a set of rules or constraints has been a standard method in databases, pattern matching, data processing, and artificial intelligence. There are several journals, books and conference series that deal with efficient evaluation of sets of rules against data samples. These standard techniques can be applied to numerous kinds of data packets, including packets in data communication networks.
A further well-known technique is the compilation of programming language expressions, such as boolean expressions and conditionals, into an intermediate language for faster processing (see, for example, A. Aho, R. Sethi, J. Ullman: Compilers: Principles, Techniques, and Tools, Addison-Wesley, 1986). Such intermediate code may be e.g. in the form of trees, tuples, or interpreted byte code instructions. Such code may be structured in a number of ways, such as register-based, memory-based, or stack-based. Such code may or may not be allowed to perform memory allocation, and memory management may be explicit, requiring separate allocations and frees, or implicit, where the run time system automatically manages memory through the use of garbage collection. The operation of such code may be stateless between applications (though carrying some state, such as the program counter, between individual intermediate language instructions is always necessary) like the operation of the well-known Unix program "grep", and other similar programs dating back to the 1960s or earlier. The code may also carry a state between invocations, like the well-known Unix program "passwd", most database programs and other similar applications dating back to the 1960s or earlier. It may even be self-modifying like many Apple II games in the early 1980s and many older assembly language programs. It is further possible to compile such an intermediate representation into directly executable machine code for further optimizations. All this is well-known in the art and has been taught in university programming language and compiler courses for decades. Newer well-known research has also presented methods for incremental compilation of programs, and for compiling portions of programs when they are first needed.
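The compilation of a boolean expression into stack-based intermediate code can be sketched as below. The tuple-based expression format, the instruction names (`LOAD`, `PUSH`, `EQ`, `NOT`, `AND`, `OR`) and the functions `compile_expr` and `run` are all hypothetical and chosen only for illustration; the interpreter is stateless between packets and keeps only its stack and loop position while running.

```python
from typing import Dict, List, Tuple

# Expression tree forms (assumed for this sketch):
#   ("eq", field, value) | ("not", e) | ("and", e1, e2) | ("or", e1, e2)


def compile_expr(expr: tuple) -> List[Tuple]:
    """Translate an expression tree into postfix stack-machine instructions."""
    op = expr[0]
    if op == "eq":
        _, field, value = expr
        return [("LOAD", field), ("PUSH", value), ("EQ",)]
    if op == "not":
        return compile_expr(expr[1]) + [("NOT",)]
    if op in ("and", "or"):
        return compile_expr(expr[1]) + compile_expr(expr[2]) + [(op.upper(),)]
    raise ValueError(f"unknown operator: {op!r}")


def run(code: List[Tuple], packet: Dict[str, int]) -> bool:
    """Interpret the instructions against one packet; True means accept."""
    stack: list = []
    for instr in code:  # the loop position plays the role of the program counter
        name = instr[0]
        if name == "LOAD":
            stack.append(packet.get(instr[1]))
        elif name == "PUSH":
            stack.append(instr[1])
        elif name == "EQ":
            b, a = stack.pop(), stack.pop()
            stack.append(a == b)
        elif name == "NOT":
            stack.append(not stack.pop())
        elif name == "AND":
            b, a = stack.pop(), stack.pop()
            stack.append(a and b)
        elif name == "OR":
            b, a = stack.pop(), stack.pop()
            stack.append(a or b)
        else:
            raise ValueError(f"unknown instruction: {name!r}")
    return bool(stack.pop())
```

The rule set is compiled once, and the resulting instruction list is then run for each packet, so the expression tree need not be walked again per packet.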
Real-time filtering of large volumes of data packets has required optimization in the methods used to manipulate data. Thus, the standard programming language compilation techniques have been applied to the interpretation of the logical expressions in the rule sets, resulting in intermediate code that can be evaluated faster than the original rule sets. A particular implementation of these well-known methods used in the BSD 4.3 operating system has been mentioned in popular university operating system textbooks and has been available in sample source code that has been accessible to students in many universities since at least 1991.
Recently, a patent application was filed for the well-known methods of stateless and stateful filtering, and U.S. Pat. No. 5,606,668 was subsequently granted. "Stateless filtering" is essentially the well-known BSD 4.3 packet filter (University of California, Berkeley, 1991, published for royalty-free worldwide distribution e.g. in the 4.3BSD net2 release), and "stateful filtering" adds to that what Aho, Sethi, and Ullman (1986, above) characterize on page 401 with "This property allows the values of local names to be retained across activations of a procedure. That is, when control returns to a procedure, the values of the locals are the same as they were when control left the last time." The particular book was probably the most popular textbook in university undergraduate compiler courses in the late 1980s and early 1990s. The idea itself dates back decades.
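The difference between stateless and stateful operation can be shown with a minimal sketch in which a filter retains values across invocations. The function `make_stateful_filter`, its arguments, and the rule it applies (admit an inbound packet only if a matching outbound packet was seen earlier) are assumptions made purely for illustration and do not describe the method of any particular patent or product.

```python
from typing import Callable, Set, Tuple


def make_stateful_filter() -> Callable[[str, str, str], bool]:
    """Build a filter whose local state survives between calls."""
    seen: Set[Tuple[str, str]] = set()  # (inside host, outside host) pairs

    def accept(direction: str, src: str, dst: str) -> bool:
        if direction == "out":
            seen.add((src, dst))  # remember the outgoing flow
            return True
        # Inbound: accept only replies to a flow that was seen going out.
        return (dst, src) in seen

    return accept
```

A stateless filter would return the same answer for the same packet every time, whereas here the answer for an inbound packet depends on what the filter has processed before.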
A known IPSEC implementation must consult the policy database for every packet transmitted through the IPSEC implementation. High-speed VPN implementations, for instance, may need to process tens of thousands of packets per second. The processing overhead of looking up policy rules from a database for each such packet will soon become a performance bottleneck.