1. Technical Field
The present invention relates to computer networks in general and, in particular, to the design and operation of firewalls. It includes a description of efficient hash functions that map packet header keys into a firewall connection table, thereby increasing the capacity of the table.
2. Prior Art
The Internet, popularly identified with the World Wide Web (WWW), is fast becoming the premier computer network for communicating both private and public information. The Internet is an open network that anyone can access, primarily using a protocol suite called TCP/IP (Transmission Control Protocol/Internet Protocol), although other protocols are also used. Because of this openness, computers on private networks (intranets) are susceptible to malicious attacks by hackers. Computers have become the main instrument of communication for businesses and government agencies. For example, many businesses and government agencies use computers and computer networks to link remote offices, share data and other resources among employees within an office or campus, communicate with existing customers and reach new ones via electronic mail, provide information via web sites, etc.
Because businesses, governments and individuals rely heavily on computers and the Internet, malicious attacks could result in catastrophic economic loss or embarrassment. As a consequence, computer security has become a major concern of businesses, governments and individuals who use the computer as a major communication vehicle.
A firewall is a set of logical functions, mainly related to security, that are implemented on a box in a computer network. The firewall may run on a dedicated electronic device, as a set of functions that complement other functions on a box such as a router, as a set of functions on a server, laptop, or workstation, or on some other network device. Firewalls may keep a table of labels of packets known to be part of a stream of packets in a TCP session (many packets that comprise a communication). Such a connection table may reduce the workload of a firewall or increase its performance in the following way. Often, when a TCP session starts, firewall software is called into play to analyze the initial packets. The analysis may yield a decision about whether or not to permit the session to continue in light of security policies. If a decision is reached, then the header values common to all packets of the session may be stored in memory together with the decision. In this way, it is not necessary for firewall software to be called over and over for every subsequent packet of a session. Rather, the packet header key may be sought in the connection table, and, if found, a stored action or decision enforced.
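The caching behavior described above can be illustrated with a minimal sketch. It is not taken from any particular firewall; the names `ConnectionTable`, `handle` and `example_policy`, the four-field key layout, and the port-443 policy are all assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple

# Hypothetical key layout: (source IP, destination IP, source port, destination port).
FlowKey = Tuple[str, str, int, int]


class ConnectionTable:
    """Caches the decision reached for a session so later packets skip full analysis."""

    def __init__(self, analyze: Callable[[FlowKey], str]) -> None:
        self._analyze = analyze                    # slow path: full firewall analysis
        self._decisions: Dict[FlowKey, str] = {}   # fast path: stored decisions
        self.slow_path_calls = 0

    def handle(self, key: FlowKey) -> str:
        decision = self._decisions.get(key)
        if decision is None:
            # First packet of the session: call the firewall software once
            # and store the decision together with the key.
            self.slow_path_calls += 1
            decision = self._analyze(key)
            self._decisions[key] = decision
        return decision


def example_policy(key: FlowKey) -> str:
    # Hypothetical security policy: permit only traffic to destination port 443.
    return "permit" if key[3] == 443 else "deny"
```

In this sketch, calling `handle` repeatedly with the same key invokes `example_policy` only once; every later packet of the session is answered from the stored decision.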
The connection table may be considered to be within a firewall accelerator, meaning a set of functions that enhance the speed or performance of a firewall.
Modern communications may include analysis of many thousands of TCP sessions at one point in a network. If a connection table is to be used as above, then it may happen that the large number of connections sometimes occurring will exceed the storage capacity of the table. It is desirable, therefore, to make efficient use of the table. The goal is to map all the packets of one session (with one, common action) to the smallest number of distinct table entries or slots.
Operation of connection tables can be complicated by the use of Network Address Translation (NAT). NAT is described by the Internet Engineering Task Force (IETF) in Request for Comments (RFC) 3022, which is available at http://www.ietf.org/rfc/rfc3022.txt?number=3022
NAT may change some header values in the packets of one session. As a result, whether all the packets of one session map to one table slot depends on how NAT is applied in the network.
A hash function is a mathematical function applied to the distinguishing header values of a packet. The input therefore is the ordered concatenation of bits from one or more packet headers (typically four header fields, as explained below). The output of a hash function is generally a smaller number of bits. The smaller number can be used as an index or label of a table slot.
When a packet arrives at the network device containing the firewall function, it must be recognized. To accomplish lookup of a packet in the connection table, a hash function is applied to its headers, collectively constituting a key. The hash function may be simple (selection of some key bits) or complex (a mathematical function applied to some or all key bits). The value of the hash function is an index into the lookup table. Each slot in the table is indexed, for example, by using all the binary numbers of length 16 from 0000000000000000 through 1111111111111111.
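As a rough illustration of the two kinds of hash function mentioned above, the following sketch builds a key from four header fields and reduces it to a 16-bit table index. The choice of fields (IPv4 source and destination addresses, source and destination ports), the function names, and the use of CRC-32 as the "complex" function are assumptions for this example, not details given in the text.

```python
import ipaddress
import struct
import zlib


def pack_key(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Concatenate four header fields, in a fixed order, into a 96-bit key."""
    return struct.pack(
        "!IIHH",
        int(ipaddress.IPv4Address(src_ip)),
        int(ipaddress.IPv4Address(dst_ip)),
        src_port,
        dst_port,
    )


def simple_hash(key: bytes) -> int:
    """Simple hash: select the last 16 key bits (here, the destination port)."""
    return int.from_bytes(key[-2:], "big")


def complex_hash(key: bytes) -> int:
    """Complex hash: mix all key bits with CRC-32, then fold to 16 bits."""
    crc = zlib.crc32(key)
    return (crc ^ (crc >> 16)) & 0xFFFF
```

Either function returns a value from 0 through 65535, matching the 16-bit slot indices in the example above. The simple version is cheap but maps every session to the same destination port into the same slot, which is one reason a function that mixes all key bits may be preferred.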
The index (hash function output) derived from an item may point to a memory location with zero, exactly one, or more than one stored (cached) memory. Since the table slot is found by direct application of the hash function, the table is called a Direct Table (DT).
If the DT memory location has stored zero memories, then there is a miss and a new memory with its new action must be added to the lookup system. If there is exactly one stored memory for the table slot, then the table points to the one stored memory. The full key is then compared with the stored full key. If there is a match, then the action stored with the memory is applied. If there is not a match, then there is a miss. Again, in case of a miss, the new memory and its new action must be added to the lookup mechanism. If two or more memories share the DT slot index that was hit, then the full key of the item may be analyzed by an attached Patricia tree (see D. Knuth, The Art of Computer Programming, Addison-Wesley, Reading, Mass., 2nd ed., 1998, vol. 3, p. 498). The Patricia tree is attached in the sense that the DT slot contains a pointer to it. The Patricia tree contains at least one branch. Also, the two or more memories appear as leaves of the Patricia tree. The Patricia tree tests key bits until at most one stored memory might fit the item. The full item key is then compared with the stored key in memory. If there is a match, then the stored action is applied. If there is not a match, then there is again a miss. Then the key and its action may be stored as a new memory in the connection table.
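The lookup procedure above can be sketched as follows. This is a simplified illustration under stated assumptions: keys are 96-bit integers, the hash is a hypothetical low-bit selection, and the tree is a basic bit-testing (crit-bit style) tree standing in for a full Patricia tree implementation. All class and function names are invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Union

KEY_BITS = 96  # assumed key width


@dataclass
class Leaf:
    key: int
    action: str


@dataclass
class Branch:
    bit: int        # key bit tested at this node, 0 = most significant
    zero: "Node"
    one: "Node"


Node = Union[Leaf, Branch]


def bit_at(key: int, pos: int) -> int:
    return (key >> (KEY_BITS - 1 - pos)) & 1


def find_leaf(node: Node, key: int) -> Leaf:
    # Test key bits until at most one stored memory might fit the item.
    while isinstance(node, Branch):
        node = node.one if bit_at(key, node.bit) else node.zero
    return node


def splice(node: Node, key: int, action: str, pos: int) -> Node:
    # Descend while the tested bits come before the first differing bit.
    if isinstance(node, Branch) and node.bit < pos:
        if bit_at(key, node.bit):
            node.one = splice(node.one, key, action, pos)
        else:
            node.zero = splice(node.zero, key, action, pos)
        return node
    new_leaf = Leaf(key, action)
    return Branch(pos, node, new_leaf) if bit_at(key, pos) else Branch(pos, new_leaf, node)


def insert(node: Optional[Node], key: int, action: str) -> Node:
    if node is None:
        return Leaf(key, action)           # slot was empty: store one memory
    candidate = find_leaf(node, key)
    if candidate.key == key:
        candidate.action = action          # key already stored: update action
        return node
    first_diff = KEY_BITS - (key ^ candidate.key).bit_length()
    return splice(node, key, action, first_diff)


class DirectTable:
    def __init__(self, index_bits: int = 16) -> None:
        self.mask = (1 << index_bits) - 1
        self.slots: List[Optional[Node]] = [None] * (1 << index_bits)

    def index(self, key: int) -> int:
        return key & self.mask             # hypothetical hash: select low key bits

    def lookup(self, key: int) -> Optional[str]:
        node = self.slots[self.index(key)]
        if node is None:
            return None                    # miss: zero memories in the slot
        leaf = find_leaf(node, key)
        # Full key comparison decides between a hit and a miss.
        return leaf.action if leaf.key == key else None

    def store(self, key: int, action: str) -> None:
        i = self.index(key)
        self.slots[i] = insert(self.slots[i], key, action)
```

A slot holding a single `Leaf` corresponds to the "exactly one stored memory" case, and a slot holding a `Branch` corresponds to the attached tree. A `lookup` that returns `None` is a miss, after which the caller would obtain a decision and call `store`.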
Prior art includes using different hash values for related packets that have different directions and different NAT processes, even though many action types would be common to all. This generally consumes a different table slot for each combination. Therefore, an alternate technique to map the closely related keys of one session into a common table slot and Patricia tree leaf is needed.
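The problem can be made concrete with a small sketch. It illustrates only the difficulty described above, not any solution. The addresses, ports, helper names, and the assumption that the firewall observes packets on both sides of a source NAT are all hypothetical; which keys a given firewall actually sees depends on where it sits relative to the NAT function.

```python
from typing import Dict, Tuple

# Hypothetical key layout: (source IP, destination IP, source port, destination port).
FlowKey = Tuple[str, str, int, int]


def reverse(key: FlowKey) -> FlowKey:
    """Key of a packet travelling in the opposite direction."""
    src_ip, dst_ip, src_port, dst_port = key
    return (dst_ip, src_ip, dst_port, src_port)


def apply_source_nat(key: FlowKey, public_ip: str, public_port: int) -> FlowKey:
    """Key after NAT rewrites the source address and port."""
    _, dst_ip, _, dst_port = key
    return (public_ip, dst_ip, public_port, dst_port)


outbound = ("10.0.0.5", "198.51.100.7", 51000, 443)
translated = apply_source_nat(outbound, "203.0.113.1", 62000)

# Four header combinations that all belong to one session with one action.
session_keys = [outbound, translated, reverse(outbound), reverse(translated)]

table: Dict[FlowKey, str] = {}
for key in session_keys:
    table[key] = "permit"   # each distinct key is stored as a separate entry
```

Because the four keys differ, a table keyed directly on the header values holds four entries for a single session, even though the stored action is the same in each.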