The present invention relates to security of data networks and specifically to a system and method for providing a high-speed firewall which protects networks while processing complex connections.
Firewall techniques involve using a set of rules to compare incoming data packets to a defined security policy. A firewall accepts and denies traffic between two or more network domains. In many cases, there are three domains. The first domain is an internal network, such as that of a corporate organization. Outside the internal network is a second network domain to which both the internal network and the outside world have access, sometimes known as a “demilitarized zone” or DMZ. The third domain is the external network of the outside world.
A firewall regulates the flow of data packets. A packet includes a header and a payload. The header includes header information (i.e. header parameters), which typically includes source and destination addresses, source and destination port numbers, and a protocol. The payload includes data conveyed by the packet from its source to its intended destination. The firewall, which is situated between the source and destination, intercepts the packet. The firewall filters packets based upon header information and a rule previously loaded into the firewall. The rule correlates a pattern in the header of a packet with a prescribed action, either PASS or DROP, or another action such as encrypting the packet, performing network address translation (NAT), sending a RESET packet, generating logs, or performing content inspection on the packet data. The filter identifies the rule that applies to the packet based upon the packet's header, and then implements the rule's prescribed action. When a DROP action is performed, the packet is blocked (deleted), and does not reach its intended destination. When a PASS action is performed, the packet is passed on toward its intended destination. The set of rules loaded into a firewall reflects a security policy, which prescribes what type of information is permitted to pass through the firewall, e.g., from which source, to which destination, and for which application.
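By way of illustration only, the header-based filtering described above may be sketched as follows. The sketch assumes first-match rule ordering, wildcard fields, and a default DROP action, none of which is stated above; the names `Header`, `Rule` and `filter_packet` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Header:
    """Header parameters of a packet, as listed in the text."""
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int
    protocol: str


@dataclass(frozen=True)
class Rule:
    """Correlates a header pattern with a prescribed action.

    A field left as None acts as a wildcard (an assumption of this sketch).
    """
    action: str
    src_addr: Optional[str] = None
    dst_addr: Optional[str] = None
    src_port: Optional[int] = None
    dst_port: Optional[int] = None
    protocol: Optional[str] = None

    def matches(self, header: Header) -> bool:
        pairs = (
            (self.src_addr, header.src_addr),
            (self.dst_addr, header.dst_addr),
            (self.src_port, header.src_port),
            (self.dst_port, header.dst_port),
            (self.protocol, header.protocol),
        )
        return all(want is None or want == got for want, got in pairs)


def filter_packet(header: Header, rules: Iterable[Rule], default: str = "DROP") -> str:
    """Return the action of the first matching rule (assumed ordering)."""
    for rule in rules:
        if rule.matches(header):
            return rule.action
    return default  # assumed policy: drop anything no rule covers
```

For example, a rule list holding a single PASS rule for TCP port 80 on one destination address would pass web traffic to that address and drop every other packet under the assumed default.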
To ensure sufficient capacity of a firewall, it is common to construct clusters, which typically include a number of firewall nodes sharing a common network address; connections are typically directed to the cluster by means of the cluster network address. Additionally, the nodes typically have node-specific addresses, e.g. MAC addresses. In a cluster, if the firewall nodes have a common cluster network address, all the firewall nodes read all data packets arriving at the cluster. Consequently, there has to be an arrangement for distinguishing which data packets belong to which node. Each node should process only those packets that are assigned to it, and should either not receive other data packets or receive them but ignore them.
Connections directed to a cluster of network elements are directed to different nodes of the cluster on the basis of predefined distribution criteria. Frequently, distributing is done so that each firewall node filters all arriving data packets and decides on the basis of the header field(s) of the packet whether that particular node needs to process that particular packet. Frequently, specific sets of hash values are allocated to the nodes and a hash value for a data packet is calculated using a predetermined hash function and certain header fields of the data packet. Typically the header fields that are used for calculating hash values for TCP/IP (Transmission Control Protocol/Internet Protocol) or for UDP/IP (User Datagram Protocol/Internet Protocol) are source address, source port, destination address and destination port. When a data packet directed to the cluster network address arrives at the cluster, a hash value is calculated on the basis of some header fields of the data packet, and the resulting hash value defines which node processes the data packet. Typically, all nodes filter all arriving data packets by calculating hash values for them, and then decide on the basis of the hash values which packets belong to each node. Methods other than calculating a hash from the header connection information may be used for distributing the data packets.
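The hash-based distribution described above may be sketched, purely as an illustration, as follows. The choice of SHA-256 as the predetermined hash function, the number of hash values, the node names, and the allocation of hash values to nodes are all assumptions of this sketch.

```python
import hashlib

NUM_HASH_VALUES = 8  # hypothetical size of the hash value space

# Hypothetical allocation of hash value sets to two cluster nodes.
NODE_HASH_VALUES = {
    "node-a": {0, 1, 2, 3},
    "node-b": {4, 5, 6, 7},
}


def packet_hash(src_addr: str, src_port: int, dst_addr: str, dst_port: int) -> int:
    """Compute a deterministic hash value from the four header fields."""
    key = f"{src_addr}:{src_port}>{dst_addr}:{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % NUM_HASH_VALUES


def node_should_process(node: str, src_addr: str, src_port: int,
                        dst_addr: str, dst_port: int) -> bool:
    """Each node runs this on every arriving packet and keeps only its own."""
    value = packet_hash(src_addr, src_port, dst_addr, dst_port)
    return value in NODE_HASH_VALUES[node]
```

Because the allocated sets are disjoint and together cover every hash value, exactly one node accepts any given packet. Note that this simple form hashes the fields in the order given, so the two directions of a connection are not guaranteed to reach the same node.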
Several prior art techniques are used to determine distribution of packets among the firewall nodes. Often, a return to sender (RTS) technique is used, in which for each connection the load balancer learns the MAC address of the firewall node so that replies from servers are directed to the correct firewall. Sometimes the load between the firewall nodes is balanced statically, without any dynamic adjustment of load between the firewall nodes, and/or a new connection is assigned according to a “round robin” technique, which distributes each new connection to the next firewall node in a queue without regard to the actual availability of the node. A simple query, such as a “ping”, may be used, and the time to respond to the “ping” is measured to roughly assess the availability of the firewall node.
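The “round robin” technique mentioned above may be illustrated by the following minimal sketch. The class name `RoundRobinBalancer` and the use of an opaque connection identifier are hypothetical; the sketch deliberately ignores node availability, as the text describes.

```python
from itertools import cycle
from typing import Dict, Hashable, Sequence


class RoundRobinBalancer:
    """Assigns each new connection to the next node in a fixed rotation."""

    def __init__(self, nodes: Sequence[str]) -> None:
        self._rotation = cycle(nodes)
        self.assignments: Dict[Hashable, str] = {}

    def assign(self, connection_id: Hashable) -> str:
        # A connection already seen keeps its node; only a new connection
        # advances the rotation, regardless of how busy the next node is.
        if connection_id not in self.assignments:
            self.assignments[connection_id] = next(self._rotation)
        return self.assignments[connection_id]
```

With three nodes, four successive new connections would be assigned to the first, second, third, and again the first node, whether or not the first node has capacity to spare.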
According to U.S. Pat. No. 6,880,089, a firewall clustering system connects two or more firewalls between an internal network and an external network. Firewalls maintain client-server state information. Flow controllers are connected to the firewalls and placed on both the internal “trusted” side and the external “untrusted” side of the firewalls. Flow controllers are placed on both sides of the firewalls to ensure that traffic for a given client-server connection flows through the same firewall in both inbound and outbound directions. The firewalls perform filtering operations and/or network address translation (NAT) services.
According to a method disclosed in US patent application publication 20030002494, node-specific lists of connections are maintained which specify for which connections each node of a firewall is responsible. A data packet which initiates opening of a new connection is processed in a node determined by a distribution decision according to predetermined distribution criteria. The first data packets are thus distributed to the cluster nodes. A data packet which relates to an opened packet data connection is processed in that node in whose connection list the opened packet data connection is specified. Changing the distribution criteria is required when load is not in balance between the nodes, or when a node is added to or removed from the cluster.
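The general idea of node-specific connection lists may be sketched as follows. This is an illustration of the description above and not the implementation of the cited publication; the class `Cluster`, the callable `distribute`, and the tuple used to identify a connection are hypothetical.

```python
from typing import Callable, Dict, Hashable, Optional, Sequence, Set


class Cluster:
    """Tracks which node is responsible for each opened connection."""

    def __init__(self, nodes: Sequence[str],
                 distribute: Callable[[Hashable], str]) -> None:
        self.connection_lists: Dict[str, Set[Hashable]] = {n: set() for n in nodes}
        self.distribute = distribute  # current distribution criteria

    def node_for(self, conn: Hashable, opens_connection: bool) -> Optional[str]:
        # A packet of an opened connection goes to the node listing it.
        for node, conns in self.connection_lists.items():
            if conn in conns:
                return node
        if not opens_connection:
            return None  # no known connection and not an opening packet
        # A connection-opening packet follows the distribution decision,
        # and the connection is recorded in that node's list.
        node = self.distribute(conn)
        self.connection_lists[node].add(conn)
        return node
```

In this sketch, replacing `distribute` affects only connections opened afterwards; connections already recorded in a node's list remain with that node.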
The prior art addresses load balancing between firewall nodes for “simple connections”. An important requirement of a load balancing device is to maintain connection stickiness, so that all packets belonging to the same connection are forwarded to the same firewall. Standard load balancers available on the market today can provide connection stickiness for simple connection types. However, for complex connections, such as when control and data are on different connections, e.g. FTP or voice over IP connections, or when NAT is applied and the NAT information is inserted into the payload, the prior art load balancing systems and methods are not appropriate, and different firewalls may be processing different packets of the same complex connection.
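The difficulty with complex connections may be illustrated using the FTP example above. In active-mode FTP, the endpoint of the data connection is announced inside the payload of the control connection by a PORT command of the form `PORT h1,h2,h3,h4,p1,p2` (RFC 959), so it cannot be derived from the header fields of the control connection alone. The following minimal sketch, with the hypothetical function name `expected_data_endpoint`, extracts that endpoint from a payload.

```python
import re
from typing import Optional, Tuple

# PORT h1,h2,h3,h4,p1,p2 : four address octets followed by two port octets.
_PORT_COMMAND = re.compile(
    rb"PORT (\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3})"
)


def expected_data_endpoint(payload: bytes) -> Optional[Tuple[str, int]]:
    """Return (address, port) announced by an FTP PORT command, else None."""
    match = _PORT_COMMAND.fullmatch(payload.strip())
    if match is None:
        return None
    h1, h2, h3, h4, p1, p2 = (int(group) for group in match.groups())
    if max(h1, h2, h3, h4, p1, p2) > 255:
        return None  # each field must fit in one octet
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2
```

A load balancer that distributes packets on header fields only has no knowledge of this announced endpoint, and may therefore send the resulting data connection to a different firewall than the one handling the control connection.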
There is thus a need for, and it would be highly advantageous to have, a system and method in which one or more firewalls of a firewall cluster manage the load balancer, specifically by inspecting the content, i.e. payload, of packets of a complex connection, and direct a switch regarding expected connections related to the complex connection.