1. Field of the Invention
This invention relates generally to content and context based analysis of messages and documents in the area of network security infrastructure and web services, and more specifically to microprocessors configured to provide security through content based evaluation of incoming messages for packet based networks.
2. Description of the Related Art
Network bandwidth has increased dramatically to support gigabit speeds, enabling the enterprise systems and high volume electronic commerce (e-commerce) sites associated with the advent of the Internet. However, security systems configured to protect these networks from internal or external attacks have not developed in either sophistication or speed to provide adequate protection.
Intrusion detection systems (IDS) for high bandwidth packet based networks provide security by analyzing the wrapper or header of a message. However, a move is underway to provide security for packet based networks by looking at the actual content of the message, rather than by looking at the network layer header information or relying on encryption. Accordingly, devices in the middle of the network must understand the content of a message in the context of a sequence of message transactions to provide adequate security from hackers or insiders. FIG. 1 is a simplified schematic diagram of the security infrastructure for an enterprise system. External client 100 communicates with server 114 through distributed network 102, such as the Internet. Access to network 102 may be provided by an Internet service provider (ISP). Server side 104 includes middle devices such as firewall 106, router 108 and switch 110. Switch 110 is in communication with server 114. Server 114 has access to database 116. Alternatively, the data path may proceed through IDS 112 and then through switch 110. It should be appreciated that it is desired to protect server 114 from internal clients, such as client 103, which may be used to hack into the server. Currently, few protections against internal clients are available. One skilled in the art will appreciate that the architecture of the security infrastructure can vary; however, each of the architectures employs some type of IDS for security, i.e., some type of architecture incorporating the middle devices described above.
One of the shortcomings of the intrusion detection systems typically employed to provide security for gigabit speed networks is that the IDS works at the packet level only and cannot handle the Transmission Control Protocol/Internet Protocol (TCP/IP) traffic fast enough to provide adequate protection as the network speed increases. As the intrusion detection system reaches its maximum processing capacity, a large number of packets are dropped. Consequently, the possibility of missing attacks significantly increases due to the dropped packets. Additionally, current intrusion detection systems may be overwhelmed by hacker tools that generate numerous suspicious events so that a hacker may sneak through the system. These tools can also cause the IDS to completely break down. Furthermore, when looking at the packet by packet information, only pieces of a message are being looked at. Thus, the pieces may get through the IDS separately and then be reassembled downstream to execute an unwanted intrusion. Firewalls do not cure the deficiencies of the IDS, because packets such as web traffic, i.e., traffic transferred via hypertext transfer protocol (HTTP), are generally allowed to pass through the firewall. Enterprise networks are nevertheless being configured to include IDSs, without addressing any of these deficiencies.
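The fragmentation problem described above can be illustrated with a minimal sketch. It is not taken from any particular IDS; the signature string, the function names, and the two-packet payload are hypothetical, and the sketch assumes the packets arrive in order so that simple concatenation stands in for reassembly.

```python
# Hypothetical attack signature that a packet-level IDS searches for.
SIGNATURE = b"/bin/sh"


def inspect_per_packet(packets):
    """Return True if any single packet payload contains the signature."""
    return any(SIGNATURE in payload for payload in packets)


def inspect_reassembled(packets):
    """Return True if the signature appears in the reassembled message.

    Assumes in-order delivery, so reassembly is plain concatenation.
    """
    return SIGNATURE in b"".join(packets)


# The signature is split across two packets, so neither piece matches alone.
packets = [b"GET /cgi-bin/run?cmd=/bi", b"n/sh HTTP/1.1\r\n"]

per_packet_hit = inspect_per_packet(packets)
reassembled_hit = inspect_reassembled(packets)
```

In this sketch the per-packet check misses the split signature, while the check on the reassembled message finds it, which is the behavior a downstream server would effectively see.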
Another shortcoming of the intrusion detection systems is that they analyze the encapsulation of the transmitted data, e.g., packet headers for packet based protocols, to detect attack signatures. Providing security through detection of the attack signatures leaves the system vulnerable to newly developed attack signatures constantly being thrown at networks by hackers. Thus, the content of the packet is unknown to the IDS. Additionally, systems based on regular expression searches, which are typically performed on level 7 (L7) string signatures, have limited capabilities with respect to content based evaluation for the IDS. Because of the limitations of regular expression searches, many false positives are generated. For example, instructions for finding a .exe file in a GET request will generate matches for .exel files, exempt files, and .exe in parameters or comments of files. Even as the regular expression is refined further to handle the .exe files, false positives still occur, and real intrusions become buried among a large number of incorrect alerts that can be used purposely by a hacker to create and leverage security holes. Moreover, as the number of regular expression rules increases, the memory requirements significantly expand, e.g., for 500,000 regular expression rules more than 1 gigabyte of memory may be required. Furthermore, if one of the rules changes, the entire gigabyte or more of memory for the regular expression rules must be rewritten, which requires that the system be brought down for some period of time. Furthermore, the processors supporting the IDSs cannot analyze the grammar and contextual information, or handle the state information that has to be maintained across sessions, as is required to create robust content based security devices. In short, current processors are unable to handle the in-line processing demands posed by content based security systems.
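The false positive example above can be sketched as follows. This is an illustration only, not the method of the invention: the sample request lines and function names are hypothetical, and the second function stands in for any check that parses the request according to its structure rather than searching it as a flat string.

```python
import posixpath
import re
from urllib.parse import urlsplit

# Naive L7 string signature: any occurrence of ".exe" in the request line.
NAIVE_SIGNATURE = re.compile(r"\.exe")

# Hypothetical GET request lines used only for illustration.
requests = [
    "GET /tools/setup.exe HTTP/1.1",   # the case the rule is meant to catch
    "GET /docs/report.exel HTTP/1.1",  # different extension, same prefix
    "GET /search?q=.exe HTTP/1.1",     # ".exe" appears only in a parameter
    "GET /index.html HTTP/1.1",        # unrelated request
]


def naive_alerts(lines):
    """Flag every request line that contains the substring ".exe"."""
    return [line for line in lines if NAIVE_SIGNATURE.search(line)]


def structure_aware_alerts(lines):
    """Flag GET requests whose path (not query string) ends in ".exe"."""
    alerts = []
    for line in lines:
        method, target, _version = line.split(" ", 2)
        path = urlsplit(target).path  # drops the query string
        extension = posixpath.splitext(path)[1].lower()
        if method == "GET" and extension == ".exe":
            alerts.append(line)
    return alerts
```

Under these assumptions, the naive search flags the first three request lines, while the structure-aware check flags only the request for setup.exe.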
FIG. 2 is a simplified schematic diagram of the configuration of an in-line content based intrusion detection and prevention system for a network. Transmission control protocol (TCP)/IP forwarding 130 is provided by hardware associated with network boxes. For example, the network boxes may include hardware such as switches, routers, appliances, etc. Software provides the functionality for sockets interface (I/F) 132, special purpose analysis software 134, and control configuration 136. Under the configuration illustrated in FIG. 2, the TCP/IP connection is terminated and the packet is transmitted to socket I/F 132. Special purpose analysis software 134 examines the packet headers, and control configuration 136 determines which rules to apply based on the packet header examination. Special purpose analysis software 134 becomes a bottleneck in the processing and is not able to keep up with gigabit speed networks.
Web services (and resource virtualization) are increasingly taking over the computing paradigm and enterprise applications with the advent of such architectures as MICROSOFT .NET, grid computing, and peer-to-peer networking. Each of these stresses the processing resources of the network infrastructure, as they require devices in the network to understand the content of messages and the documents embedded within them. Use of XML and meta-data allows for efficient routing to appropriate servers as well as correct visualization by a client based on interpreting the message content.
As a result, there is a need to solve the problems of the prior art to provide a method and apparatus that allows for the evaluation of the content of a message based on the grammar that generated the message, and which will be used by the server to understand the message, and simultaneously minimizes the false positives generated by current systems. In addition, a processor or processing device configured to support a content based security system is needed.