1. Field of the Invention
The present invention generally relates to parsing of documents such as an XML™ document and, more particularly to parsing a document or other logical sequence of network data packets for detecting potential intrusion or an attack on a node of a network.
2. Description of the Prior Art
The field of digital communications between computers and the linking of computers into networks has developed rapidly in recent years, similar, in many ways to the proliferation of personal computers of a few years earlier. This increase in interconnectivity and the possibility of remote processing has greatly increased the effective capability and functionality of individual computers in such networked systems. Nevertheless, the variety of uses of individual computers and systems, preferences of their users and the state of the art when computers are placed into service has resulted in a substantial degree of variety of capabilities and configurations of individual machines and their operating systems, collectively referred to as “platforms” which are generally incompatible with each other to some degree particularly at the level of operating system and programming language.
This incompatibility of platform characteristics and the simultaneous requirement for the capability of communication and remote processing and a sufficient degree of compatibility to support it has resulted in the development of object oriented programming (which accommodates the concept of assembling an application as well as data as a group of more or less generalized modules through a referencing system of entities, attributes and relationships) and a number of programming languages to embody it. Extensible Markup Language™ (XML™) is such a language which has come into widespread use and can be transmitted as a document over a network of arbitrary construction and architecture.
In such a language, certain character strings correspond to certain commands or identifications, including special characters and other important data (collectively referred to as control words) which allow data or operations to, in effect, identify themselves so that they may be, thereafter treated as “objects” such that associated data and commands can be translated into the appropriate formats and commands of different applications in different languages in order to engender a degree of compatibility of respective connected platforms sufficient to support the desired processing at a given machine. The detection of these character strings is performed by an operation known as parsing, similar to the more conventional usage of resolving the syntax of an expression, such as a sentence, into its component parts and describing them grammatically.
When parsing an XML™ document, a large portion and possibly a majority of the central processor unit (CPU) execution time is spent traversing the document searching for control words, special characters and other important data as defined for the particular XML™ standard being processed. This is typically done by software which queries each character and determines if it belongs to the predefined set of strings of interest, for example, a set of character strings comprising the following “<command>”, “<data=dataword>”, “<endcommand>”, etc. If any of the target strings are detected, a token is saved with a pointer to the location in the document for the start of the token and the length of the token. These tokens are accumulated until the entire document has been parsed.
The conventional approach is to implement a table-based finite state machine (FSM) to search for these strings of interest. The state table resides in memory and is designed to search for the specific patterns in the document. The current state is used as the base address into the state table and the ASCII representation of the input character is an index into the table. For example, assume the state machine is in state 0 (zero) and the first input character is ASCII value 02, the absolute address for the state entry would be the sum/concatenation of the base address (state 0) and the index/ASCII character (02). The FSM begins with the CPU fetching the first character of the input document from memory. The CPU then constructs the absolute address in the state table in memory corresponding to the initialized/current state and the input character and then fetches the state data from the state table. Based on the state data that is returned, the CPU updates the current state to the new value, if different (indicating that the character corresponds to the first character of a string of interest) and performs any other action indicated in the state data (e.g. issuing a token or an interrupt if the single character is a special character or if the current character is found, upon a further repetition of the foregoing, to be the last character of a string of interest).
The above process is repeated and the state is changed as successive characters of a string of interest are found. That is, if the initial character is of interest as being the initial character of a string of interest, the state of the FSM can be advanced to a new state (e.g. from initial state 0 to state 1). If the character is not of interest, the state machine would (generally) remain the same by specifying the same state (e.g. state 0) or not commanding a state update) in the state table entry that is returned from the state table address. Possible actions include, but are not limited to, setting interrupts, storing tokens and updating pointers. The process is then repeated with the following character. It should be noted that while a string of interest is being followed and the FSM is in a state other than state 0 (or other state indicating that a string of interest has not yet been found of currently being followed) a character may be found which is not consistent with a current string but is an initial character of another string of interest. In such a case, state table entries would indicate appropriate action to indicate and identify the string fragment or portion previously being followed and to follow the possible new string of interest until the new string is completely identified or found not to be a string of interest. In other words, strings of interest may be nested and the state machine must be able to detect a string of interest within another string of interest, and so on. This may require the CPU to traverse portions of the XML™ document numerous times to completely parse the XML™ document.
The entire XML™ or other language document is parsed character-by-character in the above-described manner. As potential target strings are recognized, the FSM steps through various state character-by-character until a string of interest is fully identified or a character inconsistent with a possible string of interest is encountered (e.g. when the string is completed/fully matched or a character deviates from a target string). In the latter case, no action is generally taken other than returning to the initial state or a state corresponding to the detection of an initial character of another target string. In the former case, the token is stored into memory along with the starting address in the input document and the length of the token. When the parsing is completed, all objects will have been identified and processing in accordance with the local or given platform can be started.
Since the search is generally conducted for multiple strings of interest, the state table can provide multiple transitions from any given state. This approach allows the current character to be analyzed for multiple target strings at the same time while conveniently accommodating nested strings.
It can be seen from the foregoing that the parsing of a document such as an XML™ document requires many repetitions and many memory accesses for each repetition. Therefore, processing time on a general purpose CPU is necessarily substantial. A further major complexity of handling the multiple strings lies in the generation of the large state tables and is handled off-line from the real-time packet processing. However, this requires a large number of CPU cycles to fetch the input character data, fetch the state data and update the various pointers and state addresses for each character in the document. Thus, it is relatively common for the parsing of a document such as an XML™ document to fully pre-empt other processing on the CPU or platform and to substantially delay the processing requested.
It has been recognized in the art that, through programming, general-purpose hardware can be made to emulate the function of special purpose hardware and that special purpose data processing hardware will often function more rapidly than programmed general purpose hardware even if the structure and program precisely correspond to each other since there is less overhead involved in managing and controlling special purpose hardware. Nevertheless, the hardware resources required for certain processing may be prohibitively large for special purpose hardware, particularly where the processing speed gain may be marginal. Further, special purpose hardware necessarily has functional limitations and providing sufficient flexibility for certain applications such as providing the capability of searching for an arbitrary number of arbitrary combinations of characters may also be prohibitive. Thus, to be feasible, special purpose hardware must provide a large gain in processing speed while providing very substantial hardware economy; requirements which are increasingly difficult to accommodate simultaneously as increasing amounts of functional flexibility or programmability are needed in the processing function required.
In this regard, the issue of system security is also raised by both interconnectability and the amount of processing time required for parsing a document such as an XML™ document. On the one hand, any process which requires an extreme amount of processing time at relatively high priority is, in some ways, similar to some characteristics of a denial-of-service (DOS) attack on the system or a node thereof or can be a tool that can be used in such an attack.
DOS attacks frequently present frivolous or malformed requests for service to a system for the purpose of maliciously consuming and eventually overloading available resources. Proper configuration of hardware accelerators can greatly reduce or eliminate the potential to overload available resources. In addition, systems often fail or expose security weaknesses when overloaded. Thus, eliminating overloads is an important security consideration.
Further, it is possible for some processing to begin and some commands to be executed before parsing is completed since the state table must be able to contain CPU commands at basic levels which are difficult or impossible to secure without severe compromise of system performance. In short, the potential for compromise of security would be necessarily reduced by reduction of processing time for processes such as XML™ parsing but no technique for significantly reducing the processing time for such parsing has been available.
Many security systems rely on the ability to detect an attempted security breach at a very early stage and a security breach may be difficult or impossible to interrupt quickly or through programmed intervention, once begun. For example, a highly secure system has been proposed and is disclosed in U.S. patent applications Ser. Nos. 09/973,769 and 09/973,776, both assigned to the assignee of the present application. These applications disclose a system having two levels of internodal communications, one at very high speed, by which a node at which a possible attack or intrusion is detected can be compartmentalized and then automatically repaired, if necessary, before reconnection to the network. Acceleration of parsing therefore supports early response to a potential attack and is particularly advantageous in a system such as that disclosed in the system described in the above-incorporated patent applications since an appropriate control of the network can be initiated as an incident of parsing and can thus be initiated at an earlier time if parsing can be significantly accelerated. Proper network control, initiated in a timely fashion in response to a detection alert can effect intrusion prevention in addition to intrusion detection.