The use of networked computer systems in e-commerce, distributed data processing, and many other areas is constantly growing. Valuable data, ranging from confidential information to countless financial transactions, are constantly passed over networks. Unfortunately, as the importance of networked transactions increases, so do the associated risks. As networks and networked systems expand, so do the possibilities for both random errors and damage wrought by hackers, thieves, and spies. Accordingly, it is important for networked systems to have effective intrusion detection systems, firewalls, network monitors, and other systems to help preserve data security and integrity.
Application-level protocol analyzers are becoming increasingly important components of network security and integrity systems. Application-level protocol analyzers monitor communications to an application in real time or analyze recorded traces of network communications to construct the protocol context of communication sessions. Because it may not be possible to determine from the content of a message alone whether it represents normal or malicious traffic, protocol analyzers assess both the message content and its context. Protocol analyzers translate a datastream into messages, group them into sessions, and model state transitions in a protocol state machine to evaluate whether a particular message presents a problem.
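By way of illustration only, the first two steps described above (translating a datastream into messages and grouping the messages into sessions) might be sketched as follows. The newline-delimited format, the leading session identifier, and the function names `split_messages` and `group_into_sessions` are assumptions made for this sketch and do not correspond to any particular protocol.

```python
from collections import defaultdict


def split_messages(datastream: str) -> list[str]:
    """Translate a datastream into messages.

    Assumption: one message per line, blank lines ignored.
    """
    return [line.strip() for line in datastream.split("\n") if line.strip()]


def group_into_sessions(messages: list[str]) -> dict[str, list[str]]:
    """Group messages into sessions, preserving arrival order.

    Assumption: each message has the toy form "<session_id> <msg_type>".
    """
    sessions: dict[str, list[str]] = defaultdict(list)
    for message in messages:
        session_id, _, msg_type = message.partition(" ")
        sessions[session_id].append(msg_type)
    return dict(sessions)


if __name__ == "__main__":
    stream = "A Msg1\nB Msg1\nA Msg2\n"
    print(group_into_sessions(split_messages(stream)))
```

Each per-session message list could then be evaluated against a protocol state machine to decide whether a given message is acceptable in its context.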
The protocol context for an application can be conceived as a particular path or traversal of the states the application can be anticipated to assume based on communications received over a network. Thus, the application states can be represented by a state machine, and a protocol context constitutes a path through the state machine, as illustrated in FIG. 1.
FIG. 1 illustrates a state machine 100 for a simple application. The application represented by the state machine 100 transitions from one state to another according to the messages it receives over a network and how message handlers process them. More specifically, FIG. 1 shows an exemplary protocol context 102, drawn as a dotted line, that traces the response of the application as it receives various messages and a number of handlers process them.
The protocol context 102 begins at an initial state, WaitingforMsg1 104. Upon arrival of Msg1 106, the context 102 progresses to Msg1_Handler 108 to process the message. The content of the Msg1 106 will determine whether the Msg1_Handler 108 causes the system to transition to either a WaitforMsg2 state 110 or a WaitforMsg3 state 116. As indicated by the protocol context 102, based on the content of the Msg1 106, the Msg1_Handler causes the system to transition to the WaitforMsg2 state 110.
When a Msg2 112 is received, a Msg2_Handler 114 processes the Msg2 112 and causes the system to transition back to the WaitingforMsg1 state 104. Once back at the WaitingforMsg1 state 104, another Msg1 106 is received. Based on the content of the Msg1 106, the Msg1_Handler 108 this time causes the system to transition to the WaitforMsg3 state 116. While in the WaitforMsg3 state 116, a Msg3 118 is received. A Msg3_Handler 120 processes the Msg3 118, resulting in the system transitioning to a Final state 122, where the protocol context 102 ends.
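By way of illustration only, the traversal just described can be sketched as a table-driven state machine. The state, message, and handler names follow FIG. 1; the content values `"to_msg2"` and `"to_msg3"` and the function `trace_context` are hypothetical, since the description does not say what content of a Msg1 selects each branch.

```python
from enum import Enum, auto


class State(Enum):
    WAITING_FOR_MSG1 = auto()
    WAIT_FOR_MSG2 = auto()
    WAIT_FOR_MSG3 = auto()
    FINAL = auto()


def msg1_handler(content: str) -> State:
    # Assumption: a made-up content value selects the branch.
    if content == "to_msg2":
        return State.WAIT_FOR_MSG2
    return State.WAIT_FOR_MSG3


def msg2_handler(content: str) -> State:
    return State.WAITING_FOR_MSG1


def msg3_handler(content: str) -> State:
    return State.FINAL


# (current state, message type) -> handler that chooses the next state
HANDLERS = {
    (State.WAITING_FOR_MSG1, "Msg1"): msg1_handler,
    (State.WAIT_FOR_MSG2, "Msg2"): msg2_handler,
    (State.WAIT_FOR_MSG3, "Msg3"): msg3_handler,
}


def trace_context(messages, start=State.WAITING_FOR_MSG1):
    """Return the path of states visited, i.e. the protocol context."""
    state = start
    path = [state]
    for msg_type, content in messages:
        handler = HANDLERS.get((state, msg_type))
        if handler is None:
            # A message that is not valid in the current state.
            raise ValueError(f"unexpected {msg_type} in state {state.name}")
        state = handler(content)
        path.append(state)
    return path


if __name__ == "__main__":
    context = trace_context([
        ("Msg1", "to_msg2"), ("Msg2", ""), ("Msg1", "to_msg3"), ("Msg3", ""),
    ])
    print(" -> ".join(s.name for s in context))
```

In this sketch, a message that arrives in a state with no matching handler raises an error, which is one simple way an analyzer could flag traffic that is out of context.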
The exemplary protocol context 102 illustrated in FIG. 1 is a simple example, in a system having only three message types, three handlers, four states, and a few possible different state transitions based on the messages and responses of the handlers to those messages. As is well understood, modeling an application may involve a large number of message types, states, handlers, and transitions. Thus, developing a protocol analyzer capable of monitoring numerous message types, states, handlers, and transitions may prove very difficult.
Further complicating the process of protocol context analysis is the fact that many communications involve layers of protocols. A message of one protocol type may be transmitted in a datastream of another protocol. For example, remote procedure call (RPC) messages may be included within one or more hypertext transfer protocol (HTTP) messages, as shown in FIG. 2.
FIG. 2 illustrates the layers of messages an application-level RPC over HTTP protocol analyzer 200 will confront in attempting to monitor data communications in an RPC over HTTP system. The protocol analyzer 200 monitors communications in a datastream 202, which includes RPC messages that eventually will be passed to an RPC session 204. Because the RPC messages are transmitted in an RPC over HTTP system, every type of message will involve two layers of message types, states, handlers, and transitions. To illustrate with a few examples, each RPC request 210 will be encompassed in an HTTP request 212. Similarly, an RPC acknowledgement 220 will be encompassed within an HTTP reply 222, and an RPC bind 230 will be included within another HTTP request 232. Because attacks on the system receiving the datastream 202 may be made at various points at either the HTTP or RPC level, both protocols should be monitored to protect data security and data integrity.
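By way of illustration only, the layering described above can be sketched as two parsers applied in sequence: an outer one for the HTTP message and an inner one for the RPC message carried in its body. The inner encoding used here (a single leading type byte) is a toy format invented for this sketch and is not the actual RPC over HTTP wire format; the names `parse_http`, `parse_toy_rpc`, and `analyze` are likewise hypothetical.

```python
# Assumption: toy inner encoding, first body byte is the RPC message type.
TOY_RPC_TYPES = {1: "request", 2: "acknowledgement", 3: "bind"}


def parse_http(raw: bytes) -> dict:
    """Outer layer: split an HTTP message into start line, headers, body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # Replies start with the HTTP version; requests start with a method.
    kind = "reply" if start_line.startswith("HTTP/") else "request"
    return {"kind": kind, "start_line": start_line,
            "headers": headers, "body": body}


def parse_toy_rpc(body: bytes) -> dict:
    """Inner layer: decode the toy RPC message carried in the HTTP body."""
    if not body or body[0] not in TOY_RPC_TYPES:
        raise ValueError("body does not carry a recognizable toy RPC message")
    return {"type": TOY_RPC_TYPES[body[0]], "payload": body[1:]}


def analyze(raw: bytes) -> tuple[str, str]:
    """Report the message kind seen at each of the two layers."""
    http = parse_http(raw)
    rpc = parse_toy_rpc(http["body"])
    return http["kind"], rpc["type"]


if __name__ == "__main__":
    raw = b"POST /rpc HTTP/1.1\r\nHost: example\r\n\r\n\x03payload"
    print(analyze(raw))
```

A fuller analyzer would also maintain a separate state machine for each layer, since a message can be well formed at the HTTP level yet out of context at the RPC level, or the reverse.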
Conventionally, to create an application-level protocol analyzer, one has to create a protocol analysis program ad hoc. Generally, a low-level programming language is used to create the protocol analyzer, because of the precise logical determinations and operations that will be involved in parsing and processing the messages received. In the case of layered protocols, as described in relation to FIG. 2, application-level protocol analyzers must be written to track both protocols to maintain data security and integrity at both protocol levels. Unfortunately, this can be a very complicated process, as described in connection with FIG. 3.
FIG. 3 diagrams a conventional process 300 for creating an application-level protocol analyzer. For the application for which the protocol analyzer is to be created, at 302, a next protocol for which messages will be analyzed for the application is identified. At 304, it is determined if a detailed specification exists describing how the application responds to messages of the identified protocol and other events. This is a nontrivial aspect of the process, for there may not be a readily available, sufficiently detailed specification from which a protocol analyzer can be created for the application. If it is determined at 304 that there is not an adequate specification for how the application responds to messages of the identified protocol, at 306, the specification is written.
Once a specification has been found or created, at 308, the code for the protocol analyzer is written. Thus, if no specification exists for the protocol, creating a protocol analyzer involves creating a specification, then writing a program to create the protocol analyzer. Protocol analyzers generally are developed using a general purpose, low-level programming language such as C. Creating the protocol analyzer usually requires understanding a large body of source code and writing thousands of lines of code. In addition, once the code is created at 308, at 310, the protocol analyzer is subjected to comprehensive testing to ensure that the code is free of errors and correctly models the response of the system to messages as indicated in the specification. Thus, the development and testing of a protocol analyzer is a demanding software development challenge.
Moreover, as described in connection with FIG. 2, an application may communicate using multiple protocols. Thus, at 312, it is determined if the application will receive and respond to messages in additional protocols. If so, the process 300 loops to 302 to identify the next protocol for which messages will be analyzed for the application. Once the protocol is identified, creation of the next protocol analyzer may involve writing an additional specification at 306 to detail how an application responds to messages of this next protocol, writing additional protocol analyzer code at 308, and further testing at 310. Once a protocol analyzer that accounts for how an application responds to messages of the anticipated protocols has been created and tested, at 314, the protocol analyzer finally may be implemented.
In addition to the potentially burdensome process 300 illustrated by FIG. 3, there are additional concerns facing those who desire to create protocol analyzers. First, the labors of the process 300 to create a protocol analyzer may have to be independently borne for each application for which protocol analysis is desired. The ad hoc program used to create a protocol analyzer for one application may be difficult or impossible to adapt to another application; thus, each application may involve creation of an entirely separate protocol analyzer. Second, available tools to simplify the process of creating protocol analyzers generally are suited only to processing binary protocols, and provide no help in developing a protocol analyzer to analyze text-based protocols.