The present invention relates to the field of computer system communication networks. In particular, the present invention pertains to network monitoring and management.
Computer systems linked to each other in a communication network are commonly used in businesses and like organizations. Computer system communication networks ("networks") are growing in size, as measured by the number of applications and the number of users they support, due to improvements in network reliability and the recognition of associated benefits such as increased productivity.
As the size of networks increases and as organizations become more reliant on such networks, the importance of effective network management tools also grows. In response to the need for standardization of such tools, primarily to control costs but also because components in a network are likely to originate from many different vendors, the Simple Network Management Protocol (SNMP) was developed and widely adopted. Since implementation of SNMP, a supplement to SNMP known as Remote Network Monitoring (RMON) has been issued, and RMON has been subsequently extended with an addition known as RMON2. RMON and RMON2 provide SNMP with the capability for remote network monitoring; that is, a network manager is able to monitor network performance from a central computer system that has access to other components on the network, referred to as RMON probes, that monitor local areas of the network.
SNMP, RMON and RMON2 thus are network management software tools that provide a set of standards for network management and control, including a standard protocol, a specification for database structure, and a set of data objects. SNMP, RMON and RMON2 are implemented in a network through management information bases (MIBs) which contain instructions specifying the data that are to be collected, how the data are to be identified, and other information pertinent to the purpose of network monitoring. The MIBs are implemented through the RMON probes to monitor the local areas of the network.
Network managers use the SNMP, RMON and RMON2 standards to collect information regarding the performance of the network. By collecting information about network performance and analyzing it, the network manager is able to recognize situations indicating that a problem is either present or impending.
For example, the network manager (or any of the network users, for that matter) may be interested in obtaining performance statistics such as the average and worst-case performance times and the reliability of the network for a particular network application. Such applications generally describe a transaction between a user that is accessing the network through a client computer system and a server computer system that responds to the client computer system with the requested information. Network managers need performance statistics to help them manage and maintain the network and to plan for network improvements. For example, performance statistics can be used to recognize bottlenecks in the network before they cause problems so that corrective action can be taken. If the performance statistics indicate a growing load in one area of the network, network traffic (in the form of data packets that travel through the network's communication equipment) can be routed along a different path. Statistics accumulated over a longer period of time can be used to help decide whether it is necessary to expand particular areas of the network.
Performance statistics are also necessary for businesses and the like to determine whether the network support provided by a vendor of network management services is satisfactory or not. Many businesses contract with vendors for network management services. Such contracts are typically implemented with service level agreements (SLAs) which specify metrics against which the provider of the network management services is measured. These metrics are used to quantify standards of performance that allow businesses to assess not only the performance of the network but also the performance of the network management services provider. SLAs generally include a provision specifying metrics for performance time for critical network applications, where performance time, for example, is considered to be the amount of time between the time a user submits a request via the network and the time the response to that request is received by the user. An effective network management tool should therefore provide a means for monitoring the network and gathering performance statistics for comparison against the requirements contained in the SLAs. However, as will be seen in the discussion below, the network management tools in the prior art do not provide a ready means of demonstrating compliance with SLAs.
Prior art network management tools have trouble aiding the network manager in determining whether a problem within the network is associated with the network or with the system hardware supporting the network, so that the network manager can identify and implement the appropriate corrective action. For example, if a user places a request for a particular network application to a server computer and a response is not received, the prior art network management tools do not generally identify whether the problem is occurring because of a bottleneck in the network or because the server is not functioning. Therefore, as will be seen in the discussion to follow, the network management tools in the prior art do not provide a ready means of monitoring performance of the entire network so that problems can be quickly detected.
With reference to FIG. 1, a prior art method used for network monitoring is illustrated for a simplified network 100. Network 100 is typically comprised of a plurality of client computer systems 110a, 110b and 110c networked with a number of different servers 130a, 130b and 130c. For this discussion, the focus is on client computer system 110c connected via communication lines 120 and 122 to server computer system 130c. Data packets (not shown) from client computer system 110c travel to server computer system 130c and back on either of communication lines 120 and 122, depending on the amount of traffic present on those lines due to simultaneous communications between client computer systems 110a and 110b and server computer systems 130a, 130b and 130c. The request data packets issued from client computer system 110c contain data that specify the address of client computer system 110c and the address of destination server computer system 130c, as well as other data pertinent to the network application being used, such as data defining the request being made. The response data packets issued from server computer system 130c also contain the sender and destination address as well as other data needed to respond to the request.
With reference still to FIG. 1, coupled into communication lines 120 and 122 are other communications equipment such as switches 124 and 125 and routers 126 and 127. Also on communication lines 120 and 122 are RMON probes 140 and 142 (the term "RMON" refers to both RMON and RMON2). An RMON probe typically operates in a promiscuous mode, observing every data packet that passes through the communication line to which it is coupled, but only that communication line.
RMON MIBs provide the capability to define filters that can be used to limit the number of data packets observed by an RMON probe that are to be captured or counted. Filters are specified using known RMON MIBs and are based on the type of data packet or other packet characteristics associated with the data contained within the data packet. Filters permit the RMON probe to screen observed data packets on the basis of recognition characteristics specified by the filter. Data packets are captured or counted by the RMON probe on the basis of a match (or a failure to match) with the specified recognition characteristics. Filters can be combined using logical "and" and "or" operations to define a more complex filter to be applied to data packets, thus focusing the screen onto a narrower group of data packets. Filters are widely used for network management because of the flexibility for defining the type of data packets to be captured and monitored, thus preferentially limiting the number of data packets captured by an RMON probe to the particular types of data packets of interest. Filters permit the data packets to be sorted by their type or contents, such as by the type of network application being performed, thereby achieving more discrete and meaningful performance statistics than might be achievable without the use of filters.
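The way simple filters combine into a narrower screen can be illustrated with a minimal Python sketch. It does not reproduce the encoding used by the RMON MIBs; the `Packet` fields and the helper names `both`, `either`, `negate` and `count_matches` are assumptions chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Packet:
    """Hypothetical, simplified view of an observed data packet."""
    src: str
    dst: str
    dst_port: int
    protocol: str


# A filter is any predicate that accepts or rejects a packet.
Filter = Callable[[Packet], bool]


def both(a: Filter, b: Filter) -> Filter:
    """Logical "and": the packet must match both filters."""
    return lambda p: a(p) and b(p)


def either(a: Filter, b: Filter) -> Filter:
    """Logical "or": the packet must match at least one filter."""
    return lambda p: a(p) or b(p)


def negate(a: Filter) -> Filter:
    """Select packets on a failure to match."""
    return lambda p: not a(p)


def count_matches(packets: Iterable[Packet], flt: Filter) -> int:
    """Count the observed packets that pass the filter."""
    return sum(1 for p in packets if flt(p))


def is_tcp(p: Packet) -> bool:
    return p.protocol == "tcp"


def to_port(port: int) -> Filter:
    return lambda p: p.dst_port == port


# A more complex filter: TCP packets addressed to port 80 or port 443.
web_filter = both(is_tcp, either(to_port(80), to_port(443)))
```

Applying `count_matches(observed, web_filter)` to a list of observed packets then yields the count for that narrower group, and `negate(web_filter)` selects everything else.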
However, RMON relies on standard and established methods of identification, such as port identification numbers, to identify a network application contained within a data packet. This is problematic in the prior art because a significant portion of network applications do not use a standard and established port identification number, while other network applications use a port identification number that is identical to that of another network application. Thus, in the prior art, it is not possible to identify a network application in a significant number of the instances where it is necessary to do so in order to collect meaningful and helpful performance statistics.
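The limitation of port-based identification can be seen in a small Python sketch. The table below holds three widely known port assignments, and the function name `identify_by_port` is an assumption made for illustration; an actual probe would consult a much larger table.

```python
from typing import Optional

# A few well-known port assignments (illustrative subset only).
WELL_KNOWN_PORTS = {25: "smtp", 80: "http", 443: "https"}


def identify_by_port(port: int) -> Optional[str]:
    """Return the application name registered for a port, or None if unknown."""
    return WELL_KNOWN_PORTS.get(port)
```

A packet sent to an unregistered port such as 50123 yields `None`, so its application cannot be identified. Two different applications that both send to port 80 are both reported as `"http"`, so their statistics cannot be separated.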
In addition, the prior art relies on information included in data packets being transmitted "in the clear"; that is, the information is not encrypted and can be intercepted and read by anyone with access to the network in which the data packet is transmitted. As a means of increasing the security of information transmitted in networks, however, businesses and network services providers are finding it increasingly desirable to encrypt the information contained in the data packet so that it can only be read by an authorized person with access to the encryption process. In the prior art, information in the data packet cannot be deciphered by RMON filters when that information is encrypted. Thus, a disadvantage to the prior art is that when a data packet is encrypted, filters cannot read the data packets, so that data packets cannot be recognized and sorted. Consequently, in the prior art, information that would be useful to the network manager to permit interpretation of performance statistics is not available for encrypted data packets. It is also not possible to correlate request and response packets; consequently, performance times cannot be measured and conformance to the SLA cannot be demonstrated. Finally, with the use of encrypted data packets, in the prior art it is not possible to differentiate a data packet for one network application from a data packet for another network application. This information may be useful to the network manager for interpreting performance statistics, and it is also needed to prioritize applications being run on a network in accordance with the prioritization standards established by the Institute of Electrical and Electronics Engineers (IEEE) and the Internet Engineering Task Force (IETF).
Packet monitoring using probes (as shown in FIG. 1) is also problematic when data switching is used in network 100. Assume a user issues a request data packet (not shown) from client computer system 110c that is routed through communications line 120 to server computer system 130c. RMON probe 140 observes the request data packet, and captures and counts the data packet. Server computer system 130c responds to the request data packet and transmits a response data packet (not shown). However, because of increased traffic on communications line 120, the response data packet is more efficiently routed back to client computer system 110c through communications line 122 and is observed by RMON probe 142. RMON probe 142 captures and counts the data packet.
In the prior art, the RMON probes are only capable of making a count of the number of captured data packets, which provides only a limited measure of the performance of the network. Thus, one drawback to the prior art is that, because of the nature of switched networks, a data packet may take one route from a client computer system to a server computer system and a different route back, and therefore the packets are never correlated because they are counted by two different probes, and each probe operates independently. Hence, in the prior art, a response data packet is not correlated with the request data packet that prompted the response.
For example, the network manager would expect that the number of captured response data packets and captured request data packets would be equal, and if not, this would provide an indication of a potential problem on the network. However, this information only indicates the reliability of the network for carrying data packets, or the reliability of a server computer system to respond to a request, but does not provide a measure of the time it took to respond to the request. Therefore, another drawback to the prior art is that it does not measure performance times such as application response time, application processing time, or protocol latency, because request and response data packets might not be correlated if they are captured by different probes. Thus, in the prior art the network manager or a user does not have the desired information regarding the average and worst-case performance times. Hence, another drawback to the prior art is that the network services provider cannot readily demonstrate compliance with the governing SLA.
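The count comparison described above can be sketched in a few lines of Python. The `CountingProbe` class and the `unanswered_requests` helper are hypothetical names, and the model assumes that each probe keeps nothing but two counters, which is a simplification of an actual RMON probe.

```python
from typing import Iterable


class CountingProbe:
    """Hypothetical probe that only counts the packets it observes."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.requests = 0
        self.responses = 0

    def observe(self, kind: str) -> None:
        """Count one observed packet; kind is "request" or "response"."""
        if kind == "request":
            self.requests += 1
        elif kind == "response":
            self.responses += 1
        else:
            raise ValueError(f"unknown packet kind: {kind!r}")


def unanswered_requests(probes: Iterable[CountingProbe]) -> int:
    """Difference between all counted requests and all counted responses."""
    probes = list(probes)
    return sum(p.requests for p in probes) - sum(p.responses for p in probes)
```

If probe 140 counts a request and probe 142 counts the matching response, the difference is zero, which indicates only that a response was seen. Because no probe records when either packet passed or which request a response belongs to, no response time can be derived from these counters.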
With reference again to FIG. 1, it is possible that, after the response data packet passes RMON probe 142 and is counted by RMON probe 142, a fault on communications line 122 may occur so that the response data packet is not delivered to client computer system 110c. For example, a failure of switch 125 may occur so that the response data packet is not able to complete its journey. However, in the prior art the response data packet may still be counted as a successful transaction. Thus, an additional disadvantage to the prior art is that a fault in the network may not be detected by the network monitoring software, and would only be eventually noticed by the user who did not receive a response to his/her request. Another drawback to the prior art is that a fault in the network may not be noticed in a timely manner. An additional drawback to the prior art is that the accuracy of the performance statistics may be affected by the location of the RMON probes.
One prior art system attempts to address some of the disadvantages identified above by incorporating RMON into routers or switches instead of a probe, and adding a plurality of these components to the network. However, a disadvantage to this prior art system is that the speed at which the component (e.g., a switch) performs its primary function is significantly slowed by the addition of the network monitoring function, because of the complexity of RMON MIBs. In addition, another drawback to this prior art system is that the cost of the component such as a switch is substantially increased by the incorporation of the RMON facilities. This prior art system also does not address the other disadvantages identified above, such as the inability to measure performance times and demonstrate compliance with SLAs in a switched communication system.
Accordingly, a need exists for a method to monitor a computer system communication network that readily and quickly detects and identifies a degradation of the network. A need further exists for a method that accomplishes the above and enables the network manager to demonstrate compliance with the provisions of the governing SLA. A need yet exists for a method that accomplishes the above and also provides an accurate measure of the network performance as well as its reliability. A need exists yet further for a method that accomplishes the above and can determine application information for network applications that do not use established means for identification or that are transmitted in encrypted data packets. Finally, a need exists for a method that accomplishes the above without interfering with the processing of a network application. The present invention solves these needs. These and other objects and advantages of the present invention will become obvious to those of ordinary skill in the art after having read the following detailed description of the preferred embodiments which are illustrated in the various drawing figures.
The present invention provides a method to monitor a computer system communication network that readily and quickly detects and identifies a degradation of the network. The present invention also provides a method that accomplishes the above and enables the network manager to demonstrate compliance with the provisions of the governing service level agreement (SLA). The present invention further provides a method that accomplishes the above and also provides an accurate measure of the network performance as well as its reliability. Finally, the present invention provides a method that accomplishes the above and can determine application information for network applications that do not use established means for identification or that are transmitted in encrypted data packets.
The present invention described herein provides a method for quantifying performance of a communication network having computer systems communicatively coupled to each other with communication equipment, where the computer systems are executing network applications that send and receive either encrypted or unencrypted data packets over the communication network. In one embodiment, between an application program interface and a protocol stack in a computer system, where the application program interface resides in an application layer of the computer system and the protocol stack resides in a kernel layer of the computer system, the present invention executes a process for identifying a network application, where the network application originates a request data packet and a response data packet. Second, in this embodiment the present invention records time-stamps when the request data packet and the response data packet are between the application program interface and the protocol stack in the computer system. Third, in this embodiment the present invention computes a difference between a first time-stamp and a second time-stamp. In this embodiment, the present invention next calculates performance statistics measured on the difference and stores the performance statistics in a memory unit of the computer system, where the memory unit is read-accessible and write-accessible from both the application layer and the kernel layer of the computer system. Finally, in this embodiment of the present invention, the computer system reports the performance statistics to a central computer system.
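The time-stamp, difference and statistics steps of this embodiment can be illustrated with a minimal Python sketch. It is not the claimed implementation: the class names, the use of a transaction identifier to pair a response with its request, and the choice of average and worst-case as the statistics are assumptions made for illustration, and the memory unit shared between the application layer and the kernel layer is not modeled.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, Optional, Tuple


@dataclass
class ResponseTimeStats:
    """Running statistics for one network application."""
    count: int = 0
    total: float = 0.0
    worst: float = 0.0

    def add(self, elapsed: float) -> None:
        self.count += 1
        self.total += elapsed
        self.worst = max(self.worst, elapsed)

    @property
    def average(self) -> float:
        return self.total / self.count if self.count else 0.0


class TransactionTimer:
    """Time-stamps requests and responses at a single observation point."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # (application, transaction id) -> time-stamp of the request
        self._pending: Dict[Tuple[str, Hashable], float] = {}
        self.stats: Dict[str, ResponseTimeStats] = {}

    def on_request(self, app: str, transaction_id: Hashable) -> None:
        """Record the first time-stamp when a request passes the point."""
        self._pending[(app, transaction_id)] = self._clock()

    def on_response(self, app: str, transaction_id: Hashable) -> Optional[float]:
        """Record the second time-stamp and accumulate the difference."""
        start = self._pending.pop((app, transaction_id), None)
        if start is None:
            return None  # response without a recorded request
        elapsed = self._clock() - start
        self.stats.setdefault(app, ResponseTimeStats()).add(elapsed)
        return elapsed
```

Because both time-stamps are taken on the same computer system, the request and the response are paired regardless of which route each took through the network. The accumulated `stats` mapping is what would be reported to the central computer system.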
In another embodiment, the present invention determines application information corresponding to a network application. In this embodiment, the present invention reads data contained in a data packet, where the data corresponds to the network application and identifies an executable memory unit location where the application information is stored. The present invention then reads the application information from the executable memory unit location.
In still another embodiment, the present invention applies an encryption process between the application program interface and the protocol stack in a computer system to define encryption recognition characteristics corresponding to an unencrypted data packet. The present invention stores the encryption recognition characteristics in a memory unit within the computer system. The present invention then compares the encryption recognition characteristics to an encrypted data packet, thereby correlating the encrypted data packet to the unencrypted data packet.
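One way to picture this correlation is the Python sketch below. It makes several assumptions that are not stated in the text: the recognition characteristic is taken to be a SHA-256 digest of the encrypted form of the packet, the encryption process is assumed to be deterministic, and `toy_encrypt` is an insecure XOR stand-in used only so the example runs with the standard library.

```python
import hashlib
from typing import Callable, Dict, Optional


def toy_encrypt(plaintext: bytes, key: bytes) -> bytes:
    """Insecure XOR stand-in for the real encryption process (illustration only)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(plaintext))


class EncryptionRecognitionTable:
    """Maps recognition characteristics of encrypted packets to known packets."""

    def __init__(self, encrypt: Callable[[bytes], bytes]) -> None:
        self._encrypt = encrypt
        # recognition characteristic -> label of the unencrypted packet
        self._table: Dict[bytes, str] = {}

    def register(self, plaintext: bytes, label: str) -> None:
        """Derive and store the characteristic while the packet is still unencrypted."""
        fingerprint = hashlib.sha256(self._encrypt(plaintext)).digest()
        self._table[fingerprint] = label

    def correlate(self, ciphertext: bytes) -> Optional[str]:
        """Return the label of the matching unencrypted packet, if any."""
        return self._table.get(hashlib.sha256(ciphertext).digest())
```

For example, after `table.register(b"GET /report", "request-1")`, an encrypted data packet carrying the same payload is mapped back to `"request-1"` by `table.correlate(...)` without being decrypted. An encryption process that adds a random value to each packet would need a different recognition characteristic.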