Various embodiments of the present invention relate to methods and apparatus for determining one or more common factors or one or more causes which explain one or more threshold notifications in a data communications network.
Network traffic monitoring is a vital part of network management. A network typically comprises two or more devices which are connected together using some form of communication medium. For example, in a computer network, two or more processing nodes may be connected to, either wirelessly or by wire(s), one or more network devices such as routers or switches which are in turn connected to other network devices. These devices and nodes may also be connected to some non-processing devices such as network printers, facsimile machines, or other resources to share these devices and resources. As another example, a telecommunication network may comprise two or more telecommunications links and nodes such that one or more messages may be transmitted from one part of the telecommunication network to another through one or more links and nodes. In fact, in a typical network, whether computer network, telecommunications network, or other types of network, most activities would produce some network traffic.
On the other hand, a network constitutes a resource which is shared among the nodes. Such a resource normally has bandwidth limitations on the amount of information that may be transmitted at any given instant in time. That is, the more of the network resource that one or more nodes utilize at a given instant or during a period of time, the less of that shared resource is available to the other nodes on the same network at the same instant or during the same period. In other words, these other nodes may be adversely affected if the network is overloaded with existing network traffic.
As a result, monitoring network traffic provides important information for the smooth operation of a network. Monitoring network traffic may also be essential for operating cost allocation, network capacity planning, fault detection and isolation, security management, or service quality analysis.
A common practice for monitoring network traffic is to maintain one or more counters which relate to the amount of information being transmitted across the entire network. More particularly, the network may employ methods or apparatus to measure the amount of activity on each link, through each node or device, or across the entire network. For example, the methods or apparatus may periodically sample the one or more counters and determine the difference or change in each counter. Such differences or changes may be configured or defined to indicate the amount of network traffic during the sampling period. A typical sampling process may be a time-based sampling process, which has been proven to be less accurate than packet-based sampling. After the sampling process, some current approaches may proceed further to determine one or more thresholds for each of the one or more counters. These thresholds define the levels of network traffic beyond which performance issues or other negative impacts may occur. These approaches may then generate a notification once it is determined that certain thresholds have been exceeded.
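The counter-based monitoring described above can be illustrated with a minimal sketch. It is an illustration only, not the implementation of any particular monitoring system: the names CounterSample, traffic_rate, and check_threshold are hypothetical, and the 32-bit counter width and the handling of a single counter wrap are assumptions not stated in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CounterSample:
    """One periodic reading of a cumulative traffic counter (hypothetical type)."""
    timestamp: float  # seconds
    octets: int       # cumulative byte count reported by the device


def traffic_rate(prev: CounterSample, curr: CounterSample,
                 counter_bits: int = 32) -> float:
    """Return bytes per second between two samples.

    Assumption: the counter is an unsigned counter of `counter_bits` bits
    that wraps at most once between consecutive samples.
    """
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    # The modulo makes the difference correct across a single wrap.
    delta = (curr.octets - prev.octets) % (1 << counter_bits)
    return delta / elapsed


def check_threshold(link: str, rate: float, threshold: float) -> Optional[str]:
    """Return a notification string if the rate exceeds the threshold, else None."""
    if rate > threshold:
        return f"threshold exceeded on {link}: {rate:.0f} B/s > {threshold:.0f} B/s"
    return None
```

In this sketch a notification says only that a counter crossed its threshold on a given link. It carries no information about which traffic caused the crossing, which is the gap the remainder of this background discusses.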
In the past, a large number of nodes might be connected to a single network with shared resources. In this type of network, a single device connected to the network may be sufficient to monitor all the traffic. Nowadays, this may not be the case, as networking has become more complicated. For example, a computer network may contain several network segments, each of which constitutes a portion of the network wherein every device communicates using the same physical layer. In such a computer network, nodes or devices operating at layer two (the data link layer) or higher layers create new physical layers and thus additional network segments. In such a network, responding to threshold notifications may be quite challenging.
Network management systems or administrators often identify the causes or factors which contribute to network activities. For example, the network management system may identify these causes or factors by obtaining information about packets transmitted in the computer network in any of numerous ways. Commonly, a network probe may be attached to the computer network to monitor packets transmitted across it. Alternatively, network elements such as, but not limited to, wireless access points, switches, routers, and hosts may be used to monitor packets transmitted through these network elements and to report on the traffic with technologies such as sFlow, NetFlow, IPFIX, or RMON.
For example, in order to identify a cause of such a threshold notification, which often corresponds to a network violation, it may be required to examine the communications or transmitted information which traverses the adversely affected network resources. In a modern computer network containing multiple network devices, the network may generate simultaneous threshold notifications. Although some of the threshold notifications may be initiated by causes totally independent of each other, other threshold notifications may nonetheless be related, which makes identifying the cause of any one threshold notification even more difficult.
Currently, identifying and analyzing the causes of the threshold notifications often requires a manual analysis of the traffic through each of the nodes or devices along the communication path which leads to the generation of threshold notifications. Such an approach not only relies heavily on the experience and expertise of network administrators or whoever is responsible for monitoring the network traffic but also runs the inherent risk of misidentifying the causes of such threshold notifications.
The disclosure in U.S. patent application Ser. No. 11/846,357, which is entitled “METHOD, SYSTEM, AND COMPUTER PROGRAM PRODUCT FOR IDENTIFYING COMMON FACTORS ASSOCIATED WITH NETWORK THRESHOLD VIOLATIONS”, provides various methods and an apparatus to efficiently identify the causes or factors of excessive network activities such that prompt control actions may be taken to mitigate the adverse effects on network performance. The aforementioned related application further provides periodic updates of counters and of information about the data being transmitted across the entire network to aid the identification of factors or causes for such threshold notifications.
Nonetheless, determining common factors in a typical network may require large amounts of computational resources such as memory or processing time. Often, there may be millions of packets or more, with many different combinations of factors, transmitted in a moderately sized computer network at a given instant in time. Creating and handling records for all the combinations of factors present in the network traffic, so as to calculate the relative importance of each factor's contribution to the network traffic, is often impractical or even prohibitive.
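The cost of keeping a record for every combination of factors can be seen in a small sketch of the exhaustive approach. This only illustrates the problem stated above, not the method of any embodiment or of the referenced application. The function name count_factor_combinations and the factor names used in the comments ("src", "dst", "proto") are hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_factor_combinations(packets):
    """Accumulate bytes for every non-empty combination of factors per packet.

    `packets` is an iterable of (factors, size) pairs, where `factors` is a
    dict such as {"src": ..., "dst": ..., "proto": ...} (hypothetical names)
    and `size` is the packet size in bytes.
    """
    totals = Counter()
    for factors, size in packets:
        # Sort so the same combination always produces the same key.
        items = sorted(factors.items())
        # A packet with k factors touches 2**k - 1 records.
        for r in range(1, len(items) + 1):
            for combo in combinations(items, r):
                totals[combo] += size
    return totals
```

With only three factors per packet, every packet updates seven records. Each additional factor roughly doubles that number, and every distinct factor value seen in the traffic adds records of its own, so the table grows quickly as traffic becomes more varied.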
U.S. Pat. No. 5,646,956 issued to Pinna (hereinafter Pinna) discloses a method for calculating the top contributors to a single factor. Nonetheless, Pinna becomes less effective as the number of network activity entries in the table grows, since Pinna's method traverses the long list of entries with each table update. Thus, there exists a need for a more effective method and apparatus to determine common factors contributing to network activities.
As such, it is an objective of various embodiments of the present invention to provide a method and an apparatus to determine one or more common factors with reduced utilization of computational resources.