Networks are used to interconnect multiple devices, such as computing devices, and allow the communication of information between the various interconnected devices. Many organizations rely on networks to communicate information between different individuals, departments, work groups, and geographic locations. In many organizations, a network is an important resource that must operate efficiently. For example, networks are used to communicate electronic mail (e-mail), share information between individuals, and provide access to shared resources, such as printers, servers, and databases. A network failure or inefficient operation may significantly affect the ability of certain individuals or groups to perform their required functions.
A typical network contains multiple interconnected devices, including computers, servers, printers, and various other network communication devices such as routers, bridges, switches, and hubs. The multiple devices in a network are interconnected with multiple communication links that allow the various network devices to communicate with one another. If a particular network device or network communication link fails, multiple devices, or the entire network, may be affected.
Network management is the process of managing the various network devices and network communication links to provide the necessary network services to the users of the network. Typical network management systems collect information regarding the operation and performance of the network and analyze the collected information to detect problems in the network. For example, high network utilization or a high network response time may indicate that the network (or a particular device or link in the network) is approaching an overloaded condition. In an overloaded condition, network devices may be unable to communicate at a reasonable speed, thereby reducing the usefulness of the network. In this situation, it is important to identify the network problem and the source of the problem so that proper network operation can be restored.
Typically, existing network management systems compare the current network performance parameters to one or more threshold values associated with the parameters. For example, an upper threshold of 90 percent may be associated with a network utilization parameter, such that an alarm is generated if the network utilization exceeds 90 percent. A network utilization in excess of 90 percent may indicate an approaching overload condition. Similar thresholds may be provided for other network parameters such as response time or number of errors. Generally, only an upper threshold is provided (e.g., network utilization above a particular threshold).
Typically, thresholds used by existing systems are absolute, such that an alarm is generated each time a threshold is crossed. For example, if a network utilization parameter has an associated upper threshold of 90 percent, an alarm is generated if the network utilization is 91 percent for a specified period of time. However, an alarm is not generated if the network utilization remains at exactly 90 percent for a long period of time, since the threshold is not exceeded. Also, if network utilization drops to five percent because a periodic backup process was not activated, an alarm is not generated, again because the threshold associated with network utilization was not exceeded.
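The behavior of such an absolute upper threshold can be sketched as follows. This is a minimal illustration only, not the implementation of any particular management system; the function name `static_threshold_alarm` and the constant `UPPER_THRESHOLD_PERCENT` are hypothetical, and the specified time period mentioned above is omitted for brevity.

```python
# Hypothetical fixed upper threshold, taken from the 90 percent example above.
UPPER_THRESHOLD_PERCENT = 90.0


def static_threshold_alarm(utilization_percent, upper=UPPER_THRESHOLD_PERCENT):
    """Return True only when utilization strictly exceeds the fixed threshold."""
    return utilization_percent > upper


# The three cases discussed in the text.
samples = {
    "91 percent": 91.0,                  # exceeds the threshold
    "90 percent": 90.0,                  # sits at the threshold, never exceeds it
    "5 percent (missed backup)": 5.0,    # abnormally low, but below the threshold
}

results = {label: static_threshold_alarm(value) for label, value in samples.items()}

for label, alarm in results.items():
    print(f"{label}: alarm={alarm}")
```

Only the 91 percent sample raises an alarm in this sketch. The sustained 90 percent load and the unexpectedly idle network both pass unnoticed because the check has no lower bound and no notion of what is normal.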
Existing systems that use thresholds to identify network problems determine the threshold values (e.g., upper limits) based on how the network administrator believes the network should operate. Since these thresholds are typically static, they do not change automatically with changes in the network operation or network configuration. Instead, the network administrator must recalculate (or re-estimate) threshold values manually when a network change occurs. Typically, a single set of threshold values is used for all time periods. Thus, the same thresholds are applied during periods of heavy network utilization (e.g., 2:00 p.m. on a business day) and during periods of minimal network utilization (e.g., 10:00 p.m. on a holiday), regardless of the expected or historical network utilization.
For example, if a significant increase in network utilization occurs every Monday at 9:00 a.m. and crosses the upper threshold, an alarm may be generated every Monday even though this is a common event that does not necessarily indicate a network problem. Similarly, if a significant increase in network utilization occurs (without crossing the upper threshold) at a time when the network utilization is typically minimal, an alarm is not generated even though a network problem may exist. Thus, existing systems do not consider typical or historical network operation when determining whether a network problem exists.
It is therefore desirable to provide a network-related monitoring system that detects problems or potential problems in a network environment by comparing recent network operation with historical network operation.
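The contrast between a static threshold and a comparison against historical operation can be illustrated with a short sketch. Everything here is assumed for illustration: the utilization figures, the time-slot keys, the 20-point tolerance, and the function names `static_alarm` and `deviates_from_history` are hypothetical, and the simple mean-based comparison is only one possible way to relate recent operation to historical operation, not the method of any particular system.

```python
from statistics import mean

UPPER_THRESHOLD_PERCENT = 90.0  # hypothetical static upper threshold

# Hypothetical historical utilization (percent) per time slot.
history = {
    ("Monday", "9:00 a.m."): [93.0, 95.0, 94.0, 96.0],   # routinely busy
    ("Holiday", "10:00 p.m."): [4.0, 5.0, 6.0, 5.0],     # routinely quiet
}

# Hypothetical current readings for the same slots.
current = {
    ("Monday", "9:00 a.m."): 95.0,    # the usual Monday peak
    ("Holiday", "10:00 p.m."): 60.0,  # unusual load at a quiet time
}


def static_alarm(utilization_percent):
    """Static check: alarm only when the fixed upper threshold is exceeded."""
    return utilization_percent > UPPER_THRESHOLD_PERCENT


def deviates_from_history(slot, utilization_percent, tolerance=20.0):
    """Assumed check: alarm when the reading differs from the slot's
    historical mean by more than `tolerance` percentage points."""
    typical = mean(history[slot])
    return abs(utilization_percent - typical) > tolerance


comparison = {
    slot: (static_alarm(value), deviates_from_history(slot, value))
    for slot, value in current.items()
}

for slot, (static_result, historical_result) in comparison.items():
    print(slot, "static:", static_result, "historical:", historical_result)
```

With these assumed figures, the static check raises an alarm for the routine Monday peak and stays silent for the unusual holiday load, while the history-based check does the opposite.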