1. Field of the Invention
The present invention is directed to the management of computer networks, and more specifically to a method and system for determining availability of devices and paths within a computer network in order to determine and report on availability of the network.
2. Description of Related Art
Businesses and academic organizations throughout the world are now highly dependent on the operation of their computer networks, and they make large investments of time and money in setting up and maintaining them. MIS directors and system administrators must understand the workings of these networks and, in particular, must be able to determine network "availability," i.e., whether the network has been available to its users and whether it has been running efficiently. To do so, it must be determined which measurements can be used to establish whether the network has been available. It is also necessary to keep track of network unavailability in order to determine how much time has been lost because employees were prevented from doing their jobs. Finally, those components which impact network availability must be identified and their problems addressed.
In the past, network availability has been defined in different ways depending upon the focus of the persons doing the measuring and recording. For example, one purpose of availability measurement is to provide an early warning of a potential disaster, in which case network availability is defined as the ratio of MTBF (Mean Time Between Failures) to total time, where the total time equals MTBF plus MTTR (Mean Time To Repair).
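The MTBF-based definition above can be written out as a minimal Python sketch. The function name `availability_from_mtbf`, the choice of hours as the unit, and the sample figures are assumptions made for illustration and do not come from any particular prior system.

```python
def availability_from_mtbf(mtbf_hours: float, mttr_hours: float) -> float:
    """Return availability as MTBF / (MTBF + MTTR).

    Both arguments are assumed to be expressed in the same unit (hours here).
    """
    if mtbf_hours < 0 or mttr_hours < 0:
        raise ValueError("MTBF and MTTR must be non-negative")
    total_time = mtbf_hours + mttr_hours
    if total_time == 0:
        raise ValueError("MTBF + MTTR must be greater than zero")
    return mtbf_hours / total_time


# Hypothetical figures: 990 hours between failures, 10 hours to repair.
example_availability = availability_from_mtbf(990.0, 10.0)
print(f"{example_availability:.2%}")
```

With these hypothetical figures the total time is 1000 hours, so the availability is 990/1000, or 99%.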
In a communications network, device availability may be defined as the ratio of the time the device was available to the total time under evaluation. This definition gives no consideration to the users or the functions of the devices. For example, an inoperative wide-area link between two sites has no effect unless users at one site are trying to access a facility at the other site. In other words, there may be instances where a device is inoperative but has little or no impact on network operations because it is not being used or is only lightly used.
It has been suggested that the availabilities of devices connected to each other can be aggregated, i.e., that the aggregate availability is the product of the availabilities of all devices in a particular path. This approach, however, has two flaws when applied to historical data. First, when considering a single path between two devices, the calculation provides the most pessimistic number for the availability of the path rather than the actual one, because the method does not account for any overlap in the down times of the devices under consideration. Using the most pessimistic number may mislead network managers about the actual functional availability of their networks. Second, the approach fails to address the situation where there are multiple paths between devices in the communications network.
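The first flaw can be illustrated with a short Python sketch. The representation of down times as (start, end) intervals, the helper names (`merge_intervals`, `device_availability`, `product_availability`, `path_availability`), and the 100-hour evaluation period are hypothetical and chosen for illustration only. The sketch assumes a single serial path that is down whenever any device on it is down.

```python
from math import prod


def merge_intervals(intervals):
    """Merge overlapping (start, end) intervals into a sorted, disjoint list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def device_availability(downtimes, total):
    """Fraction of the evaluation period during which one device was up."""
    down = sum(end - start for start, end in merge_intervals(downtimes))
    return (total - down) / total


def product_availability(per_device_downtimes, total):
    """Aggregate by multiplying the individual device availabilities."""
    return prod(device_availability(d, total) for d in per_device_downtimes)


def path_availability(per_device_downtimes, total):
    """Aggregate from the union of down intervals, so overlap is counted once."""
    all_down = [iv for d in per_device_downtimes for iv in d]
    down = sum(end - start for start, end in merge_intervals(all_down))
    return (total - down) / total


TOTAL_HOURS = 100.0
device_a = [(10.0, 20.0)]  # hypothetical outage, hours 10 to 20
device_b = [(10.0, 20.0)]  # down during exactly the same hours

naive = product_availability([device_a, device_b], TOTAL_HOURS)
actual = path_availability([device_a, device_b], TOTAL_HOURS)
print(f"product: {naive:.2f}, overlap-aware: {actual:.2f}")
```

In this hypothetical case both devices are down during the same ten hours, so the path is actually unavailable for ten hours (availability 0.90), while the product of the device availabilities gives 0.81.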
As a result, there is a need for a system and method that provide a representation of network availability that is more comprehensive and realistic than known methods and that account for overlap between device down times or path down times when aggregating the availabilities for a network link. Further, a system is needed which takes into account network topology and any changes thereto.