The terms “downtime” and “network outage” refer to periods when a communication system is unavailable. Downtime, or outage duration, is the period of time during which a system fails to provide or perform its functions. Causes of a network outage include failures in various network components, such as hardware failures (e.g., servers and other physical equipment), software failures (e.g., logic controlling equipment), interconnecting equipment failures (e.g., cables and routers), wireless transmission failures (e.g., antennas and satellites), and capacity failures (e.g., exceeding system limits).
Typically, it is the responsibility of the network designers to ensure that a network outage does not happen. However, if a network outage does occur, a network monitoring system may reduce the effects of the outage by detecting the outage and restoring the network as quickly as possible. Restoring the network generally requires the involvement of several individuals and teams, including technical engineers, management personnel, and executives.
Within the field of telecommunications, mission-critical applications, interfaces, middleware components, and downstream systems are continually changing. These changes make it increasingly difficult for engineers and support team members to keep an up-to-date technical picture and understanding of the related components involved, both in the early stages of an outage and throughout critical triage activities. At a time when minutes equal millions, communication and collaboration among telecommunication personnel during a network outage are antiquated and inefficient. Currently, a critical knowledge and communication gap exists between interested parties (e.g., engineers and executives) because they lack a complete picture of precisely what is occurring during the outage and of the impact the outage creates.