In conventional network computing environments, a number of devices are used to interconnect computing systems to efficiently transfer data over the network. In large-scale implementations, hundreds or thousands of network devices are deployed to interconnect the computing systems.
Troubleshooting a disruption in a large-scale and complex system can be difficult. For example, a host may experience connectivity issues, or the flow of traffic across a segment of the network may be slow. There may be many different possible causes of these and other network disruptions, and discovering the root cause can be an arduous task. The troubleshooting process becomes increasingly intractable and time-consuming as the systems become larger and more complex.
Specialized computer systems such as network management systems are dedicated to monitoring the status of network devices and the health of the network as a whole, and the information they gather may be used for troubleshooting as network disruptions arise. A network management system, which is a system attached to the network, gathers information about the topology of the network, the operational status of network devices and the interconnections among them, and performance statistics of various segments of the network, and attempts to identify potential trouble spots in the network.
The network management system typically gathers this information by periodically polling network devices in the network. In large-scale network implementations, polling and monitoring of every device often requires a significant portion of network bandwidth and can cause inefficiencies in the network.
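To give a sense of scale for the polling overhead described above, the following minimal sketch estimates the average traffic generated when every device is polled once per interval. The function names, the per-poll byte count, and the link capacity are hypothetical values chosen for illustration only; they are not taken from any particular network management system.

```python
def polling_load_bps(num_devices, bytes_per_poll, interval_seconds):
    """Average polling traffic in bits per second.

    Assumes every device is polled exactly once per interval and that
    bytes_per_poll covers both the request and the response (an
    illustrative simplification).
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return num_devices * bytes_per_poll * 8 / interval_seconds


def polling_share(num_devices, bytes_per_poll, interval_seconds, link_capacity_bps):
    """Fraction of a link's capacity consumed by polling traffic."""
    if link_capacity_bps <= 0:
        raise ValueError("link_capacity_bps must be positive")
    load = polling_load_bps(num_devices, bytes_per_poll, interval_seconds)
    return load / link_capacity_bps


if __name__ == "__main__":
    # Hypothetical figures: 1,000 devices, 1,500 bytes exchanged per poll,
    # one poll per minute, over a 10 Mbit/s management link.
    load = polling_load_bps(1000, 1500, 60)
    share = polling_share(1000, 1500, 60, 10_000_000)
    print(f"polling load: {load:.0f} bit/s, share of link: {share:.1%}")
```

Under this simple model the load grows linearly with the number of devices and inversely with the polling interval, so doubling the device count or halving the interval doubles the bandwidth consumed.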