A computer network is a collection of interconnected computing devices that can exchange data and share resources. In a packet-based network, such as the Internet, the computing devices communicate data by dividing the data into small blocks called packets, which are individually routed across the network from a source device to a destination device. The destination device extracts the data from the packets and assembles the data into its original form. Dividing the data into packets enables the source device to resend only those individual packets that may be lost during transmission.
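The packetization and reassembly described above can be sketched as follows. This is a minimal illustration, not an implementation of any particular protocol: the function names, the sequence-number scheme, and the 8-byte packet size are all assumptions made for the example.

```python
def packetize(data: bytes, size: int) -> list[tuple[int, bytes]]:
    """Split data into (sequence_number, payload) packets of at most `size` bytes."""
    return [(seq, data[off:off + size])
            for seq, off in enumerate(range(0, len(data), size))]


def missing_packets(received, total):
    """Return the sequence numbers that never arrived, in ascending order."""
    seen = {seq for seq, _ in received}
    return [seq for seq in range(total) if seq not in seen]


def reassemble(received):
    """Order packets by sequence number and join their payloads."""
    ordered = sorted(received, key=lambda packet: packet[0])
    return b"".join(payload for _, payload in ordered)


message = b"packets are routed individually"
sent = packetize(message, size=8)

# Simulate a network that drops packet 2 and delivers the rest out of order.
arrived = [packet for packet in reversed(sent) if packet[0] != 2]

# The source resends only the lost packets, not the whole message.
lost = missing_packets(arrived, total=len(sent))
arrived.extend(sent[seq] for seq in lost)

restored = reassemble(arrived)
```

Because each packet carries its own sequence number, the destination can detect exactly which packets are missing and restore the original order regardless of arrival order.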
Certain network devices, such as routers, maintain tables of routing information that describe routes through the network. A “route” can generally be defined as a path between two locations on the network. Upon receiving an incoming data packet, the router examines destination information within the packet to identify the destination for the packet. Based on the destination, the router forwards the packet in accordance with the routing table.
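A routing-table lookup of this kind might look like the sketch below. It assumes longest-prefix matching, a common selection rule that the text above does not itself specify; the table contents and next-hop names are hypothetical.

```python
import ipaddress

# Hypothetical routing table: destination prefix -> next hop (illustrative names).
ROUTES = {
    "0.0.0.0/0": "if-default",
    "10.0.0.0/8": "if-a",
    "10.1.0.0/16": "if-b",
}


def next_hop(destination, routes):
    """Return the next hop for the most specific prefix containing `destination`."""
    address = ipaddress.ip_address(destination)
    best_network, best_hop = None, None
    for prefix, hop in routes.items():
        network = ipaddress.ip_network(prefix)
        if address in network and (
            best_network is None or network.prefixlen > best_network.prefixlen
        ):
            best_network, best_hop = network, hop
    return best_hop  # None when no route matches


hop_specific = next_hop("10.1.2.3", ROUTES)    # matches /0, /8, and /16
hop_general = next_hop("10.2.0.1", ROUTES)     # matches /0 and /8
hop_default = next_hop("192.0.2.1", ROUTES)    # matches only the default route
```

A linear scan keeps the example short; production routers typically use tries or dedicated hardware for the same lookup.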
Routers and other network devices can exhibit faults and fail to operate properly for various reasons, including, for example, environmental factors, network attacks, or component failures. When a router exhibits a fault, it can provide useful information, such as operational and configuration information, to a network administrator or other user. For instance, the router may store a system log that contains fault information that can assist the user in detecting faults. In many cases, however, the system log is difficult to parse, as it generally does not prioritize or organize entries by type of fault condition. Rather, the system log is typically organized chronologically, and the way in which it records fault information can appear somewhat arbitrary.
One fault condition of particular interest to a user is known as “flapping.” Flapping occurs when a router enters a fault condition, temporarily ceases communication with the peer routers that rely on it for routes, and then resumes exchanging information with them. When a router restarts after a fault condition, it exchanges large amounts of information with its peer routers. This information may include configuration and security information, as well as routing table information. After a network fault, routing tables in large networks may take a long time to converge to stable routing information because of temporary oscillations in that information.
These oscillations in routing information, i.e., changes that occur within the routing tables until they converge to reflect the current network topology, are often referred to as “flaps.” Flaps can cause significant problems, including intermittent loss of network connectivity as well as increased packet loss and latency. When flapping occurs, it is often informative to analyze communication between the routers, e.g., via the border gateway protocol (BGP), to determine which routers are causing the flapping and the conditions under which the flapping is occurring. Because the contents of the system log are not easily analyzed, however, the root cause of a flapping episode, and the necessary remedial action, can be difficult to identify.
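One way to picture this kind of analysis is to count how often each route changes state per peer. The sketch below is illustrative only: the event tuples, the "announce"/"withdraw" labels, and the threshold are assumptions, and the code does not parse any real router log or BGP message format.

```python
from collections import Counter


def count_flaps(events):
    """Count state changes per (peer, prefix) from (time, peer, prefix, action) events."""
    last_action = {}
    flaps = Counter()
    for _time, peer, prefix, action in sorted(events):
        key = (peer, prefix)
        # A flap is counted each time a route switches between announced and withdrawn.
        if key in last_action and last_action[key] != action:
            flaps[key] += 1
        last_action[key] = action
    return flaps


def flapping_routes(events, threshold):
    """Return the (peer, prefix) pairs whose flap count meets the threshold."""
    counts = count_flaps(events)
    return sorted(key for key, count in counts.items() if count >= threshold)


# Hypothetical events, deliberately out of chronological order.
sample_events = [
    (3, "peer-a", "10.0.0.0/8", "announce"),
    (1, "peer-a", "10.0.0.0/8", "announce"),
    (2, "peer-a", "10.0.0.0/8", "withdraw"),
    (4, "peer-b", "192.0.2.0/24", "announce"),
]

flap_counts = count_flaps(sample_events)
suspects = flapping_routes(sample_events, threshold=2)
```

Grouping events by peer and prefix, rather than reading them chronologically, points directly at the routers and routes that are oscillating.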