In packet-switched networks, a router is a network device or, in some cases, software in a computer, that determines the next network point to which a packet should be forwarded toward its destination. The router is connected to at least two networks and decides which way to send each information packet based on its current understanding of the state of the networks it is connected to. A router is located at any gateway where one network meets another and is often included as part of a network switch.
Typically, packets are transported through a router by hardware and software operating in a data plane, which is in turn controlled by hardware and software operating in a control plane. In general, the control plane includes the hardware and software that handle the non-wire-speed functions and data required to operate a network device or network. These functions include connection setup and teardown as well as operations, administration, and management. In general, the data plane includes the hardware and software that handle the classification, modification, scheduling, and transmission of wire-speed application data. The control and data planes may be combined into a single processing plane. In addition, the processing plane may include the router's switch fabric.
To improve availability, a router may be equipped with redundant (i.e., two) control, data, or processing planes. A first control plane, for example, is designated as the active control plane and a second control plane is designated as the inactive control plane. In the event that a device in the active control plane fails, the inactive control plane takes over to reduce downtime and hence maintain availability of the router. In such a case, activity is said to switch from the active control plane to the inactive control plane; that is, the two planes exchange roles. Routers and other network devices having redundant systems (e.g., redundant control or data plane devices) are often referred to as “high availability” systems. Thus, a typical high availability router may have two main processing cards that run the same software and perform the same operations. If one card fails in the field, the other card takes over in order to keep the router up and running. Such a router is highly available because the card redundancy ensures that the router is almost always operable.
As noted above, in a redundant or high availability system the two redundant control planes or cards typically run the same software. Even if both control plane cards are running, the system is still a single system, and therefore only one control card can configure and operate it. This card is the active card. The other card, the inactive card, remains in a standby mode, monitoring what is going on within the system. If the active card fails, the inactive card takes over and becomes the active card. This is an activity switch. An activity switch can occur due to a failure of the active card, but it can also be triggered by removing the active card from the system, for example to perform an upgrade. An activity switch may also be initiated by entering a software command.
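The activity switch described above can be illustrated with a minimal sketch. The class and method names here (ControlCard, RedundantSystem, lose_card, operator_switch) are hypothetical and chosen only for illustration; they do not correspond to any particular router's software. The sketch assumes exactly two cards and models the three triggers mentioned: failure, removal, and a software command.

```python
from enum import Enum


class Role(Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"


class ControlCard:
    """Hypothetical model of one control plane card."""

    def __init__(self, name, role):
        self.name = name
        self.role = role
        self.present = True  # becomes False once the card fails or is removed


class RedundantSystem:
    """Two cards, exactly one of which is active at any time."""

    def __init__(self):
        self.cards = [
            ControlCard("A", Role.ACTIVE),
            ControlCard("B", Role.INACTIVE),
        ]
        self.switch_log = []  # reasons for each activity switch, in order

    def active(self):
        return next(c for c in self.cards if c.role is Role.ACTIVE)

    def inactive(self):
        return next(c for c in self.cards if c.role is Role.INACTIVE)

    def activity_switch(self, reason):
        """Exchange roles between the two cards and record the trigger."""
        old, new = self.active(), self.inactive()
        if not new.present:
            raise RuntimeError("no standby card available to take over")
        old.role, new.role = Role.INACTIVE, Role.ACTIVE
        self.switch_log.append(reason)

    def lose_card(self, card, reason):
        """Handle a card failure or removal; switch only if it was active."""
        card.present = False
        if card.role is Role.ACTIVE:
            self.activity_switch(reason)

    def operator_switch(self):
        """Activity switch requested by a software command."""
        self.activity_switch("command")


if __name__ == "__main__":
    system = RedundantSystem()
    system.operator_switch()                     # command-triggered switch
    system.lose_card(system.active(), "failure") # failure-triggered switch
    print(system.active().name, system.switch_log)
```

In this sketch, losing the inactive card does not cause an activity switch, and a switch is refused when no standby card is present, since there would be nothing to take over.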
When designing a redundant system, the use of parts or components that were not originally designed for redundancy may be required. Such parts or components may be referred to as non-redundant parts or components. This requirement may be due to a number of reasons which may include availability and cost advantages of the non-redundant parts. However, one problem with using parts that were not designed for redundancy is that such parts may not behave properly or as expected during activity switches. For example, non-redundant parts may not be able to handle the corrupted data that they will typically receive during an activity switch. As such, the use of non-redundant parts may result in unexpected behaviour leading to catastrophic events such as device lockups and unpredictable data loss. Avoidance of such catastrophic events is clearly desirable. Consequently, non-redundant parts have been incorporated in redundant systems through the use of a monitoring device that functions to detect a catastrophic event and reset the non-redundant parts to a known good state. However, such methods typically take significant time to recover from a fault and hence cause much inconvenience to end users.
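The conventional monitor-and-reset approach described above can be sketched as follows. This is a simplified illustration under stated assumptions, not a description of any actual device: NonRedundantPart and Monitor are hypothetical names, a lockup is modelled as a simple flag set by corrupted input, and the "known good state" is modelled as an empty queue.

```python
class NonRedundantPart:
    """Hypothetical component that was not designed for redundancy."""

    def __init__(self):
        self.reset()

    def reset(self):
        """Return the part to a known good state (assumed: empty queue)."""
        self.locked_up = False
        self.queue = []

    def receive(self, data, corrupted=False):
        if corrupted:
            # Models a lockup caused by corrupted data, such as the data
            # a part may receive during an activity switch.
            self.locked_up = True
            return
        if not self.locked_up:
            self.queue.append(data)
        # While locked up, incoming data is silently lost.


class Monitor:
    """Hypothetical monitoring device that polls the part for lockups."""

    def __init__(self, part):
        self.part = part
        self.resets = 0

    def poll(self):
        """Reset the part if it is locked up; return True if a reset occurred."""
        if self.part.locked_up:
            self.part.reset()
            self.resets += 1
            return True
        return False


if __name__ == "__main__":
    part = NonRedundantPart()
    monitor = Monitor(part)
    part.receive("p1")
    part.receive("garbage", corrupted=True)  # lockup begins here
    part.receive("p2")                       # lost: part is locked up
    monitor.poll()                           # detection and reset
    print(part.queue, monitor.resets)
```

The sketch makes the drawback visible: everything arriving between the lockup and the next poll is lost, and the reset itself discards whatever state the part held, which is why recovery by this method tends to be slow and disruptive.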
A need therefore exists for an improved method and system for incorporating non-redundant components in redundant systems such as high availability routers. Accordingly, a solution that addresses, at least in part, the above and other shortcomings is desired.