1. Field of the Invention
This invention relates generally to computer networks and more specifically to verifying the operation of an intermediate node.
2. Background Information
A computer network is a geographically distributed collection of interconnected communication links and segments for transporting data between nodes, such as computers. Many types of network segments are available, with the types ranging from local area networks (LANs) to wide area networks (WANs). LANs typically connect personal computers and workstations over dedicated, private communications links located in the same general physical location, such as a building or a campus. WANs, on the other hand, typically connect large numbers of geographically dispersed nodes over long-distance communications links, such as common carrier telephone lines. The Internet is an example of a WAN that connects disparate networks throughout the world, providing global communication between nodes on various networks. The nodes typically communicate over the network by exchanging discrete frames or packets of data according to predefined protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP). In this context, a protocol consists of a set of rules defining how the nodes interact with each other.
Computer networks may be further interconnected by an intermediate network node, such as a router, having a plurality of ports that may be coupled to the networks. To interconnect dispersed computer networks and/or provide Internet connectivity, many organizations rely on the infrastructure and facilities of Internet Service Providers (ISPs). ISPs typically own one or more backbone networks that are configured to provide high-speed connections to the Internet. To interconnect geographically dispersed private networks, an organization may subscribe to one or more ISPs and couple each of its private networks to the ISPs' equipment. Here, the router may be utilized to interconnect a plurality of private networks or subscribers to an IP backbone network. Routers typically operate at the network layer of a communications protocol stack, such as the network layer of the TCP/IP communications architecture.
Simple networks may be constructed using general-purpose routers interconnected by links owned or leased by ISPs. As networks become more complex with greater numbers of elements, additional structure may be required. In a complex network, structure can be imposed on routers by assigning specific jobs to particular routers. A common approach for ISP networks is to divide assignments among access routers and backbone routers. An access router provides individual subscribers access to the network by way of large numbers of relatively low-speed ports connected to the subscribers. Backbone routers, on the other hand, provide transport to the backbone network and are configured to provide high forwarding rates on fast interfaces. ISPs may impose further physical structure on their networks by organizing them into points of presence (POPs). An ISP network usually consists of a number of POPs, each of which comprises a physical location wherein a set of access and backbone routers is located.
As Internet traffic increases, the demand for access routers to handle increased density and backbone routers to handle greater throughput becomes more important. In this context, increased density denotes a greater number of subscriber ports that can be terminated on a single router. Such requirements can be met most efficiently with platforms designed for specific applications. An example of such a specifically designed platform is an aggregation router. Aggregation routers, or “aggregators,” are access routers configured to provide high quality of service (QoS) and guaranteed bandwidth for both data and voice traffic destined for the Internet. Aggregators also provide a high degree of security for such traffic. These functions are considered “high-touch” features that necessitate substantial processing of the traffic by the router.
Notably, aggregators are configured to accommodate increased density by aggregating a large number of leased lines from ISP subscribers onto a few trunk lines coupled to an Internet backbone. Increased density has a number of advantages for an ISP, including conservation of floor space, simplified network management and improved statistical performance of the network. Real estate (i.e., floor space) in a POP is typically expensive and costs associated with floor space may be lowered by reducing the number of racks needed to terminate a large number of subscriber connections. Network management may be simplified by deploying a smaller number of larger routers. Moreover, larger numbers of interfaces on the access router improve the statistical performance of a network. Packet networks are usually designed to take advantage of statistical multiplexing, capitalizing on the fact that not all links are busy all of the time. The use of larger numbers of interfaces reduces the chances that a “fluke” burst of traffic from many sources at once will cause temporary network congestion.
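The statistical benefit of larger numbers of interfaces can be illustrated with a simple calculation. The following is a minimal sketch, not a model taken from this document: it assumes each subscriber link is busy independently with the same probability, and that the trunk is sized to carry twice the average number of busy links. The function name, the 10% busy probability and the sizing rule are all illustrative assumptions.

```python
from math import comb


def overflow_probability(n_links: int, p_busy: float, capacity: int) -> float:
    """Probability that more than `capacity` of `n_links` links are busy at once.

    Assumes (for illustration only) that links are busy independently,
    each with probability `p_busy`, i.e. a binomial model.
    """
    return sum(
        comb(n_links, k) * p_busy**k * (1 - p_busy) ** (n_links - k)
        for k in range(capacity + 1, n_links + 1)
    )


# Hypothetical sizing rule: each link is busy 10% of the time and the trunk
# can carry twice the average load, i.e. n_links // 5 simultaneous busy links.
if __name__ == "__main__":
    for n_links in (10, 100, 1000):
        capacity = n_links // 5
        p = overflow_probability(n_links, 0.1, capacity)
        print(f"{n_links:5d} links, capacity {capacity:4d}: P(overflow) = {p:.3g}")
```

Under these assumptions, the probability that a burst exceeds the trunk capacity falls as the number of aggregated links grows, even though the relative headroom stays the same.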
In addition to deployment at a POP, aggregators may be deployed in a telephone company central office. The large numbers of subscribers connected to input interface ports of the aggregator are typically small to medium sized businesses that conduct a substantial portion of their operations “on-line,” e.g., over the Internet. Each of these subscribers may connect to a particular aggregator over a high-reliability link connection that is typically leased from, e.g., a telephone company provider. The subscriber traffic received at the input interfaces is funneled onto at least one trunk interface. That is, the aggregator essentially functions as a large “fan-in” device wherein a plurality (e.g., thousands) of relatively low-speed subscriber input links is aggregated onto a single, high-speed output trunk to a backbone network of the Internet.
Failures in access routers may result in the loss of service to hundreds or thousands of subscribers. Thus, it is desirable to configure access routers to provide a high degree of availability in order to minimize the impact associated with failures. Providing high availability in an access router, however, can be considerably more involved than in a backbone router. For example, backbone routers often employ specialized routing algorithms to automatically redirect traffic around malfunctioning backbone routers and therefore improve network availability by simply reconfiguring the network to use an alternative (redundant) link. However, this capability is not feasible with an access router. Here, subscriber-to-trunk and trunk-to-subscriber traffic patterns are often predominant, and these patterns may result in the aggregation of hundreds or thousands of dedicated access links at one point, where they are, as noted above, typically funneled into a larger trunk up-link to the backbone network. The cost of providing redundant subscriber links may be prohibitive except for the most extreme circumstances. Thus in access routers, availability is often provided in ways other than redundant links.
One prior technique often used to enhance the availability of access routers involves configuring the router as a redundant system containing two or more complete sets of control and forwarding plane elements where one set of elements is designated “active” and the other sets are designated “standby.” The active elements perform the normal control and forwarding plane functions of the router, such as packet processing, routing, and so on. The standby elements, on the other hand, may sit idle or simply loop on software that tests portions of the standby elements and/or monitors the status of the active elements. If an active element fails, a “switchover” is initiated which typically involves placing the active elements in a standby state and configuring a set of standby elements to assume the role of the active elements. This configuration may include loading operational firmware and various configuration information into the standby elements to make them active.
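The switchover sequence described above can be sketched in a few lines. This is a simplified illustration only, not the implementation of any particular router: the names Role, ElementSet and switchover are hypothetical, and setting the "configured" flag merely stands in for loading operational firmware and configuration information.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    ACTIVE = "active"
    STANDBY = "standby"


@dataclass
class ElementSet:
    """One complete set of control and forwarding plane elements (hypothetical)."""
    name: str
    role: Role
    healthy: bool = True
    configured: bool = False  # True once firmware and configuration are loaded


def switchover(sets: list[ElementSet]) -> ElementSet:
    """Place the active set in standby and promote the first healthy standby set."""
    active = next(s for s in sets if s.role is Role.ACTIVE)
    standby = next(
        (s for s in sets if s.role is Role.STANDBY and s.healthy), None
    )
    if standby is None:
        raise RuntimeError("no healthy standby set available")
    active.role = Role.STANDBY
    standby.configured = True  # placeholder for loading firmware/configuration
    standby.role = Role.ACTIVE
    return standby


if __name__ == "__main__":
    sets = [
        ElementSet("set-0", Role.ACTIVE, configured=True),
        ElementSet("set-1", Role.STANDBY),
    ]
    sets[0].healthy = False  # simulate a failure of the active elements
    print("new active set:", switchover(sets).name)
```

The sketch makes one point explicit: a switchover can succeed only if a standby set is itself healthy at the moment the active set fails.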
To ensure system availability in a redundant system, a standby element must be prepared to assume the role of an active element should a failure or change in configuration make the active element unavailable. A failure of a standby element may affect the availability of that element to assume the role of an active element and therefore affect the overall system availability. To enhance the efficacy of a redundant scheme, lessen the loss of service due to failure of an active element, and enhance availability of the standby elements, it is desirable to continuously verify the function of the standby elements. Ideally, such verification should meet the following requirements:
1) verify as many functions of the standby element as possible that would be in use if the element were to operate as an active element;
2) not interfere with the operation of the active elements or the overall system; and
3) allow the standby element to begin functioning as an active element as soon as possible, in order to lessen the loss of service that may be experienced in the event of a switchover operation.
Prior techniques that employ control and forwarding plane redundancy often do not meet or only partially meet the above requirements. These techniques typically use only hardware redundancy, or do not support ongoing functional verification of the standby forwarding-plane elements, or do not support fast switchover of the elements from the standby role to the active role, or require the system to be offline during standby verification. Moreover, these techniques provide limited assurance that a standby element is prepared to assume the role of an active element. As a consequence, a high degree of system availability using these techniques is often difficult, if not impossible, to achieve.