1. Field of the Invention
The present invention relates to data communication networks. More particularly, the present invention relates to a scalable apparatus and method for dynamic selection of explicit routes and dynamic rerouting in data communication networks and internetworks such as the Internet.
2. Background
As is known to those of ordinary skill in the art, a network is a communication system that allows users to access resources on other computers and exchange messages with other users. A network is typically a data communication system that links two or more computers and peripheral devices. It allows users to share resources on their own systems with other network users and to access information on centrally located systems or systems that are located at remote offices. It may provide connections to the Internet or the networks of other organizations. The network typically includes a cable that attaches to network interface cards (“NICs”) in each of the devices within the network. Users may interact with network-enabled software applications to make a network request (such as to get a file or print on a network printer). The application may also communicate with the network software, which may then interact with the network hardware to transmit information to other devices attached to the network.
FIG. 1 is a block diagram illustrating an exemplary network 100 connecting a user 110 and a particular web page 120. FIG. 1 is an example that may be consistent with any type of network known to those of ordinary skill in the art, including a Local Area Network (“LAN”), a Wide Area Network (“WAN”), or a combination of networks, such as the Internet.
When a user 110 connects to a particular destination, such as a requested web page 120, the connection from the user 110 to the web page 120 is typically routed through several internetworking devices such as routers 130-A-130-I. Routers are typically used to connect similar and heterogeneous network segments into internetworks. For example, two LANs may be connected via routers across a dial-up line, an integrated services digital network (“ISDN”) line, or a leased line. Routers may also be found throughout the internetwork known as the Internet. End users may connect to a local Internet service provider (“ISP”) (not shown).
As shown in FIG. 1, multiple routes are possible to transmit information between user 110 and web page 120. Networks are designed such that routers attempt to select the best route between computers such as the computer where user 110 is located and the computer where web page 120 is stored. For example, based on a number of factors known to those of ordinary skill in the art, the route defined by following routers 130-A, 130-B, 130-C, and 130-D may be selected. However, the use of different routing algorithms may result in the selection of the route defined by routers 130-A, 130-E, 130-F, and 130-G, or possibly even the route defined by routers 130-A, 130-B, 130-H, 130-I, 130-F, and 130-G. A detailed discussion of the aspects of routing algorithms that determine the optimal path between two nodes on a network is not necessary for the purposes of the present invention, and such a discussion is not provided here so as not to overcomplicate the present disclosure.
Routers such as routers 130-A-130-I typically transfer information along data communication networks using formatted data packets. For example, when a “source” computer system (e.g., computer 110 in FIG. 1) wishes to transmit information to a “destination” computer system (e.g., computer 120 in FIG. 1), it generates a packet header in an appropriate format which typically includes the address of the source and destination end system, and then fills the remainder of the packet with the information to be transmitted. The complete data packet is then transmitted to the router attached to (and responsible for) the source computer system, which forwards it toward the destination computer system. Packets transmitted among the routers themselves (typically referred to as “control packets”) are similarly formatted and forwarded.
When a router receives a data packet, it reads the data packet's destination address from the data packet header, and then transmits the data packet on the link leading most directly to the data packet's destination. Along the path from source to destination, a data packet may be transmitted along several links and pass through several routers, with each router on the path reading the data packet header and then forwarding the data packet on to the next “hop.”
To determine how data packets should be forwarded, each router is typically aware of the locations of the network's end systems (i.e., which routers are responsible for which end systems), the nature of the connections between the routers, and the states (e.g., operative or inoperative) of the links forming those connections. Using this information, each router can compute effective routes through the network and avoid, for example, faulty links or routers. A procedure for performing these tasks is generally known as a “routing algorithm.”
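As an illustration only, the following Python sketch shows one way a router could compute a minimum-hop route over the links it believes to be operative. The link list is a hypothetical topology inferred from the example routes described for FIG. 1, and the function name `min_hop_route` is an assumption of this sketch rather than part of any particular routing protocol.

```python
from collections import deque

# Hypothetical links, inferred from the example routes described for FIG. 1.
LINKS = [
    ("A", "B"), ("B", "C"), ("C", "D"),
    ("A", "E"), ("E", "F"), ("F", "G"),
    ("B", "H"), ("H", "I"), ("I", "F"),
]

def min_hop_route(links, down, src, dst):
    """Breadth-first search over operative links; returns a router list or None."""
    adj = {}
    for a, b in links:
        if (a, b) in down or (b, a) in down:
            continue  # skip links reported as inoperative
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in adj.get(node, []):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None  # destination unreachable over operative links

all_up = min_hop_route(LINKS, set(), "A", "G")
ef_down = min_hop_route(LINKS, {("E", "F")}, "A", "G")
```

With every link operative, the sketch selects the route through routers A, E, F, and G. When the link between E and F is marked inoperative, it falls back to the longer route through A, B, H, I, F, and G.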
FIG. 2 is a block diagram of a sample router 130 suitable for implementing an embodiment of the present invention. The router 130 is shown to include a master control processing unit (“CPU”) 210, low and medium speed interfaces 220, and high speed interfaces 230. The CPU 210 may be responsible for performing such router tasks as routing table computations and network management. It may include one or more microprocessor integrated circuits selected from complex instruction set computer (“CISC”) integrated circuits, reduced instruction set computer (“RISC”) integrated circuits, or other commercially available processor integrated circuits. Non-volatile RAM and/or ROM may also form a part of CPU 210. Those of ordinary skill in the art will recognize that there are many alternative ways in which such memory can be coupled to the system.
The interfaces 220 and 230 are typically provided as interface cards. Generally, they control the transmission and reception of data packets over the network, and sometimes support other peripherals used with router 130. Examples of interfaces that may be included in the low and medium speed interfaces 220 are a multiport communications interface 222, a serial communications interface 224, and a token ring interface 226. Examples of interfaces that may be included in the high speed interfaces 230 include a fiber distributed data interface (“FDDI”) 232 and a multiport Ethernet interface 234. Each of these interfaces (low/medium and high speed) may include (1) a plurality of ports appropriate for communication with the appropriate media, and (2) an independent processor, and in some instances (3) volatile RAM. The independent processors may control such communication intensive tasks as packet switching and filtering, and media control and management. By providing separate processors for the communication intensive tasks, this architecture permits the master CPU 210 to efficiently perform routing computations, network diagnostics, security functions, and other similar functions.
The low and medium speed interfaces are shown to be coupled to the master CPU 210 through a data, control, and address bus 240. High speed interfaces 230 are shown to be connected to the bus 240 through a fast data, control, and address bus 250, which is in turn connected to a bus controller 260. The bus controller functions are typically provided by an independent processor.
Although the system shown in FIG. 2 is an example of a router suitable for implementing an embodiment of the present invention, it is by no means the only router architecture on which the present invention can be implemented. For example, an architecture having a single processor that handles communications as well as routing computations would also be acceptable. Further, other types of interfaces and media known to those of ordinary skill in the art could also be used with the router.
At a higher level of abstraction, FIG. 3 is a block diagram illustrating a model of a typical router system that is applicable in the context of the present invention. As shown in FIG. 3, a networking device such as a router 130 may be modeled as a device having a plurality of input interfaces 310a-310n, each having a corresponding input interface queue 320a-320n. Each input interface 310 receives a stream 330a-330n of data packets 340a-340z, with each data packet 340 typically arriving at a variable rate and typically having a variable length (usually measured in bytes). In addition to the data “payload” in each packet, each packet contains header information, which typically includes a source address and a destination address. Currently, the dominant protocol for transmitting such data packets is the Internet Protocol (“IP”). However, as will be described more fully in subsequent portions of this document, embodiments of the present invention can be implemented using any routing protocol known to those of ordinary skill in the art.
As each new data packet 340 arrives on an interface 310k, it is written into a corresponding input interface queue 320k, waiting for its turn to be processed. Scheduling logic 350 determines the order in which input interfaces 310a-310n should be “polled” to find out how many data packets (or equivalently, how many bytes of data) have arrived on a given interface 310k since the last time that interface 310k was polled. Scheduling logic 350 also determines the amount of data that should be processed from a given interface 310k during each “polling round.”
Regardless of the specific form of scheduling logic 350 used, when scheduling logic 350 determines that a particular data packet 340i should be processed from a particular input interface queue 320k, scheduling logic 350 transfers the data packet 340i to subsequent portions of the networking device (shown as dashed block 355) for further processing. Eventually, data packet 340i is written into one of a plurality of output queues 360a-360q, at the output of which the data packet 340i is finally transmitted from the networking device on the corresponding output interface 370a-370q. Fundamentally, then, the packet forwarding component of a router performs the function of examining the source and destination address of each data packet and identifying one from among a plurality of output interfaces 370a-370q on which to transmit each data packet.
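The polling behavior described above can be sketched as a simple round-robin scheduler with a per-interface byte budget. This is a minimal illustration under stated assumptions, not the scheduling logic of any particular router: the function name `poll_round`, the fixed polling order, and the `quantum_bytes` budget are all hypothetical.

```python
from collections import deque

def poll_round(input_queues, order, quantum_bytes):
    """Run one polling round; return (interface, packet_id) pairs in service order."""
    served = []
    for iface in order:
        budget = quantum_bytes
        queue = input_queues[iface]
        # Serve whole packets while the head-of-line packet fits in the budget.
        # Assumption: a packet longer than quantum_bytes is never served here;
        # a practical scheduler would carry unused budget into later rounds.
        while queue and queue[0][1] <= budget:
            packet_id, length = queue.popleft()
            budget -= length
            served.append((iface, packet_id))
    return served

# Hypothetical queue contents: (packet id, length in bytes).
queues = {
    "310a": deque([("p1", 400), ("p2", 1200)]),
    "310b": deque([("p3", 1500)]),
    "310c": deque([("p4", 200), ("p5", 200)]),
}
order = ["310a", "310b", "310c"]
round1 = poll_round(queues, order, quantum_bytes=1500)
round2 = poll_round(queues, order, quantum_bytes=1500)
```

In this example, packet p2 does not fit in the budget remaining on interface 310a after p1 is served, so it waits until the second polling round.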
FIG. 4 is a flow chart illustrating a more detailed view of the packet forwarding operations performed on a typical router. As shown in FIG. 4, at step 405, a data frame is received, and in step 410, it is stripped of all its data link layer frame information. The exposed data field then contains the actual network layer data packet. Next, at step 415, the network layer protocol is identified. As mentioned earlier, there are several network layer protocols that may be routed in a data communication network, with the Internet Protocol (“IP”) being dominant in current wide area networks such as the Internet.
Still referring to FIG. 4, after identifying the network protocol in step 415, the router determines whether the protocol is one that can be routed in step 420. If the protocol is not one that can be routed, then at step 425, the link layer header is added back to the received data frame, and at step 430 the frame is simply transmitted using link layer techniques known to those of ordinary skill in the art, such as switching or bridging.
If the packet protocol is determined to be one that can be routed in step 420 (e.g., if it is an IP packet), then at step 435, the router first performs housekeeping functions known to those of ordinary skill in the art, and then the router “looks up” the destination IP address in its routing table to identify the appropriate router output interface (also called a “port”) on which to transmit the received packet. At step 440, the router determines whether the destination port is directly attached to the router. If so (i.e., if the destination port is another port on the router), then at step 445, the link layer header is added back to the packet with the original link layer destination address, and then at step 450 the reassembled link layer frame is transmitted through the port identified at step 435.
Otherwise, if at step 440 the router determines that the destination port is not directly attached to the router (indicating that another router hop needs to occur), then at step 455 the Media Access Control (“MAC”) address of the next hop is extracted from the routing table and a new link layer header carrying this MAC destination address is added to the frame, and then at step 460 the frame is transmitted through the port identified at step 435.
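The decision flow of FIG. 4 can be summarized in a short Python sketch. The frame and routing table layouts, the field names, and the function name `forward_frame` are assumptions made for illustration, and the exact-match table lookup stands in for whatever lookup a real router performs at step 435.

```python
def forward_frame(frame, routing_table, routable=("IP",)):
    """Return the forwarding decision for one received frame (steps 405-460)."""
    # Steps 405-415: receive the frame, strip the link layer header,
    # and identify the network layer protocol of the enclosed packet.
    packet = frame["packet"]
    protocol = packet["protocol"]

    # Steps 420-430: a protocol that cannot be routed is re-framed and
    # handed to link layer techniques such as switching or bridging.
    if protocol not in routable:
        return {"action": "bridge", "port": None, "link_dst": frame["link_dst"]}

    # Step 435: look up the destination address to find the output port.
    # Assumption: exact-match lookup with no default route.
    route = routing_table[packet["dst"]]

    # Steps 440-450: destination is directly attached, so the original
    # link layer destination address is kept.
    if route["direct"]:
        return {"action": "route", "port": route["port"], "link_dst": frame["link_dst"]}

    # Steps 455-460: another hop is needed, so the frame is addressed
    # to the MAC address of the next hop.
    return {"action": "route", "port": route["port"], "link_dst": route["next_hop_mac"]}

# Hypothetical table and frames, using documentation address ranges.
table = {
    "192.0.2.10": {"port": "E0", "direct": True, "next_hop_mac": None},
    "198.51.100.7": {"port": "S1", "direct": False, "next_hop_mac": "mac-next-hop"},
}
local = forward_frame(
    {"link_dst": "mac-host", "packet": {"protocol": "IP", "dst": "192.0.2.10"}}, table)
remote = forward_frame(
    {"link_dst": "mac-router", "packet": {"protocol": "IP", "dst": "198.51.100.7"}}, table)
bridged = forward_frame(
    {"link_dst": "mac-peer", "packet": {"protocol": "NetBEUI", "dst": None}}, table)
```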
With the integration of voice, video, and data traffic, the aggregate bandwidth requirement of applications is getting higher. More network infrastructures are being built with high-end routers and switches with a large number of interfaces. As a result, network connectivity is getting richer than ever before. Multiple selections of routes for any given source and destination pair are often available. However, as is known to those of ordinary skill in the art, existing routing protocols, such as the Open Shortest Path First (“OSPF”) protocol, the Routing Information Protocol (“RIP”), the Enhanced Interior Gateway Routing Protocol (“EIGRP”), and the Border Gateway Protocol (“BGP”) are essentially destination-based routing protocols, which means that all packets with the same destination are typically forwarded along a minimum-hop path to their destination.
With destination-based routing protocols, routes are typically calculated automatically at regular intervals by software in routing devices. To enable this type of dynamic routing, routing devices contain routing tables, which essentially consist of destination address/next hop pairs. As an example, an entry in a routing table (i.e., a routing table as used in step 435 of FIG. 4) may be interpreted as follows: to get to network IP address 123.045.0.0, send the packet out Ethernet interface ‘0’ (e.g., “E0”). Thus, IP datagrams, or packets, travel through internetworks one hop at a time. The entire route is not known at the outset of the journey, however. Instead, at each hop, the next destination is calculated by matching the destination address within the datagram with an entry in the current node's routing table. Each node's involvement in the routing process is limited to forwarding packets based on internal information. The nodes typically do not monitor whether the packets get to their final destination, nor does IP have the capability to report errors back to the source when routing anomalies occur.
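A routing table of destination/next hop pairs can be sketched in Python as follows. The sketch assumes longest-prefix matching, which is the usual rule in IP forwarding but is not spelled out in the text above. The prefix lengths, the additional entries, and the interface names other than “E0” are hypothetical.

```python
import ipaddress

# Hypothetical routing table: (destination network, outgoing interface).
# The /16 prefix length for the E0 entry is an assumption of this sketch.
ROUTING_TABLE = [
    (ipaddress.ip_network("123.45.0.0/16"), "E0"),
    (ipaddress.ip_network("123.45.6.0/24"), "E1"),
    (ipaddress.ip_network("0.0.0.0/0"), "S0"),  # default route
]

def next_hop_interface(dst, table=ROUTING_TABLE):
    """Return the interface of the most specific entry matching dst, or None."""
    addr = ipaddress.ip_address(dst)
    matches = [(net.prefixlen, iface) for net, iface in table if addr in net]
    if not matches:
        return None
    # The entry with the longest prefix is the most specific match.
    return max(matches)[1]
```

Under these assumptions, a packet addressed to 123.45.9.1 leaves on E0, a packet addressed to 123.45.6.7 matches the more specific entry and leaves on E1, and any other destination follows the default route on S0. Each router makes this decision independently, which is why the full route is not known in advance.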
Although networks are designed to match expected traffic load, the actual traffic load in data communication networks changes dynamically depending on the time of day, and from day to day as well. A special event (e.g., the final game of the soccer World Cup) can dramatically change the network traffic pattern. In addition, network devices can become disabled or may be added to a network, and these events can also change the network traffic pattern.
As is known to those of ordinary skill in the art, destination-based minimum-hop routing algorithms (such as OSPF, RIP, and BGP) typically select the “shortest path” to a destination based on a metric such as hop count. These protocols therefore have limited capacity to balance the network load. With these protocols, when the traffic load is concentrated, some links are heavily loaded, while others sit idle. Thus, when using these routing algorithms, a network cannot dynamically adjust its forwarding paths to avoid congested links.
Traffic engineering is an important network service that achieves network resource efficiency by directing certain traffic flows to travel through explicitly defined routes that are different from the default paths determined by the routing protocols (e.g., OSPF, EIGRP, or BGP). As is known to those of ordinary skill in the art, the key to successful traffic engineering is the ability to have a set of network mechanisms that supports both explicit routes and dynamic rerouting. Dynamic rerouting refers to the ability to periodically recalculate routes in a data communication network depending on network load or other factors.
Fundamentally, traffic engineering is a function of routing. However, existing destination-based routing protocols, such as the protocols already mentioned, make it very difficult—if not impossible—to support explicit routes.
One current approach proposed by the Internet Engineering Task Force (“IETF”) toward supporting explicit routes in traffic engineering applications is to use the Multiprotocol Label Switching (“MPLS”) technique and the Resource Reservation Protocol (“RSVP”) to set up explicit routes. MPLS and RSVP are not discussed in detail in this document, so as not to overcomplicate the present discussion. For the purposes of the present discussion, however, suffice it to say that in both the MPLS and RSVP techniques, a technique known as “tag switching” can be used, wherein explicit routes are cached in Tag Information Base (“TIB”) entries. In general, tag switching techniques work as follows: at the edge of a tag-switched network, a tag is applied to each packet. A tag has only local significance. When a packet is received by a tag switch (e.g., a router or ATM switch with tag switching software), the switch performs a table look-up in the TIB. Each entry in the TIB consists of an incoming tag and one or more subentries of the form: outgoing tag, outgoing interface, outgoing link layer information. The tag switch replaces the tag in the packet with the outgoing tag and replaces the link layer information. The packet is then sent out on the given outgoing interface. It should be noted that the TIB is built at the same time that the routing tables are populated, not when the tag is needed for the first time, which allows flows to be switched starting with the first packet.
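The TIB look-up and tag swap described above can be sketched as follows. The data layout, the tag values, and the names `SubEntry` and `switch_packet` are hypothetical and chosen only to mirror the entry format given in the text (incoming tag, then one or more subentries of outgoing tag, outgoing interface, and outgoing link layer information).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SubEntry:
    out_tag: int
    out_interface: str
    out_link_info: str

# Hypothetical TIB: incoming tag -> one or more subentries.
TIB = {
    17: [SubEntry(out_tag=42, out_interface="E1", out_link_info="mac-of-next-switch")],
}

def switch_packet(packet, tib):
    """Swap the tag and link layer information; return (interface, packet) pairs."""
    results = []
    for sub in tib[packet["tag"]]:
        # Replace the incoming tag and link layer information with the
        # outgoing values; the rest of the packet is left untouched.
        out = dict(packet, tag=sub.out_tag, link_info=sub.out_link_info)
        results.append((sub.out_interface, out))
    return results

switched = switch_packet({"tag": 17, "link_info": "old", "payload": "data"}, TIB)
```

Because the look-up key is the incoming tag alone, the switch does not need to examine the destination address. The tag has only local significance, so the next switch uses tag 42 as the key into its own TIB.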
According to the tag switching techniques that can be implemented with MPLS and RSVP, at each node in a route, an incoming packet is forwarded along the explicit route to the next hop stored in the TIB whose index tag matches the tag in the packet header. However, the main problem with this approach is its inherent complexity. Both RSVP and MPLS are very complicated and computationally intensive protocols, and they must be extended in order to support explicit routes. Moreover, to support traffic engineering, a network must support both the RSVP and MPLS protocols, and this can significantly limit the scope of traffic engineering available. This is because new service models, such as differentiated services, do not require the hop-by-hop signaling provided by RSVP. In addition, this approach to supporting dynamic traffic engineering (i.e., making routing decisions on a “per-flow” basis) is inflexible and expensive. Finally, as is known to those of ordinary skill in the art, the current approach is not scalable because the size of the TIB that is required for a network grows exponentially as a function of the number of nodes in the network.
The present invention provides a technique for scalable and dynamic rerouting that significantly reduces the complexity of traffic engineering as currently proposed in the IETF. As described herein, according to aspects of the present invention, a global path identifier is assigned to each explicit route in a data communication network. In one embodiment, this global path identifier is inserted in the optional field of an IP packet header, and is used in selecting the next hop by a router's forwarding engine. As another example, the global path identifier can be inserted as a label in MPLS systems. Explicit routes can be selected either by a policy server or by ingress routers. When encountering a newly selected path, an ingress router sends an explicit object to downstream nodes of the path to set up explicit routes by caching the next hop in an Explicit Forwarding Information Base (“EFIB”) table in each router along the route. Two explicit routes that merge at a network node will share the same entry in the EFIB tables in all downstream nodes. Ingress routers maintain an Explicit Route Table (“ERT”) that tracks the global path identifier associated with each flow through the data communication network. Multiple flows using the same path can be implemented by sharing the same global path identifier in the table. In case of sudden network load changes, rerouting can be performed by changing the global path identifier associated with those flows that need to be rerouted, and by then transmitting a new path object to downstream nodes.
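By way of a simplified illustration, the interplay between the ERT at an ingress router and the EFIB tables along a route might be sketched as follows. This is a minimal sketch under stated assumptions and not the claimed implementation: the class and function names, the router names, the integer path identifiers, and the two example routes are hypothetical, and the sharing of EFIB entries between merging routes is not modeled.

```python
class Router:
    def __init__(self, name):
        self.name = name
        self.efib = {}  # global path identifier -> next-hop router name

def install_path(routers, path_id, hops):
    """Walk the explicit route and cache each router's next hop under path_id."""
    for here, nxt in zip(hops, hops[1:]):
        routers[here].efib[path_id] = nxt

def forward(routers, path_id, ingress, egress):
    """Follow EFIB entries hop by hop; return the routers visited."""
    visited = [ingress]
    while visited[-1] != egress:
        if len(visited) > len(routers):
            raise RuntimeError("forwarding loop detected")
        visited.append(routers[visited[-1]].efib[path_id])
    return visited

routers = {name: Router(name) for name in "ABEFGHI"}

# Set up explicit route 1 and bind two flows to it in the ingress router's ERT.
install_path(routers, 1, ["A", "E", "F", "G"])
ert = {"flow-1": 1, "flow-2": 1}  # flow -> global path identifier

# Reroute flow-2: install a second explicit route, then change only the
# path identifier associated with that flow at the ingress router.
install_path(routers, 2, ["A", "B", "H", "I", "F", "G"])
ert["flow-2"] = 2

route_flow_1 = forward(routers, ert["flow-1"], "A", "G")
route_flow_2 = forward(routers, ert["flow-2"], "A", "G")
```

In this sketch, rerouting touches a single ERT entry at the ingress router. Routers downstream only need the EFIB entries for the new path identifier, and flow-1 continues along its original route unaffected.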
Compared with the existing approach, the technique according to aspects of the present invention is routing protocol independent, scalable, and dynamic, and it can support both class-based and flow-based explicit routes. These and other features and advantages of the present invention will be presented in more detail in the following specification of the invention and in the associated figures.