The present invention is directed to communications networking. It is directed particularly to a tag distribution protocol employed on a tag-switching network.
Internetworking, Routers and the Internet Protocol
Two local area networks, LAN A 10 and LAN B 20, interconnected through a "backbone" of routers 2, 4, 6, 8 are shown in FIG. 1. A router may have a plurality of interfaces to one or more local networks or to other routers. LAN A includes a router 2 and three host devices 14, 16, 18 which can communicate directly with each other over the LAN A bus 12, and LAN B includes a router 8 and three host devices 24, 26, 28 which can communicate directly with each other over the LAN B bus 22. Two directly connected, or linked, devices communicate through the exchange of link-layer, e.g., Ethernet, communications packets.
The exchange of data between two indirectly connected devices, e.g., HOST A1 14 of LAN A and HOST B1 24 of LAN B, is typically accomplished at the network layer using an Internet Protocol (IP) datagram. The IP datagram is typically forwarded in the payload field of link-layer communications packets that are exchanged between the backbone routers. The use of an IP datagram allows for the routing of data between network devices that do not have a link-layer connection and, therefore, cannot exchange link-layer packets with each other.
An Ethernet packet 200 having an IP datagram in its payload field 206 is shown in FIG. 2. The IP datagram is encapsulated between an Ethernet header field 202 and a trailing CRC field 204. The Ethernet header field 202 includes a type field 203 that specifies that the payload field 206 contains an IP datagram. The IP datagram includes an IP payload field 208 preceded by an IP header field 210. The IP header field 210 includes a source IP address field 212 (containing IP address "X") and a destination IP address field 214 (containing IP address "Y"). The source address field 212 identifies the originator of the IP datagram, e.g., HOST A1 14, and the destination address field 214 identifies the intended recipient of the IP datagram, e.g., HOST B1 24.
Routers commonly employ some type of discovery mechanism to automatically identify and maintain links to other routers and thereby avoid the need for explicit network configuration. Under a discovery mechanism, a router periodically broadcasts from each interface a special type of link-layer packet, typically referred to as a Hello packet, to inform other routers of its presence in the network. The router "discovers" a link to another router when a Hello packet is received at one of its interfaces. To verify the ongoing operation of a particular link, the router establishes a hello hold timer associated with the linked router and resets the timer each time a subsequent Hello packet is received at the interface from the linked router. If the router fails to receive a subsequent Hello packet before the timer expires, it assumes that the link is no longer available. The failure to receive a new Hello packet may be due to a poor link connection between the two routers, or the linked router may have failed or perhaps decided for some reason to disable that particular interface.
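By way of illustration only, the hello hold timer described above can be sketched as a table of expiry deadlines keyed by interface and neighbor. The class name `HelloTracker`, the 15-second hold time, and the interface and neighbor labels are assumptions made for this sketch and do not come from any particular protocol.

```python
from dataclasses import dataclass, field


@dataclass
class HelloTracker:
    """Tracks, per (interface, neighbor), when the hello hold timer expires."""
    hold_time: float = 15.0  # seconds; illustrative value only
    deadlines: dict = field(default_factory=dict)

    def on_hello(self, interface: str, neighbor: str, now: float) -> None:
        # The first Hello "discovers" the link; each later Hello resets the timer.
        self.deadlines[(interface, neighbor)] = now + self.hold_time

    def expire(self, now: float) -> list:
        # Return, and forget, every link whose hold timer has run out.
        dead = [key for key, deadline in self.deadlines.items() if deadline <= now]
        for key in dead:
            del self.deadlines[key]
        return dead


tracker = HelloTracker(hold_time=15.0)
tracker.on_hello("if0", "router-4", now=0.0)
tracker.on_hello("if0", "router-4", now=10.0)  # timer reset; now expires at 25.0
print(tracker.expire(now=20.0))  # []
print(tracker.expire(now=26.0))  # [('if0', 'router-4')]
```

Passing the current time explicitly, rather than reading a clock inside the class, keeps the sketch deterministic.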
A conventional backbone router typically determines the link over which the IP datagram is to be forwarded by referring to a forwarding table, which contains routing information learned from neighbor routers and maintained by the router. Using the "Y" address in the destination IP address field 214, the router performs a longest match search against IP addresses stored in the table. Unfortunately, because the IP address space is so large, the forwarding table may have to be very large. More importantly, a longest match search through the forwarding table can be time consuming and result in the expenditure of valuable router processing resources and a slowing of the movement of packets through the network.
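The cost of the longest match search can be seen in a deliberately naive sketch, given here as an example only. The function name `longest_match`, the prefixes, and the link names are hypothetical; production routers use specialized structures such as tries rather than a linear scan.

```python
import ipaddress


def longest_match(table, destination):
    """Return the link of the most specific prefix that contains destination."""
    dest = ipaddress.ip_address(destination)
    best = None
    for prefix, link in table.items():  # every entry is examined
        net = ipaddress.ip_network(prefix)
        if dest in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, link)
    return None if best is None else best[1]


# Hypothetical forwarding table: prefix -> outgoing link.
table = {
    "0.0.0.0/0": "default-link",
    "10.0.0.0/8": "link-a",
    "10.1.0.0/16": "link-b",
}
print(longest_match(table, "10.1.2.3"))   # link-b
print(longest_match(table, "10.2.0.1"))   # link-a
print(longest_match(table, "192.0.2.1"))  # default-link
```

Each lookup compares the destination against many candidate prefixes, which is the work that an exact-match tag lookup avoids.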
A Tag-Switching Network
A technique known variously as "tag-switching" or "label-switching" is one way of avoiding the longest match searches. Although the invention to be described below is not limited to any particular implementation of tag switching, one popular method for implementing it is called Multi-Protocol Label Switching (MPLS) and is described in the above-cited Rekhter et al. application.
An Ethernet packet 300 carrying a tagged IP datagram in its payload field 306 is shown in FIG. 3. The type field 305 of the Ethernet packet is used to identify the payload contents as a tagged datagram and thus distinguish it from a normal IP datagram. A tag stack field 320 is prepended to the IP payload field 306 and is comprised of one or more "tags," or "labels," employed for forwarding. In this case, the tag stack field 320 contains a single tag stack entry 322. A tag-switching router uses the contents of the tag field 324 in place of the destination address 303 to determine the forwarding route of the packet.
FIG. 4 illustrates the exchange of an IP datagram over one type of tag-switching network. For simplicity, only the destination IP address field 314 (containing IP address "D1") and the IP payload field 308 (containing "DATA") of the IP datagram are shown in FIG. 4. The tag-switching network is comprised of a first tag-switching edge router TE1 interfacing to a first router R1 of a first local network; two tag-switching transit routers TR1, TR2 connecting the tag-switching edge router TE1 to a second tag-switching edge router TE2; and tag-switching edge router TE2 interfacing to a second router R2 of a second local network.
We assume that router R2 sends tag-switching edge router TE2 an IP datagram within an Ethernet packet of the type depicted in the second row of FIG. 2. When tag-switching edge router TE2 receives the IP datagram from router R2, it prefixes a tag T1 that identifies an entry in the forwarding table of the next router, i.e., the first transit router TR2, in the backbone path. When the transit router TR2 receives the IP datagram, it uses the tag T1 to identify the location in its forwarding table that specifies the forwarding link to the edge router TE1; i.e., the transit router TR2 does not have to perform a time-consuming longest-match search. It then replaces the tag T1 with the replacement tag T2 that identifies an entry in the forwarding table of the second transit router TR1 in the backbone path and forwards the IP datagram. (We assume that, as in the typical case, there are several transit routers in the backbone path, although in some configurations there may be none. All transit routers, except the last transit router in the backbone path, perform in a manner similar to that of transit router TR2.) When the second transit router TR1, which is also the last transit router in the backbone path, receives the IP datagram, it uses tag T2 to identify an entry in its forwarding table specifying the forwarding link, removes tag T2, and then forwards the untagged IP datagram. When the edge router TE1 receives the IP datagram, it forwards the data packet to R1 in the conventional manner.
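The walk-through above can be sketched, under simplifying assumptions, as a pair of exact-match tables. The router and tag names follow the text; the table layout, the `"swap"` and `"pop"` action names, and the `forward` function are illustrative choices made here.

```python
# Per-router tables keyed by incoming tag: (action, outgoing tag, next hop).
TABLES = {
    "TR2": {"T1": ("swap", "T2", "TR1")},  # first transit router replaces T1 with T2
    "TR1": {"T2": ("pop", None, "TE1")},   # last transit router removes the tag
}


def forward(router, packet):
    """Forward one tagged packet; return (next hop, packet as it leaves)."""
    top = packet["tags"][-1]                          # top of the tag stack
    action, new_tag, next_hop = TABLES[router][top]   # direct lookup, no search
    tags = packet["tags"][:-1]
    if action == "swap":
        tags = tags + [new_tag]
    return next_hop, {**packet, "tags": tags}


# Edge router TE2 has already prefixed tag T1 to the datagram from R2.
packet = {"dst": "D1", "data": "DATA", "tags": ["T1"]}
hop = "TR2"
while packet["tags"]:
    hop, packet = forward(hop, packet)

print(hop, packet["tags"])  # TE1 []
```

The untagged datagram that reaches TE1 is then forwarded to R1 by a conventional lookup on destination address D1.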
The ATM Protocol
Although the tag-over-Ethernet protocol illustrated in FIG. 3 is typical for packets exchanged between tag-switching routers, it is not the only protocol that such routers may employ. The protocols employed on some link types are actually somewhat more complicated than the protocol depicted in FIG. 3. Moreover, routers that communicate with each other over a point-to-point link, i.e., not by way of a shared medium, typically would employ a link-layer protocol, such as SLIP or PPP, that is different from the Ethernet protocol just described. An implementation that is particularly desirable for high-capacity links employs Asynchronous Transfer Mode ("ATM") switches.
An ATM frame 500 having an IP datagram in its payload field 507 is shown in FIG. 5. The IP datagram field 506 and a tag stack field 520 of the payload field 507 are similar to the IP datagram field 306 and tag stack field 320 encapsulated by the Ethernet header 302 and trailer 304 of FIG. 3. The only difference is that the tag field 524 of the single tag stack entry 522 contains a "DON'T CARE," which indicates that the tag's contents do not matter.
The reason why the tag's contents do not matter is that the routing decisions, which are based on those contents when the tagged packet arrives on a non-ATM link, are instead based on an ATM VPI/VCI field 546 found in the cell header field 544 of an ATM "cell" 540 when the tagged packet arrives on an ATM link. From the point of view of an ATM client, the ATM frame 500 is the basic unit of transmission, and it can vary in length to as much as 64 Kbytes of payload. (Those skilled in the art will recognize that there are also other possible ATM frame formats, but FIG. 5's third row depicts one, known as "AAL5," that would typically be employed for user data.) From the ATM switch's point of view, though, the basic transmission units are fixed-size cells into which the frames are divided. The cell header field 544, shown in detail in the first row, also includes a PTI field 548. One purpose of the PTI field 548 is to indicate whether its cell is the last one in a frame. If it is, its last eight bytes form the frame trailer field 504. Among other things, the trailer field 504 indicates how much of the preceding cell's payload field 542 is comprised of actual payload, as opposed to padding used to complete a fixed-size cell.
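The division of a frame into cells can be illustrated by the following simplified sketch. It assumes the standard 48-byte ATM cell payload and an AAL5-style eight-byte trailer whose third and fourth bytes carry the payload length; the remaining trailer bytes, including the CRC, are simply zeroed here, and the cell header is reduced to a single last-cell flag standing in for the PTI indication. The function name `segment` is hypothetical.

```python
CELL_PAYLOAD = 48  # payload bytes carried by each fixed-size cell
TRAILER_LEN = 8    # the frame trailer fills the last eight bytes of the final cell


def segment(frame_payload: bytes):
    """Split a frame into (is_last_cell, 48-byte cell payload) pairs."""
    # Pad so that payload + padding + trailer is a whole number of cells.
    pad = -(len(frame_payload) + TRAILER_LEN) % CELL_PAYLOAD
    # Simplified trailer: 2 zero bytes, 2-byte length, 4 zero bytes (CRC omitted).
    trailer = bytes(2) + len(frame_payload).to_bytes(2, "big") + bytes(4)
    body = frame_payload + bytes(pad) + trailer
    chunks = [body[i:i + CELL_PAYLOAD] for i in range(0, len(body), CELL_PAYLOAD)]
    return [(i == len(chunks) - 1, chunk) for i, chunk in enumerate(chunks)]


cells = segment(b"x" * 100)
print(len(cells))                      # 3 (100 payload + 36 padding + 8 trailer = 144)
print([last for last, _ in cells])     # [False, False, True]
length = int.from_bytes(cells[-1][1][-6:-4], "big")
print(length)                          # 100, so the receiver can discard the padding
```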
The VPI/VCI field 546 is of particular interest to the present discussion. As is well known to those skilled in the art, ATM systems organize their routes into "virtual channels," which may from time to time be grouped into "virtual paths." Each switch associates a local virtual path/virtual channel indicator (VPI/VCI) with a channel or path that runs through it. When an ATM switch receives a cell, it consults the cell's VPI/VCI field 546 to identify by table lookup the interface through which to forward the cell. It also replaces that field's contents with a value indicated by the table as being the next switch's code for that path or channel, and it sends the resultant cell to the next switch. In other words, the function performed by the VPI/VCI field 546 enables it to serve as the tag and, as a result, a tag-switching interface implemented as an ATM switch can ignore the tag field 524, on which other implementations for other links rely.
Tag Distribution
As described above, a tag-switching router forwards a tagged packet to a peer based on the contents of a forwarding table entry pointed to by the top tag in the prepended tag stack. These contents are referred to as a tag binding because they "bind" the tag to a particular route. It is the peer router that actually "constructs" the forwarding table through a tag distribution procedure that is executed upon establishment of a tag-switching link. One method for distributing tag bindings is described in an Internet Engineering Task Force (IETF) draft entitled "Tag Distribution Protocol", draft-doolan-tdp-spec-01.txt, May 1997, which constitutes appendix B of the above-cited Rekhter et al. application.
Tag-switching router peers establish a tag distribution protocol (TDP) session on a transmission control protocol (TCP) connection to distribute, or "advertise," tag bindings to each other. A TDP protocol data unit (PDU) is used for the transmission of one or more session messages. Each session message is formatted as a protocol information element (PIE), and one or more PIEs are carried within the payload field 628 of the TDP PDU, as illustrated in the second row of FIG. 6. The TDP PDU is placed within the payload field 618 of a TCP segment, which is transmitted to the peer in the payload field 608 of an IP datagram. The destination IP address field 614 of the IP datagram header 610 contains the IP address of the receiving router interface, and the source IP address field 612 contains the IP address of the transmitting router interface.
The TDP PDU includes a header field 630 comprised of a version field 632, a length field 634, a TDP identifier consisting of a router ID field 636 and a TDP instance field 637, and a field 638 reserved for future use. The version field 632 is a two-octet unsigned integer specifying the version number of the tag distribution protocol. The length field 634 is a two-octet integer specifying the total length of the PDU in bytes, excluding the version and length fields. The TDP identifier is six octets in total length. The router ID field typically contains a 32-bit "stable" address, i.e., one that is not lost when interfaces or physical connections go down, and it is often the IP address of the "logical" or "loopback" interface. By convention, a router encodes the same router ID in all its TDP messages. The two-octet TDP instance field 637 represents a particular TDP session between a router and a peer.
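As a rough sketch of the header layout just described, the fields can be packed and unpacked in network byte order. The field widths for the version, length, router ID, and TDP instance follow the text; the two-octet width chosen for the reserved field 638, the function names, and the sample values are assumptions made for this example.

```python
import ipaddress
import struct

# version (2 octets), length (2), router ID (4), TDP instance (2),
# reserved (2 octets ASSUMED; the text does not state the width).
HEADER = struct.Struct("!HH4sHH")


def build_pdu(version: int, router_id: str, instance: int, payload: bytes) -> bytes:
    # The length field counts the PDU excluding the version and length fields.
    length = HEADER.size - 4 + len(payload)
    rid = ipaddress.IPv4Address(router_id).packed
    return HEADER.pack(version, length, rid, instance, 0) + payload


def parse_pdu(data: bytes):
    version, length, rid, instance, _reserved = HEADER.unpack_from(data)
    payload = data[HEADER.size:4 + length]
    return version, str(ipaddress.IPv4Address(rid)), instance, payload


pdu = build_pdu(1, "192.0.2.1", 0, b"abc")
print(parse_pdu(pdu))  # (1, '192.0.2.1', 0, b'abc')
```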
A TDP PIE has a type-length-value (TLV) structure, as illustrated by the Bind PIE 650 of the third row of FIG. 6. The two-octet type field 652 specifies the contents of the value field. The two-octet length field 654 specifies the length of the value field in octets. The value field 660, depending upon the message type, may include one or more mandatory parameters and one or more optional parameters. The value field 660 of the Bind PIE 650 includes a binding list field 670 consisting of a plurality of tag binding entries 680 having tag subfields 684 containing the above-described tags.
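The type-length-value structure lends itself to a simple parsing loop, sketched below for illustration. The two-octet type and length fields follow the text; the numeric type codes, the value bytes, and the names `encode_pie` and `iter_pies` are hypothetical.

```python
import struct


def encode_pie(pie_type: int, value: bytes) -> bytes:
    # Two-octet type, two-octet length of the value field, then the value.
    return struct.pack("!HH", pie_type, len(value)) + value


def iter_pies(payload: bytes):
    """Yield (type, value) for each PIE carried in a PDU payload."""
    offset = 0
    while offset < len(payload):
        if offset + 4 > len(payload):
            raise ValueError("truncated PIE header")
        pie_type, length = struct.unpack_from("!HH", payload, offset)
        value = payload[offset + 4:offset + 4 + length]
        if len(value) < length:
            raise ValueError("truncated PIE value")
        yield pie_type, value
        offset += 4 + length


# Hypothetical type codes; the real values are defined by the protocol draft.
payload = encode_pie(0x0001, b"\x00\x2a") + encode_pie(0x0002, b"")
print(list(iter_pies(payload)))  # [(1, b'\x00*'), (2, b'')]
```

Because each PIE announces the length of its own value field, a receiver can skip a PIE type it does not recognize and continue with the next one.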
The peers will maintain the TDP session for as long as the tag-switching exchange of packets is to occur between them. The tag bindings themselves may apply to one or more tag switching links between the peers.
Multi-Linked Tag-Switching Router Peers
A tag-switching router may have more than one tag-switching enabled interface connecting it to a peer, and FIG. 7 illustrates two routers P1, P2 having four links L1, L2, L3, L4. As mentioned above, a router uses an incoming tag in place of the destination address and interprets the tag by referring to a tag binding distributed by the next router in the destination route of a received data packet. An interface can be classified in terms of its ability to make the same interpretation on the incoming tag as made by other interfaces. Interfaces of a first type, such as an Ethernet, PPP or fiber distributed data interface (FDDI) interface, each make the same interpretation and thus may easily share the same set of tag bindings. Interfaces of a second type, however, such as an ATM interface or frame relay interface, each make a unique interpretation of the incoming tag and therefore require a dedicated set of tag bindings. For example, the ATM switching operation, which is usually implemented in the interface hardware to achieve maximum performance, maintains a hardware forwarding table that contains routing information having a physical significance. As a result, in general, each ATM interface on a router places a different interpretation on the same incoming VPI/VCI tag value.
Accordingly, the tag-switching ATM links L1, L4 of FIG. 7 must each employ dedicated tag bindings which are different from the platform-wide tag bindings that may be shared by the two Ethernet links L2, L3. Unfortunately, the conventional tag distribution protocol provides for either the establishment of a single TDP session for the distribution of a set of platform-wide tag bindings, or the establishment of a separate TDP session for each and every tag-switching link. This results in a needless session redundancy when most, but not all, of the interfaces are sharing the same tag bindings.
I have recognized that there is a way for a tag-switching router to establish a single TDP session for the distribution of a platform-wide tag space when at least one of its interfaces requires an interface-specific tag space. This invention is a particularly simple improvement to the tag distribution protocol (TDP) that introduces the notion of a "tag space." A tag space is defined as the set of incoming tags that a router has assigned to an interface, and it may be either an interface-specific tag space or a platform-wide tag space assigned to one or more interfaces. Each tag space is uniquely identified by a TDP identifier consisting of a router ID and a tag space ID. The router ID is typically an IP address that uniquely identifies the router on a tag-switching network, and the tag space ID is a two-octet number that uniquely identifies a tag space within the router. A router distributes, or "advertises," a set of corresponding outgoing tag bindings to a peer during a TDP session, and a separate session is established for each tag space.
Tag-switching routers incorporating the present invention employ a new Hello protocol information element (PIE) in a TDP discovery mechanism to dynamically identify and maintain links to peers. Each router periodically multicasts a specific Hello PIE from each interface enabled for tag-switching. The Hello PIE is carried in the payload of a TDP protocol data unit (PDU), and the TDP PDU is multicast in the payload of a User Datagram Protocol (UDP) datagram. The TDP instance field of the TDP identifier found in the conventional TDP PDU header is replaced by a newly-defined tag space ID field that specifies the tag space the router has assigned to the interface, with the null value (00) indicating a platform-wide tag space. When a router receives a Hello PIE from a peer at one of its tag-switching enabled interfaces, it records the peer TDP identifier in a record associated with the interface and establishes a corresponding hello hold timer to create a link adjacency.
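A minimal sketch of the link adjacency records, using the four links of FIG. 7 as an example, might look as follows. The class name `Adjacencies`, the 15-second hold time, the peer router ID, and the tag space IDs 1 and 2 chosen for the ATM links are all assumptions made for illustration; only the use of the null value for the platform-wide tag space comes from the text.

```python
PLATFORM_WIDE = 0  # null tag space ID: platform-wide tag space


class Adjacencies:
    """Per-interface record of the peer TDP identifier and hello hold timer."""

    def __init__(self, hold_time: float = 15.0):  # illustrative hold time
        self.hold_time = hold_time
        self.records = {}  # interface -> ((router ID, tag space ID), expiry)

    def on_hello(self, interface, router_id, tag_space_id, now):
        # Record the peer TDP identifier and (re)start the hello hold timer.
        identifier = (router_id, tag_space_id)
        self.records[interface] = (identifier, now + self.hold_time)

    def tag_spaces(self):
        # One TDP session is needed per distinct peer tag space.
        return {identifier for identifier, _ in self.records.values()}


adj = Adjacencies()
peer = "192.0.2.2"  # hypothetical router ID of peer P2
adj.on_hello("L1", peer, 1, now=0.0)              # ATM link, interface-specific
adj.on_hello("L2", peer, PLATFORM_WIDE, now=0.0)  # Ethernet link
adj.on_hello("L3", peer, PLATFORM_WIDE, now=0.0)  # Ethernet link
adj.on_hello("L4", peer, 2, now=0.0)              # ATM link, interface-specific
print(len(adj.tag_spaces()))  # 3 sessions for 4 links
```

In this example the two Ethernet links share one session, so three sessions serve the four links.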
The router and the peer establish a TDP session on a transmission control protocol (TCP) connection for the exchange of tag bindings. The router transmits a Bind PIE to the peer to advertise a set of tag bindings, each tag binding associating a destination address with an incoming tag that the peer appends when forwarding to the router data packets with that destination. The Bind PIE is carried in the payload of a TDP PDU, and the TDP PDU is transmitted in the payload of a TCP segment. The router uses the TDP identifier of the TDP PDU header to specify the tag space to which the advertised tag bindings correspond. The router receives a Bind PIE from the peer containing a set of tag bindings and the router matches the TDP identifier of the TDP PDU header to the TDP identifier found in a link adjacency record to thereby associate these learned tag bindings with one or more of its interfaces to the peer. A data packet received at the router and having a destination address bound to one of these learned tag bindings is appended with a corresponding outgoing tag and then forwarded to the peer from an associated interface.
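The matching step can be sketched, purely as an example, as a lookup from the TDP identifier in a received PDU header to the interfaces whose adjacency records carry the same identifier. The adjacency contents, destination names, tag values, and function names below are hypothetical.

```python
# Link adjacency records: interface -> peer TDP identifier (router ID, tag space ID).
ADJACENCY = {
    "L1": ("192.0.2.2", 1),  # ATM link, interface-specific tag space
    "L2": ("192.0.2.2", 0),  # Ethernet link, platform-wide tag space
    "L3": ("192.0.2.2", 0),  # Ethernet link, platform-wide tag space
    "L4": ("192.0.2.2", 2),  # ATM link, interface-specific tag space
}

learned = {}  # (destination, interface) -> outgoing tag


def interfaces_for(tdp_identifier):
    """Interfaces whose adjacency record matches the PDU's TDP identifier."""
    return sorted(i for i, ident in ADJACENCY.items() if ident == tdp_identifier)


def on_bind(tdp_identifier, bindings):
    # Associate each learned binding with every matching interface.
    for interface in interfaces_for(tdp_identifier):
        for destination, tag in bindings.items():
            learned[(destination, interface)] = tag


on_bind(("192.0.2.2", 0), {"D1": 17})  # platform-wide bindings
on_bind(("192.0.2.2", 1), {"D1": 42})  # bindings for the tag space of link L1
print(sorted(learned.items()))
# [(('D1', 'L1'), 42), (('D1', 'L2'), 17), (('D1', 'L3'), 17)]
```

A packet for D1 leaving on L2 or L3 would carry tag 17, while the same destination on L1 would carry tag 42.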