1. Field of the Invention
The current invention relates to data networks, and in particular, to the distribution of network-topological messages in a data network.
2. Description of the Related Art
A data network enables the transport of data packets from a source end-point to a destination end-point. A typical data network comprises multiple nodes, known as routers, that route the data packets from the source to the destination. Note that a network may be defined so as to exclude from the network nodes that are nevertheless connected to the network. Thus, external nodes may be connected to nodes within the network, where the external nodes are not part of the network. Additionally, a single node may belong to more than one network. Nodes typically comprise a processor, memory, and one or more communication ports.
Data packets include destination addresses in their headers, which allow routers to determine how to forward the data packets. A typical router maintains a routing table, also known as a routing information base (RIB), to store network-topology information to allow the router to forward data packets towards the packets' corresponding destinations. Routing tables are typically updated dynamically and automatically to reflect changes in network topology and performance.
Routers in a particular data network are compatible with the particular routing protocol of that data network. A typical routing protocol includes a methodology for routers to exchange network topology information. The typical routing protocol also includes an algorithm for a router to execute for calculating a best path for routing a given data packet to a destination, where the best path is based on the contemporary topology information.
There are two major classes of routing protocols: vector protocols (also known as distance-vector protocols) and link-state protocols. Examples of vector protocols include RIP (Routing Information Protocol), IGRP (Interior Gateway Routing Protocol), and EIGRP (Enhanced Interior Gateway Routing Protocol). Examples of link-state protocols include OSPF (Open Shortest Path First) and IS-IS (Intermediate System to Intermediate System). OSPF and IS-IS are currently maintained by working groups in the IETF (Internet Engineering Task Force). Link-state protocols are generally considered to be more robust and to allow faster convergence times than vector protocols, particularly in large networks. Therefore, link-state protocols are typically preferred in larger networks. OSPF is often preferred for enterprise networks, while IS-IS is often preferred for core networks, such as ISP (Internet Service Provider) backbone networks.
The IS-IS protocol can be used to support any OSI (Open Systems Interconnection) layer-3 protocol such as, e.g., IP (Internet Protocol) or CLNP (Connectionless Network Protocol). A description of the use of the IS-IS protocol with TCP/IP (Transmission Control Protocol/Internet Protocol) can be found in IETF RFC (Request for Comments) 1195, titled “Use of OSI IS-IS for Routing in TCP/IP and Dual Environments,” incorporated herein by reference in its entirety. Among the message types supported by the IS-IS protocol are LSP (link-state packet; also link-state PDU (protocol data unit)), CSNP (Complete Sequence Number PDU), and PSNP (Partial Sequence Number PDU).
An LSP contains information about the links at the LSP's originating node. A link is a (direct or virtual) connection to another node and may be identified by a communication port on the originating node and a peer communication port on the other node. Links can go up, come down, or be otherwise modified. LSPs are sent out by an originating node in order to provide information to the other nodes in a network about the condition and status of the links at the originating node. Information from received LSPs is maintained by a receiving node in the receiving node's link-state database, where each link-information entry of the link-state database corresponds to a received or generated LSP. Thus, received LSPs are used to update a receiving node's link-state database. The operating details of particular implementations of link-state databases are implementation-specific and may vary.
FIG. 1 shows the format of typical LSP 100, with field sizes in bytes appearing on the right side. LSP 100 includes PDU-type field 101, remaining-lifetime field 102, LSP-ID field 103, sequence-number field 104, checksum field 105, and TLV (tag-length-value) section 106. PDU-type field 101 identifies the PDU as an LSP. Remaining-lifetime field 102 specifies the length of time that the information in LSP 100 should be considered valid. LSP-ID field 103 identifies the originating node of the LSP. Sequence-number field 104 identifies the sequential number of the LSP from the LSP-originating node. LSP-originating nodes increment the sequence number for generated LSPs having new information to alert receiving nodes that corresponding link-state database information should be updated. Nodes may re-send LSPs with unchanged information in response to requests, as refreshers, or for other reasons. Thus, an LSP-originating node may generate multiple, substantially identical LSPs even if that node has no new link-state information to report. Checksum field 105 holds a checksum value used to determine if there are transmission errors in LSP 100. TLV section 106 is the payload of LSP 100 and may contain a variety of parameters, each identified by a parameter tag, a parameter length, and a parameter value.
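By way of illustration only, the LSP fields described above might be modeled as in the following Python sketch. The field widths (2-byte remaining lifetime, 8-byte LSP-ID, 4-byte sequence number, 2-byte checksum, and 1-byte tag and length within each TLV) are assumptions made for this example and are not taken from FIG. 1. The names `LspHeader` and `iter_tlvs` are hypothetical.

```python
import struct
from dataclasses import dataclass
from typing import Iterator, Tuple

# Assumed layout (not taken from FIG. 1): lifetime, LSP-ID, sequence
# number, checksum, packed in network byte order with no padding.
_HEADER = struct.Struct("!H8sIH")


@dataclass(frozen=True)
class LspHeader:
    remaining_lifetime: int  # how long the LSP contents stay valid
    lsp_id: bytes            # identifies the originating node
    sequence_number: int     # incremented when new information is sent
    checksum: int            # used to detect transmission errors

    def pack(self) -> bytes:
        return _HEADER.pack(self.remaining_lifetime, self.lsp_id,
                            self.sequence_number, self.checksum)

    @classmethod
    def unpack(cls, data: bytes) -> "LspHeader":
        return cls(*_HEADER.unpack(data[:_HEADER.size]))


def iter_tlvs(payload: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (tag, value) pairs from a TLV section.

    Assumes a 1-byte tag followed by a 1-byte length.
    """
    offset = 0
    while offset + 2 <= len(payload):
        tag, length = payload[offset], payload[offset + 1]
        value = payload[offset + 2:offset + 2 + length]
        if len(value) < length:
            raise ValueError("truncated TLV")
        yield tag, value
        offset += 2 + length
```

In this sketch, packing and then unpacking a header returns an equal `LspHeader`, and `iter_tlvs` walks the payload one parameter at a time.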
A CSNP contains a listing and summary of all the LSPs maintained in the link-state database of the CSNP-originating node. CSNPs are used to synchronize the link-state databases of neighboring network nodes. A summary entry for an LSP in a CSNP includes the remaining lifetime, the LSP-ID, the sequence number, and the checksum. Based on these parameters, a CSNP-receiving node can determine whether synchronization of information is necessary, in which case the nodes can synchronize by the transmission of the appropriate LSP(s). A complete summary of a link-state database may be divided and sent over multiple CSNPs if a single CSNP packet is not sufficiently large to accommodate the complete summary.
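The comparison performed by a CSNP-receiving node can be sketched as follows. This is a simplified illustration that compares sequence numbers only and ignores the remaining-lifetime and checksum parameters. The function name `compare_with_csnp` and the use of string LSP-IDs are assumptions made for the example.

```python
from typing import Dict, List, Tuple


def compare_with_csnp(local: Dict[str, int],
                      csnp: Dict[str, int]) -> Tuple[List[str], List[str]]:
    """Compare a local link-state database with a received CSNP summary.

    Both arguments map an LSP-ID to a sequence number. Returns
    (to_request, to_send): LSP-IDs for which the neighbor has newer
    information, and LSP-IDs for which the local node has newer
    information.
    """
    to_request = sorted(lsp_id for lsp_id, seq in csnp.items()
                        if local.get(lsp_id, -1) < seq)
    to_send = sorted(lsp_id for lsp_id, seq in local.items()
                     if csnp.get(lsp_id, -1) < seq)
    return to_request, to_send
```

If both returned lists are empty, the two databases agree on every listed LSP and no LSP transmission is needed under this simplified comparison.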
If, for example, a fully connected (i.e., where each node is connected to every other node by a corresponding link) IS-IS network has 300 nodes, then it will have 44,850 (=300*299/2) links. Thus, each node in the network will have at least 44,850 entries in its link-state database, and each synchronization will require sending CSNPs having at least 44,850 entries. If the CSNP is sent over Ethernet where each packet is limited to about 1500 octets of data, then each CSNP packet can contain about 90 entries, meaning that about 500 packets will be required for the transmission of just one CSNP. This amount of traffic for a synchronization can degrade network performance.
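The arithmetic of the preceding example can be reproduced with a short Python sketch. The figure of about 90 entries per packet is the estimate given above. The function name `csnp_packet_estimate` is hypothetical.

```python
import math
from typing import Tuple


def csnp_packet_estimate(nodes: int,
                         entries_per_packet: int = 90) -> Tuple[int, int]:
    """Return (links, packets) for a fully connected network.

    links   = n * (n - 1) / 2, one link per pair of nodes
    packets = number of packets needed to carry one summary entry per link
    """
    links = nodes * (nodes - 1) // 2
    packets = math.ceil(links / entries_per_packet)
    return links, packets


print(csnp_packet_estimate(300))  # (44850, 499)
```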
A PSNP contains a listing and summary of a subset of the LSPs in the link-state database of the PSNP-originating node. PSNPs are used to acknowledge receipt of one or more LSPs and to request one or more LSPs from a neighboring node.
IS-IS nodes distribute LSPs by flooding. When a node determines that the status of one or more of its links has changed, it generates a corresponding LSP and sends it to all the nodes to which it is linked on the network (i.e., the node's neighbors). When a node receives an LSP from a sending node, the receiving node compares the LSP's LSP-ID and sequence number to the LSP-ID and sequence number in the receiving-node's link-state database. If the LSP-ID is not stored in the link-state database, then the receiving node adds the information of the received LSP to the receiving node's link-state database. The receiving node then forwards the LSP to all its neighbors, except the sending node. If the LSP-ID is already in the receiving node's link-state database and the sequence number of the received LSP is the same as the stored sequence number for the corresponding LSP-ID, then the receiving node determines that the LSP contains no new information and ignores the LSP.
If the sequence number of the received LSP is higher than the stored sequence number for the corresponding LSP-ID, then the receiving node determines that the LSP contains new information and (i) updates its link-state database based on the received LSP and (ii) forwards the LSP to all the nodes to which it is linked, other than the node which sent the receiving node the LSP. If the sequence number of the received LSP is lower than the stored sequence number for the corresponding LSP-ID, then the receiving node determines that the sending node's link-state database needs updating, and the receiving node sends its stored LSP information for the corresponding LSP-ID (with the higher sequence number) from its link-state database to the sending node. This flooding process helps guarantee that new LSPs are distributed to all the nodes in a network so that all those nodes have up-to-date link-state information.
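The receive-side decision described above can be summarized in the following sketch. It models the link-state database as a mapping from LSP-ID to stored sequence number, which is a simplification, and the names `Action` and `on_lsp_received` are hypothetical.

```python
from enum import Enum
from typing import Dict


class Action(Enum):
    INSTALL_AND_FLOOD = "update database, forward to all neighbors but sender"
    IGNORE = "no new information"
    SEND_NEWER_TO_SENDER = "sender is out of date, send stored LSP back"


def on_lsp_received(lsdb: Dict[str, int], lsp_id: str, seq: int) -> Action:
    """Apply the flooding rules to one received LSP, updating lsdb in place."""
    stored = lsdb.get(lsp_id)
    if stored is None or seq > stored:
        # Unknown LSP-ID or higher sequence number: new information.
        lsdb[lsp_id] = seq
        return Action.INSTALL_AND_FLOOD
    if seq == stored:
        return Action.IGNORE
    # Lower sequence number: the sending node needs updating.
    return Action.SEND_NEWER_TO_SENDER
```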
FIG. 2 shows an illustration of exemplary LSP flooding in fully connected network 200. A fully connected network is a network whose nodes have links to all the other nodes in the network. Note that these links can be virtual (a.k.a. logical) connections and do not have to be direct (a.k.a. physical) connections. Similarly, a highly connected network is a network where most of the nodes have links to most of the other nodes in the network. Network 200 comprises interconnected nodes 201, 202, 203, 204, 205, and 206. In step 1 of FIG. 2(a), node 201 originates a new LSP and forwards it to all the nodes to which it is linked, i.e., nodes 202, 203, 204, 205, and 206. In step 2 of FIG. 2(b), each of nodes 202, 203, 204, 205, and 206 forwards the LSP to every node to which it is linked, other than the node from which it received the LSP. Thus, each of nodes 202, 203, 204, 205, and 206 forwards the LSP to four other nodes (e.g., node 206 forwards the LSP to nodes 202, 203, 204, and 205). In effect, each of nodes 202, 203, 204, 205, and 206 receives and processes the same LSP five times. Note that, when a node (e.g., 202, 203, 204, 205, and 206) receives subsequent copies of the same LSP, the node will not forward the LSP again.
Flooding in a fully or highly connected network can become a growing concern as the number of nodes increases. For example, if a node in a fully connected network of 300 nodes originates a new LSP, then every other node in that network will receive and process 299 copies of that LSP: one from the originating node, and one from each of the 298 other nodes in the network. Processing that many LSPs can noticeably degrade the performance of a node. Even more problematic is the situation where one of the 300 nodes fails. When a node fails, its neighbors detect that their respective connecting links to the failed node are not operating. Upon the detection of the respective link failure, each of the failed node's 299 neighbors originates an LSP to forward to the 298 other nodes indicating that the respective link to the failed node has failed. Each LSP will be flooded through the network as per the algorithm outlined above. Thus, when the one node fails, each of the other nodes will receive close to 90,000 LSPs (approximately 298*298). Trying to process that many LSPs in a short period can put a serious, or even debilitating, strain on a node's processor.
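The copy counts given above can be checked with a small simulation. The following sketch assumes an idealized fully connected network in which every node receives its first copy directly from the originating node. It models only the forwarding rule described above, and the function name `flood_copies` is hypothetical.

```python
from collections import deque
from typing import Dict


def flood_copies(n: int, origin: int = 0) -> Dict[int, int]:
    """Count the LSP copies each node receives in a fully connected network.

    A node forwards the LSP to every neighbor except the sender the first
    time it receives the LSP; later copies are processed but not forwarded.
    """
    received = {node: 0 for node in range(n)}
    seen = {origin}
    queue = deque((origin, peer) for peer in range(n) if peer != origin)
    while queue:
        sender, receiver = queue.popleft()
        received[receiver] += 1
        if receiver in seen:
            continue  # duplicate copy: not forwarded again
        seen.add(receiver)
        queue.extend((receiver, peer) for peer in range(n)
                     if peer not in (receiver, sender))
    return received


print(flood_copies(6)[1])          # 5 copies per node, as in FIG. 2
print(flood_copies(300)[1])        # 299 copies per node
print(flood_copies(299)[1] * 298)  # 88804 LSPs per node after one node fails
```

The last line models the failure case as 299 surviving nodes, each of which receives 298 copies of the LSP originated by each of the 298 other surviving nodes.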
As noted above, a fully connected network can be formed even where each individual node does not have direct (i.e., intermediary-free, physical-layer) connections to all the other nodes. In other words, nodes in a fully connected network can be linked through virtual connections. Two nodes in a network are virtually connected at a logical layer when the two nodes are physically connected via one or more intermediary nodes, where the logical layer is unaware of the physical connections involving the one or more intermediary nodes. For example, MPLS (Multi-Protocol Label Switching) is a protocol-independent packet-forwarding OSI layer-2 technology (sometimes considered a layer-2.5 technology) that allows for the rapid and direct-seeming transmission of layer-3 (e.g., IP) packets between MPLS nodes. This is accomplished partly by pushing labels onto layer-3 (e.g., IP) packets and using the labels to quickly route the resultant MPLS packets. In an optical network, particular wavelengths can be used as labels for protocol-independent packet forwarding.
FIG. 3 shows one possible physical implementation of fully connected network 200 of FIG. 2. Path 201a physically connects nodes 201 and 203. Paths 203a, 205a, 206a, 204a, and 202a physically connect (i) nodes 203 and 205, (ii) nodes 205 and 206, (iii) nodes 206 and 204, (iv) nodes 204 and 202, and (v) nodes 202 and 201, respectively. Using a protocol-independent packet-forwarding technology, such as MPLS, virtual or logical connections can be established among the nodes of network 200 which would appear as links to layer-3 network systems. Thus, to IP network 200, the six nodes appear fully connected. For example, node 201 would be able to transmit an LSP to node 206 where the LSP would be physically transmitted via nodes 203 and 205, but without any processing, or even awareness, by layer-3 network systems on nodes 203 and 205.
As noted above, flooding messages in a highly connected network can put a deleterious strain on system performance. One proposal to mitigate the problem is the establishment of mesh groups as presented in RFC 2973, titled “IS-IS Mesh Groups,” incorporated herein by reference in its entirety. A mesh group is a group of connections among nodes, where the connections are administratively configured to belong to a particular group. A mesh group can be used to avoid flooding LSP packets by forwarding LSPs only on a subset of ports, instead of substantially all of a node's ports. Limiting the number of LSP packets sent out by a node reduces the detrimental effects of flooding. It should be noted that mesh-group limitations apply to the distribution of LSP packets. Link-state-limited links remain fully active and available for the transmission of bearer or other types of packets.
The mesh groups described in RFC 2973 can be set up by setting the links in the network to one of three settings: meshBlocked, meshInactive, or meshSet. By default, links are in the meshInactive state, where the ports defining the links behave as though mesh groups have not been set up. When a node receives an LSP from a meshInactive link, the node forwards the LSP via all other links which are not in a meshBlocked state. A node will forward any received LSPs via all other meshInactive links. Original LSPs will be transmitted via all meshInactive links. No LSPs are forwarded via meshBlocked links. No LSPs should come in from a meshBlocked link since the corresponding node should not forward LSPs via the meshBlocked link. Links in the meshSet state have an associated parameter, meshGroup, which identifies a corresponding mesh group. If a node receives an LSP from a meshSet link, then the node will forward the LSP via all the meshInactive links and on meshSet links that have a meshGroup parameter different from the meshGroup parameter of the ingress link. For example, if a node receives an LSP from a meshSet ingress link whose meshGroup is 1, then it will not forward the LSP via any meshSet links whose meshGroup is 1. MeshSet meshGroups are complicated and not often used.
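The forwarding rules for the three link settings can be sketched as follows. This is an illustrative model only. The `Link` class and `egress_links` function are hypothetical names, and the treatment of locally originated LSPs (sent on every link that is not meshBlocked) is an assumption chosen to match the behavior described for node 401 of FIG. 4.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Link:
    state: str                        # "meshInactive", "meshBlocked", or "meshSet"
    mesh_group: Optional[int] = None  # only meaningful for meshSet links


def egress_links(links: Dict[str, Link], ingress: Optional[str]) -> List[str]:
    """Return the names of the links on which an LSP is forwarded.

    ingress is the name of the link the LSP arrived on, or None for an
    LSP originated by the local node.
    """
    ingress_link = links[ingress] if ingress is not None else None
    out = []
    for name, link in links.items():
        if name == ingress or link.state == "meshBlocked":
            continue  # never send back on the ingress link or on a blocked link
        if (link.state == "meshSet"
                and ingress_link is not None
                and ingress_link.state == "meshSet"
                and link.mesh_group == ingress_link.mesh_group):
            continue  # same mesh group as the ingress link: suppress
        out.append(name)
    return out
```

For example, an LSP arriving on a meshSet link whose meshGroup is 1 is forwarded on meshInactive links and on meshSet links of other groups, but not on other group-1 links.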
FIG. 4 shows an illustration of an exemplary operation of a mesh group in network 400 in accordance with RFC 2973. Network 400 comprises interconnected nodes 401, 402, 403, 404, 405, and 406. The FIG. 4 links in bold belong to meshGroup 1, while the dashed links belong to meshGroup 2. No links in network 400 are in the meshBlocked or meshInactive state. If node 401 generates an LSP, then, as illustrated in step 1 of FIG. 4(a), node 401 forwards the LSP on all of its links. The other nodes then forward the LSP received from node 401 via links that belong to meshGroups that both (1) are different from the meshGroup of the ingress link and (2) did not already transmit that LSP. Node 404, for example, received the LSP from node 401 via a group-2 link, and forwards the LSP via its group-1 links to nodes 402 and 406. After step 2 of FIG. 4(b), the flooding of network 400 is complete because every node has sent or received the LSP via every meshGroup through which it is linked. As can be seen, because of the meshGroups in network 400, fewer LSPs are transmitted and processed in network 400 than in network 200 of FIG. 2.
RFC 2973 also describes another use of meshGroup parameters sometimes called “poor man's mesh groups,” herein referred to as flow-through mesh groups (FTMGs). In a flow-through mesh group, the meshSet state is not used. Instead, certain links are set to meshBlocked to prune the flooding topology. This creates a group of links through which LSPs flow.
FIG. 5 shows an illustration of an exemplary operation of a flow-through mesh group in network 500. Network 500 comprises six interlinked nodes 501-506. The links in bold are set to meshInactive and belong to flow-through mesh group 507. The dashed links are set to meshBlocked. If node 501 generates an LSP, then, as shown in step 1 of FIG. 5(a), node 501 forwards the LSP via its meshInactive ports to nodes 502 and 503. Then, as shown in step 2 of FIG. 5(b), nodes 502 and 503 forward the LSP via their meshInactive ports, other than the ingress ports, to nodes 504 and 505, respectively. Nodes 504 and 505 similarly then each forward the LSP via their meshInactive ports, other than the ingress ports, to node 506, as shown in step 3 of FIG. 5(c). After step 3, the flooding of network 500 is complete. As can be seen, because of flow-through mesh group 507 in network 500, fewer LSPs are transmitted and processed in network 500 than in network 400 of FIG. 4. However, the LSP-update system of network 500 is less robust than that of both network 200 of FIG. 2 and network 400 of FIG. 4, because, if any two links of flow-through mesh group 507 fail, then at least one node will no longer get LSPs from the other nodes.
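The robustness limitation noted above can be illustrated with a short reachability check. The link list below is inferred from the forwarding steps described for FIG. 5 (a ring through nodes 501, 502, 504, 506, 505, and 503) and is an assumption of this sketch, as are the names `FTMG_LINKS`, `nodes_reached`, and `survives`.

```python
from itertools import combinations
from typing import List, Set, Tuple

LinkPair = Tuple[int, int]

# meshInactive links of flow-through mesh group 507, as inferred from
# the step-by-step description of FIG. 5 (an assumption of this sketch).
FTMG_LINKS: List[LinkPair] = [
    (501, 502), (501, 503), (502, 504),
    (503, 505), (504, 506), (505, 506),
]


def nodes_reached(origin: int, links: List[LinkPair]) -> Set[int]:
    """Return the nodes that an LSP from origin reaches over the given links."""
    reached = {origin}
    frontier = [origin]
    while frontier:
        node = frontier.pop()
        for a, b in links:
            if node in (a, b):
                peer = b if node == a else a
                if peer not in reached:
                    reached.add(peer)
                    frontier.append(peer)
    return reached


def survives(failed: Tuple[LinkPair, ...]) -> bool:
    """True if an LSP from node 501 still reaches all six nodes."""
    working = [link for link in FTMG_LINKS if link not in failed]
    return len(nodes_reached(501, working)) == 6


print(all(survives((link,)) for link in FTMG_LINKS))             # True
print(any(survives(pair) for pair in combinations(FTMG_LINKS, 2)))  # False
```

Under this model, any single link failure is tolerated, while every pair of link failures leaves at least one node unreachable from node 501.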
The mesh groups proposed by RFC 2973 need to be manually designed and implemented by the network administrator. Subsequent maintenance of the mesh groups is also performed manually by the network administrator. The prior-art systems are prone to set-up errors and to slow reactions to network problems and/or evolving network requirements.