The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
In computer networks such as the Internet, packets of data are sent from a source to a destination via a network of components including links (communication paths such as telephone or optical lines) and routers, each router directing the packet along one or more of a plurality of links connected to it according to one of various routing protocols.
For example, referring to FIG. 1, which represents an illustrative network designated generally 100, a source network component such as a host 102 sends a data packet to a destination network component, for example one of hosts 104, 106, 108, 110 or 112. The packet is routed by a router 114 via a network such as the Internet 116 and a further router 118 or 120, as appropriate, serving the destination component. The source and destination hosts can be any appropriate components, for example voice components such as a voice over IP (VoIP) phone, a PC and so forth.
One routing protocol commonly used for routing data within data communication networks is the link state protocol. The link state protocol relies on a routing algorithm resident at each node. Each node on the network advertises, throughout the network, links to neighboring nodes and provides a cost associated with each link, which can be based on any appropriate metric such as link bandwidth or delay and is typically expressed as an integer value. A link may have an asymmetric cost, that is, the cost in the direction AB along a link may be different from the cost in a direction BA. Based on the advertised information in the form of a link state packet (LSP) each node constructs a link state database (LSDB), which is a map of the entire network topology, and from that constructs generally a single optimum route to each available node based on an appropriate algorithm such as, for example, a shortest path first (SPF) algorithm. As a result a “spanning tree” is constructed, rooted at the node and showing an optimum path including intermediate nodes to each available destination node. The results of the SPF are stored in a routing information base (RIB) and based on these results the forwarding information base (FIB) or forwarding table is updated to control forwarding of packets appropriately. When there is a network change an LSP representing the change is flooded through the network by each node adjacent the change, each node receiving the LSP sending it to each adjacent node.
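As a minimal sketch of the SPF computation described above, the following Python applies Dijkstra's algorithm to a small link state database with directed, and therefore possibly asymmetric, integer link costs. The four node names, the costs and the function name `spf` are illustrative assumptions and are not taken from the figures.

```python
import heapq

def spf(lsdb, root):
    """Return {node: (cost, first_hop)} for every node reachable from root."""
    best = {root: (0, None)}
    heap = [(0, root, None)]
    while heap:
        cost, node, first_hop = heapq.heappop(heap)
        if cost > best[node][0]:
            continue  # stale queue entry, a cheaper path was already found
        for neighbor, link_cost in lsdb.get(node, {}).items():
            new_cost = cost + link_cost
            # When leaving the root, the first hop is the neighbor itself.
            hop = neighbor if node == root else first_hop
            if neighbor not in best or new_cost < best[neighbor][0]:
                best[neighbor] = (new_cost, hop)
                heapq.heappush(heap, (new_cost, neighbor, hop))
    return best

# Hypothetical LSDB: lsdb[a][b] is the advertised cost of the link a -> b.
# Note the asymmetric cost between A and B (1 in one direction, 3 in the other).
lsdb = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 3, "C": 1, "D": 4},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"B": 4, "C": 1},
}
routes = spf(lsdb, "A")
```

In this sketch the `first_hop` recorded for each destination plays the role of the next hop that would be written to the forwarding table after the SPF run.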
The LSPs are sent, for example, by routers such as components 114, 118, 120 and advertise the network components served by the router. For example router 118 may advertise that it serves components 104, 106, 108 and so forth.
Various alternative routing systems include distance vector routing and multi-protocol label switching (MPLS), whereby the path taken by a packet is partially or fully predetermined. However in all cases, when a data packet for a destination arrives at a forwarding component such as a router, the forwarding component must identify the next component (“next hop”) along the route for forwarding the packet to the destination. The next hop is obtained from the FIB as described above. The next node repeats this step and so forth.
The destination for a packet is expressed as an internet protocol (IP) address and hence part of the forwarding operation is to obtain the forwarding information, such as next hop details, for the route to the destination address.
The manner in which the forwarding information is obtained has developed as a result of the structure of IP addresses. Under the current addressing scheme set up under Internet protocol version 4 (IPv4), IP addresses have 32 bits divided into four octets and commonly represented as xxx.xxx.xxx.xxx where xxx comprises the decimal value corresponding to the binary value of the octet and ranging from zero to 255 for each octet.
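The relationship between the dotted decimal form and the underlying 32-bit value can be illustrated with a short Python sketch. The function names `ipv4_to_int` and `int_to_ipv4` are hypothetical helpers introduced here for illustration only.

```python
def ipv4_to_int(dotted):
    """Pack four decimal octets into one 32-bit integer."""
    octets = [int(part) for part in dotted.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a valid IPv4 address: {dotted!r}")
    value = 0
    for octet in octets:
        value = (value << 8) | octet  # most significant octet first
    return value

def int_to_ipv4(value):
    """Unpack a 32-bit integer into dotted decimal form."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

address = ipv4_to_int("10.1.1.3")
bits = format(address, "032b")  # 32-character binary string
```

The first eight characters of `bits` correspond to the first octet, which is the part of the address that an 8-bit prefix would be compared against.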
Various address assignment schemes have been adopted and commonly a group of addresses is assigned to a group of components sharing a common routing policy (for example an autonomous system (AS)) such that all components within the AS share a common “prefix”, that is, the first n most significant bits of the IP address, but are distinguished by the remaining (32−n) bits which are distinct for each component. Accordingly, where a router serves as the ingress point to an AS, then external components can simply forward packets for any component in the AS to the router serving it, which will then forward the packets appropriately within the AS for the destination component. In those circumstances the serving router, which advertises its availability, is commonly termed a prefix and is represented by a prefix/mask of the form xxx.xxx.xxx.xxx/n where n is the mask length representing the number of bits in the prefix. For example, an address 10.0.0.0/8, or 10/8 for short, means that the prefix is formed of the first 8 bits having the value 00001010. As a result components served by the prefix can have addresses in the range 10.1.1.1 to 10.255.255.255.
Reverting for example to FIG. 1, router 118 has address 10.0.0.0 and serves an AS 122. Component 104 in network 122 has address 10.1.1.1, component 106 has address 10.1.1.2 and component 108 has address 10.1.1.3. Similarly router 120 has address 11.0.0.0 and serves an AS 124, in which component 110 has address 11.1.1.1 and component 112 has address 11.1.1.2.
When a packet arrives at a router, for example in the network 116, for destination xxx.xxx.xxx.xxx, the router needs to identify which prefix to send the packet to (i.e. the prefix serving that address) and then obtain the forwarding information, including the next hop in the route to that prefix, however that next hop may be determined.
However in practice it is not necessarily sufficient simply to identify the prefix within the destination address and send it to the corresponding router. Firstly this is because it is necessary also to identify the appropriate prefix mask, that is, which part of the address in fact comprises the prefix. Secondly, as a result of the manner in which some networks are constructed, components within an AS nominally attached to a first router may nonetheless be served by another router. For example referring to FIG. 1, component 108 having address 10.1.1.3 in network 122 generally attached to router 118 is in fact advertised by router 120 such that it would not be appropriate to simply direct packets destined for 10.1.1.3 to the next hop of the route to 10/8.
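By way of illustration only, the prefix/mask comparison can be sketched in Python using the standard `ipaddress` module. The helper name `covered_by` is an assumption introduced here; it compares the first n bits of an address against the prefix, which is the same test that the module's own `in` operator performs.

```python
import ipaddress

prefix = ipaddress.ip_network("10.0.0.0/8")  # the 10/8 prefix from the text

def covered_by(address, network):
    """True if the first network.prefixlen bits of address equal the prefix."""
    addr = int(ipaddress.ip_address(address))
    shift = 32 - network.prefixlen  # number of host bits to discard
    return (addr >> shift) == (int(network.network_address) >> shift)

in_as_122 = covered_by("10.1.1.1", prefix)
in_as_124 = covered_by("11.1.1.1", prefix)
```

Here `in_as_122` is true and `in_as_124` is false, because only the first address shares the leading eight bits 00001010 with the prefix.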
Accordingly various forwarding database structures and steps are conventionally implemented to ensure that the correct forwarding information is derived for each destination address. In particular longest prefix matching is implemented, as can be understood with reference to FIG. 2, which is a schematic diagram of a prefix tree constructed at some router in the network 116 in FIG. 1. The prefix tree illustrates at a conceptual level the manner in which longest prefix matching is put into practice, and comprises a forwarding database containing prefix records indexed by network addresses and masks and containing forwarding information allowing data to be forwarded to its destination. As can be seen, the prefix tree is rooted at a default entry 0/0, reference numeral 200. The tree then branches to (or “covers”) a subset of nodes or entries, each representing a prefix that is present in the network, with a branch for each prefix/mask 1/8, 10/8, 11/8 and 255/8, denoted here by reference numerals 202, 204, 206, 208. Each entry then branches into prefixes that it covers. For example 1/8 (not shown in the diagram) might cover any of 1.1/16 through to 1.255/16, reference numerals 210, 212. Each downstream entry therefore has a direct cover, namely the prefix from which it branches, and further covering nodes upstream of the direct cover as appropriate.
Reverting to the topology shown in FIG. 1 it will be seen that the entries for 10/8 and 11/8 cover prefixes 10.1.1.1/32, 10.1.1.2/32, 10.1.1.3/32 and 11.1.1.1/32 and 11.1.1.2/32 respectively, reference numerals 214, 216, 218, 220, 222 (intervening branches are not shown for the purposes of clarity). For each of these prefixes or “leaves” appropriate forwarding information is shown for example in the form of a pointer to an adjacency table entry.
For example 10.1.1.1/32 points to an adjacency table element 224 which indicates the next hop, for example by providing a MAC (media access control) address and router interface. In particular the forwarding information includes the next hop along the route to router 118. Similarly the forwarding information for 10.1.1.2/32, 226, also includes the next hop for router 118, and the forwarding information for 11.1.1.1/32 and 11.1.1.2/32 stored at database elements 228, 230 is the next hop for router 120.
However in the case of 10.1.1.3/32, the forwarding information stored at database element 232 comprises the next hop along the route to router 120, which has advertised its adjacency to 10.1.1.3 as shown in FIG. 1 and discussed above. It can be seen, therefore, that by longest prefix matching of the destination address, the appropriate forwarding information can be derived.
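The longest prefix matching described above can be sketched conceptually in Python as a search over a flat table. This is a minimal illustration rather than a practical forwarding structure. The table name `fib`, the record strings, and the forwarding information shown for the 0/0, 10/8 and 11/8 entries are assumptions made for the example; the /32 entries follow the description of FIG. 2.

```python
import ipaddress

# Prefix records keyed by prefix/mask. The record strings are placeholders
# for a pointer to an adjacency table entry.
fib = {
    ipaddress.ip_network("0.0.0.0/0"): "default",                      # assumed
    ipaddress.ip_network("10.0.0.0/8"): "next hop for router 118",     # assumed
    ipaddress.ip_network("11.0.0.0/8"): "next hop for router 120",     # assumed
    ipaddress.ip_network("10.1.1.1/32"): "next hop for router 118",
    ipaddress.ip_network("10.1.1.2/32"): "next hop for router 118",
    ipaddress.ip_network("10.1.1.3/32"): "next hop for router 120",
    ipaddress.ip_network("11.1.1.1/32"): "next hop for router 120",
    ipaddress.ip_network("11.1.1.2/32"): "next hop for router 120",
}

def longest_prefix_match(table, destination):
    """Return (prefix, record) for the longest prefix covering destination."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return best, table[best]

match_for_108 = longest_prefix_match(fib, "10.1.1.3")
match_for_other = longest_prefix_match(fib, "10.9.9.9")
```

In this sketch 10.1.1.3 is covered by 0/0, 10/8 and 10.1.1.3/32, and the /32 entry is selected, so the packet is directed toward router 120 rather than router 118. An address such as 10.9.9.9 that has no more specific entry resolves to the covering 10/8 entry.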
One known way of performing longest prefix matching is to implement an mtrie, for example of the type described in “Cisco Express Forwarding Overview” which is available at the time of writing on the file cef-ov-final.pdf in the directory warp/public/732/Tech/switching/docs of the domain cisco.com on the World Wide Web. The mtrie includes a plurality of mtrie nodes (or “mnodes”) including a root node and a plurality of child nodes. Each node has a plurality of elements, sometimes termed “buckets”, each containing a pointer either to a record such as a forwarding instruction for a prefix represented by that element or to a child node comprising a further set of elements. Where an element comprises a pointer to a record it is termed a leaf. As a result a set of (prefix, record) pairs is provided supporting a lookup of an address such that, if the address has a match in the topology, the record for the longest matching prefix in the topology is returned.
Referring to FIG. 3, which is a diagram of an mtrie structure, the FIB can be structured as a 256-way mtrie structure comprising an “mnode” 300 having up to 256 children representing the 256 possible octets depending from the node. For example mtrie 1, reference numeral 302, in mnode 300 points to a child node 304 made up of 256 entries, each of which points to a further 256-way child mnode 306, each of which points to yet another 256-entry child mnode 308. Each mnode entry either points to another mnode or to a leaf corresponding to a forwarding information base entry such as an adjacency table element providing the relevant forwarding information. It will be seen that as a result IP addresses comprising four octets can be subjected to a longest prefix match by walking down each mnode in turn and following the pointer from the corresponding mtrie. The four octet representation is termed an “8-8-8-8 stride pattern”. For example, reverting to the topology shown in FIG. 1, in order to perform a longest prefix match on either of 11.1.1.1 or 11.1.1.2, the corresponding entry in mnode 300 simply points to an adjacency table entry 310 including the relevant forwarding information, namely the next hop on the route to router 120. The entries for 10.1.1.1, 10.1.1.2 and 10.1.1.3 require a walk through the tenth child, reference numeral 310, of the root mnode 300, the first child of succeeding mnode 312, the first child of succeeding mnode 314 and, respectively, the first and second child of mnode 316, which in turn point commonly to an adjacency table element 318 containing, as forwarding information, the next hop for the route to router 118. However for destination 10.1.1.3 the third child of mnode 316 points to adjacency table entry 320 representing as forwarding information the next hop for router 120.
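A 256-way mtrie with an 8-8-8-8 stride pattern might be sketched in Python as follows. This is an illustrative sketch under stated assumptions and not the implementation of the cited document: the names `MNode`, `insert` and `lookup`, the use of (length, record) tuples as leaves, the inclusion of a 10/8 entry, and the technique of copying a shorter prefix into every element it covers are all choices made here for the example.

```python
import ipaddress

class MNode:
    """One 256-way node; each element is None, a leaf tuple or a child MNode."""
    def __init__(self, fill=None):
        self.elements = [fill] * 256

def _fill(node, index, leaf):
    elem = node.elements[index]
    if isinstance(elem, MNode):
        for i in range(256):          # push the shorter prefix into the child
            _fill(elem, i, leaf)
    elif elem is None or elem[0] <= leaf[0]:
        node.elements[index] = leaf   # never overwrite a longer prefix

def insert(root, cidr, record):
    net = ipaddress.ip_network(cidr)
    prefix, length = int(net.network_address), net.prefixlen
    node, depth = root, 0
    while length > 8 * (depth + 1):   # descend one octet per stride
        octet = (prefix >> (24 - 8 * depth)) & 0xFF
        child = node.elements[octet]
        if not isinstance(child, MNode):
            child = MNode(fill=child)  # new child inherits the covering leaf
            node.elements[octet] = child
        node, depth = child, depth + 1
    octet = (prefix >> (24 - 8 * depth)) & 0xFF
    span = 1 << (8 * (depth + 1) - length)  # elements covered at this level
    start = octet & ~(span - 1)
    for index in range(start, start + span):
        _fill(node, index, (length, record))

def lookup(root, address):
    addr = int(ipaddress.ip_address(address))
    node = root
    for shift in (24, 16, 8, 0):      # at most four element reads
        elem = node.elements[(addr >> shift) & 0xFF]
        if isinstance(elem, MNode):
            node = elem
            continue
        return elem[1] if elem else None
    return None

root = MNode()
insert(root, "11.0.0.0/8", "next hop for router 120")
insert(root, "10.1.1.1/32", "next hop for router 118")
insert(root, "10.1.1.2/32", "next hop for router 118")
insert(root, "10.1.1.3/32", "next hop for router 120")
insert(root, "10.0.0.0/8", "next hop for router 118")   # assumed entry
```

With this table a lookup of 11.1.1.1 terminates at the root node after a single element read, whereas a lookup of 10.1.1.3 walks four nodes before reaching its leaf, mirroring the walk described for FIG. 3.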
Typically each adjacency table entry comprises a record containing various forwarding instructions in addition to identification of the next hop, and common records can be considered as a forwarding equivalence class (FEC). The FEC can be represented as an output chain comprising successive output chain elements (oce) each representing different functions for example derivable from a common function table.
This is shown in FIG. 4 which is a schematic diagram illustrating a possible output chain configuration. The pointer from the final mnode in fact points to the start of the appropriate output chain. In particular two output chains designated 400 and 402 are shown. Output chain 400 comprises as a first oce 404, a choice oce for example dependent on whether the packet is IPv4 or IPv6 (IP version 6). Then, in the IPv4 limb a further oce 406 may be, for example, a load balancing instruction followed by a forwarding oce 408 selected by the load balancing oce for the appropriate next hop. On the IPv6 limb, however, the subsequent oce 410 may be, for example, a forwarding oce to the next hop. Any complexity of output chain is permissible and an alternative possibility is shown at 402 comprising simply a forwarding oce 412 to the next hop.
It can be seen that the oce's can be grouped in FIG. 4 as FECs FEC1, FEC2 allowing a simplified structure whereby any node sharing a common forwarding instruction structure can effectively identify it by listing the appropriate FEC. For example the entries against 11.1.1.1 and 11.1.1.2 may be identical such that the same FEC may be called for either.
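The sharing of output chains between prefixes might be sketched as follows. The class names `Forward`, `LoadBalance` and `Choice`, the `flow_id` selection policy and the next hop labels are all hypothetical and serve only to illustrate the structure of FIG. 4; they are not drawn from any particular implementation.

```python
class Forward:
    """Forwarding oce: ends the chain by naming a next hop."""
    def __init__(self, next_hop):
        self.next_hop = next_hop

    def run(self, packet):
        return self.next_hop

class LoadBalance:
    """Load balancing oce: selects one of several downstream oces."""
    def __init__(self, *choices):
        self.choices = choices

    def run(self, packet):
        # Assumed policy: select by an integer flow identifier.
        return self.choices[packet["flow_id"] % len(self.choices)].run(packet)

class Choice:
    """Choice oce: branches on the IP version of the packet."""
    def __init__(self, by_version):
        self.by_version = by_version

    def run(self, packet):
        return self.by_version[packet["version"]].run(packet)

# Two output chains in the manner of chains 400 and 402.
fec1 = Choice({
    4: LoadBalance(Forward("next hop A"), Forward("next hop B")),
    6: Forward("next hop C"),
})
fec2 = Forward("next hop D")

# Prefixes with identical forwarding instructions refer to the same FEC.
leaves = {"11.1.1.1/32": fec2, "11.1.1.2/32": fec2}
```

Because both leaves hold a reference to the same `fec2` object, the chain is stored once and a change to it is seen by every prefix that lists it.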
A current proposition in network design is the provision of multi topology routing (MTR). Multi-topology routing is described in “M-ISIS: Multi-topology routing in IS-IS” by T. Przygienda et al., which is available at the time of writing on the file “isis-wg-multi-topology-00.txt” in the directory “proceedings/01mar/I-D” of the domain “IETF.org” of the world wide web. In multi-topology routing one or more additional topologies are overlaid on a base or default topology and different classes of data are assigned to different topologies and classified accordingly during the forwarding operation. For example the base topology will be the entire network and an additional topology will be a subset of the base topology. It will be appreciated that the physical components of the network are common to both topologies but that for various reasons it may be desirable to assign certain classes of traffic to only a certain subset of the entire network, as a result of which the multi-topology concept provides a useful approach to providing this functionality.
One example of the use of multiple topologies is where one class of data requires low latency links, for example VoIP data. As a result such data may be sent preferably via physical landlines rather than, for example, high latency links such as satellite links. As a result an additional topology is defined as all low latency links on the network and VoIP data packets are assigned to the additional topology. Another example is security-critical traffic which may be assigned to an additional topology of non-radiative links. Further possible examples are file transfer protocol (FTP) or SMTP (simple mail transfer protocol) traffic which can be assigned to an additional topology comprising high latency links, Internet Protocol version 4 (IPv4) versus Internet Protocol version 6 (IPv6) traffic which may be assigned to different topologies or data to be distinguished by the quality of service (QoS) assigned to it.
Multi-topology routing can be performed in a strict or a preferred (or incremental) mode. In the strict mode a data packet must travel only over the assigned additional topology and otherwise be discarded, for example in the case of security critical traffic. In the incremental mode data packets are preferably sent over the assigned topology but may also pass through the default topology where there is no path using only the assigned topology; thus, the assigned topology is considered preferred, but not strictly required.
Each MTR topology is often referred to by its color for differentiation and an MTR router will always have a base or uncolored topology in addition to zero, one or more configured colored topologies. Inbound packets to an MTR router will have a topology color assigned by a classification engine. As a result, in strict mode the forwarding decision is made in the forwarding database of the packet color's topology and the packet is dropped if no route is available in the colored topology, whereas in incremental mode the forwarding decision may “fall back” to the base topology if it cannot be made in the forwarding database of the packet color's topology.
FIG. 5 depicts an illustrative network diagram of a multi-topology routing domain based on the topology shown in FIG. 1. Those elements shown in dotted lines form part of the base topology. Elements shown in solid lines also form part of a colored topology, termed here purely for the purposes of convenience and without limitation, a “red” topology. Data to be carried by the relevant topologies is identified by the same color and it will be appreciated that any appropriate nomenclature can be adopted.
It can be seen that router 120 and components 10.1.1.1 and 11.1.1.1 are in the red topology and base topology whereas router 118 and components 10.1.1.2, 10.1.1.3 and 11.1.1.2 are in the base topology only. For example components 10.1.1.1 and 11.1.1.1 may be VoIP phones requiring a higher priority for data traffic in view of its real time requirements. It will further be seen that the component 10.1.1.1 is reachable in the red topology from the router 120.
A complex forwarding database structure is required to support multi topology routing. In particular multiple prefix trees are required, one for each topology. Each topology contains a set of prefixes and each prefix will have a record comprising a respective FEC. Depending on the network topology and routing configuration, prefixes may be present on one or more topologies and the list of FECs may not be the same for each topology. With this structure, MTR strict forwarding is a simple single longest match lookup in the packet color's topology prefix tree. However MTR incremental forwarding may require a second longest match lookup in the base topology prefix tree. As a result there is a significant memory and hardware forwarding overhead.
Referring for example to FIGS. 6a and 6b, which are schematic representations of prefix trees for the base and red topologies respectively for the network of FIG. 5, the additional storage and lookup requirements can be understood in more detail. The relevant components of the prefix tree in FIG. 2 are shown and numbered similarly in FIGS. 6a and 6b. In the base topology the forwarding information for the various addresses is the same as that shown in FIG. 2, for example 10.1.1.1/32 and 10.1.1.2/32 share as an FEC a forwarding instruction “next hop for router 118” whereas 11.1.1.1/32 and 11.1.1.2/32 share as an FEC a forwarding instruction “next hop for router 120”. However in the red topology 10.1.1.1 and 11.1.1.1 have a forwarding instruction “next hop for router 120” and hence may share the same FEC. However 10.1.1.2 and 11.1.1.2 are not configured in the red topology. As a result, when a red packet is classified and the red prefix tree shown in FIG. 6b is invoked, no forwarding instruction will be available for packets destined for 10.1.1.2 and 11.1.1.2. As a result, in strict forwarding the packet will be flushed. However in incremental mode it will be necessary to perform a second longest prefix match lookup in the base topology prefix tree to allow the packet to proceed to its destination in the base topology. Performing two successive lookups in this way requires more resources than performing one and thus can make routers supporting MTR slower or more expensive.
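The difference in lookup cost between the strict and incremental modes can be illustrated with a minimal Python sketch using one flat table per topology. The table contents follow the description of FIGS. 6a and 6b; the function names `lpm` and `forward`, the mode strings and the lookup counter are assumptions introduced for the example.

```python
import ipaddress

def build(entries):
    return {ipaddress.ip_network(p): record for p, record in entries.items()}

def lpm(table, destination):
    """Longest prefix match over a flat table; None when nothing matches."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in table if addr in net]
    return table[max(matches, key=lambda n: n.prefixlen)] if matches else None

base = build({
    "10.1.1.1/32": "next hop for router 118",
    "10.1.1.2/32": "next hop for router 118",
    "10.1.1.3/32": "next hop for router 120",
    "11.1.1.1/32": "next hop for router 120",
    "11.1.1.2/32": "next hop for router 120",
})
colored = {
    "red": build({
        "10.1.1.1/32": "next hop for router 120",
        "11.1.1.1/32": "next hop for router 120",
    }),
}

def forward(destination, color, mode):
    """Return (record or None, number of lookups); None means drop."""
    lookups = 0
    if color is not None:
        lookups += 1
        record = lpm(colored[color], destination)
        if record is not None or mode == "strict":
            return record, lookups
    lookups += 1                      # fall back to the base topology
    return lpm(base, destination), lookups

strict_result = forward("10.1.1.2", "red", "strict")
incremental_result = forward("10.1.1.2", "red", "incremental")
```

For a red packet destined for 10.1.1.2, the strict call returns no record after one lookup, whereas the incremental call returns the base topology record only after a second lookup, which is the overhead discussed above.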