1. Field of the Invention
The present invention relates to computer networks and, more particularly, to efficiently managing bandwidth (BW) for multipoint-to-multipoint (MP2MP) services in a provider network of a computer network.
2. Background Information
Many organizations, including businesses, governments and educational institutions, utilize computer networks so that employees and others may share and exchange information and/or resources. A computer network typically comprises a plurality of entities interconnected by means of one or more communications media. An entity may consist of any device, such as a computer, that “sources” (i.e., transmits) or “sinks” (i.e., receives) data frames over the communications media. A common type of computer network is a local area network (“LAN”) which typically refers to a privately owned network within a single building or campus. LANs typically employ a data communication protocol (LAN standard), such as Ethernet, or a wireless protocol, that defines the functions performed by data link and physical layers of a communications architecture (i.e., a protocol stack).
One or more intermediate network devices are often used to couple LANs together and allow the corresponding entities to exchange information. For example, a bridge may be used to provide a “switching” function between two or more LANs or end stations. Typically, the bridge is a computer and includes a plurality of ports that are coupled via LANs either to other bridges, or to end stations such as routers or host computers. Ports used to couple bridges to each other are generally referred to as trunk ports, whereas ports used to couple bridges to end stations are generally referred to as access ports. The bridging function includes receiving data from a sending entity at a source port and transferring that data to at least one destination port for forwarding to one or more receiving entities.
Spanning Tree Algorithm
Most computer networks include redundant communications paths so that a failure of any given link does not isolate any portion of the network. Such networks are typically referred to as meshed or partially meshed networks. The existence of redundant links, however, may cause the formation of circuitous paths or “loops” within the network. Loops are highly undesirable because data frames may traverse the loops indefinitely.
Furthermore, some devices, such as bridges or switches, replicate frames whose destination is not known resulting in a proliferation of data frames along loops. The resulting traffic can overwhelm the network. Other intermediate devices, such as routers, that operate at higher layers within the protocol stack, such as the Internetwork Layer of the Transmission Control Protocol/Internet Protocol (“TCP/IP”) reference model, deliver data frames and learn the addresses of entities on the network differently than most bridges or switches, such that routers are generally not susceptible to sustained looping problems.
To avoid the formation of loops, most bridges and switches execute a spanning tree protocol which allows them to calculate an active network topology that is loop-free (i.e., a tree) and yet connects every pair of LANs within the network (i.e., the tree is spanning). The IEEE promulgated a standard (IEEE Std. 802.1D-1998™) that defines a spanning tree protocol to be executed by 802.1D compatible devices. In general, by executing the 802.1D spanning tree protocol, bridges elect a single bridge within the bridged network to be the “Root Bridge”. The 802.1D standard takes advantage of the fact that each bridge has a unique numerical identifier (bridge ID) by specifying that the Root Bridge is the bridge with the lowest bridge ID. In addition, for each LAN coupled to any bridge, exactly one port (the “Designated Port”) on one bridge (the “Designated Bridge”) is elected. The Designated Bridge is typically the one closest to the Root Bridge. All ports on the Root Bridge are Designated Ports, and the Root Bridge is the Designated Bridge on all the LANs to which it has ports.
Each non-Root Bridge also selects one port from among its non-Designated Ports (its “Root Port”) which gives the lowest cost path to the Root Bridge. The Root Ports and Designated Ports are selected for inclusion in the active topology and are placed in a forwarding state so that data frames may be forwarded to and from these ports and thus onto the LANs interconnecting the bridges and end stations of the network. Ports not included within the active topology are placed in a blocking state. When a port is in the blocking state, data frames will not be forwarded to or received from the port. A network administrator may also exclude a port from the spanning tree by placing it in a disabled state.
To obtain the information necessary to run the spanning tree protocol, bridges exchange special messages called configuration bridge protocol data unit (BPDU) messages or simply BPDUs. BPDUs carry information, such as assumed root and lowest root path cost, used in computing the active topology. More specifically, upon start-up, each bridge initially assumes itself to be the Root Bridge and transmits BPDUs accordingly. Upon receipt of a BPDU from a neighboring device, its contents are examined and compared with similar information (e.g., assumed root and lowest root path cost) stored by the receiving bridge in memory. If the information from the received BPDU is “better” than the stored information, the bridge adopts the better information and uses it in the BPDUs that it sends (adding the cost associated with the receiving port to the root path cost) from its ports, other than the port on which the “better” information was received. Although BPDUs are not forwarded by bridges, the identifier of the Root Bridge is eventually propagated to and adopted by all bridges as described above, allowing them to select their Root Port and any Designated Port(s).
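The comparison of received and stored BPDU information described above can be sketched as follows. This is a minimal illustration only, not the 802.1D state machines: the `Bpdu` and `Bridge` names, the three-field BPDU, and the example costs are assumptions made for the sketch, and timers and port identifiers are omitted.

```python
from dataclasses import dataclass
from typing import NamedTuple


class Bpdu(NamedTuple):
    """Simplified configuration BPDU (hypothetical layout).

    Tuples compare field by field, and a lower tuple is "better".
    """
    root_id: int         # bridge ID the sender assumes is the Root Bridge
    root_path_cost: int  # sender's lowest known cost to that root
    sender_id: int       # bridge ID of the sender, used as a tie-breaker


@dataclass
class Bridge:
    bridge_id: int

    def __post_init__(self) -> None:
        # Upon start-up, each bridge assumes itself to be the Root Bridge.
        self.best = Bpdu(self.bridge_id, 0, self.bridge_id)
        self.root_port = None

    def receive(self, port: int, port_cost: int, bpdu: Bpdu) -> bool:
        """Examine a BPDU received on `port`; return True if it is adopted."""
        # Add the cost of the receiving port to the advertised root path cost.
        candidate = Bpdu(bpdu.root_id, bpdu.root_path_cost + port_cost, bpdu.sender_id)
        if candidate < self.best:
            self.best = candidate
            self.root_port = port
            return True
        return False

    def advertise(self) -> Bpdu:
        """BPDU this bridge sends from ports other than the one that supplied `best`."""
        return Bpdu(self.best.root_id, self.best.root_path_cost, self.bridge_id)


bridge = Bridge(bridge_id=20)
adopted = bridge.receive(port=1, port_cost=4, bpdu=Bpdu(10, 0, 10))
ignored = bridge.receive(port=2, port_cost=4, bpdu=Bpdu(10, 19, 15))
```

In this example, bridge 20 adopts bridge 10 as the root through port 1 at a cost of 4, and disregards the higher-cost information later received on port 2.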
In order to adapt the active topology to changes and failures, the Root Bridge periodically (e.g., every hello time) transmits BPDUs. In response to receiving BPDUs on their Root Ports, bridges transmit their own BPDUs from their Designated Ports, if any. Thus, BPDUs are periodically propagated throughout the bridged network, confirming the active topology. As BPDU information is updated and/or timed-out and the active topology is re-calculated, ports may transition from the blocking state to the forwarding state and vice versa. That is, as a result of new BPDU information, a previously blocked port may learn that it should be in the forwarding state (e.g., it is now the Root Port or a Designated Port).
Virtual Local Area Networks
A computer network may also be segmented into a series of logical networks. For example, U.S. Pat. No. 5,394,402, issued Feb. 28, 1995 to Ross (the “'402 Patent”), discloses an arrangement for associating any port of a switch with any particular network segment. Specifically, according to the '402 patent, any number of physical ports of a particular switch may be associated with any number of groups within the switch by using a virtual local area network (VLAN) arrangement that virtually associates the port with a particular VLAN designation. More specifically, the switch or hub associates VLAN designations with its ports and further associates those VLAN designations with messages transmitted from any of the ports to which the VLAN designation has been assigned.
The VLAN designation for each port is stored in a memory portion of the switch such that every time a message is received on a given access port the VLAN designation for that port is associated with the message. Association is accomplished by a flow processing element which looks up the VLAN designation in the memory portion based on the particular access port at which the message was received. In many cases, it may be desirable to interconnect a plurality of these switches in order to extend the VLAN associations of ports in the network. Those entities having the same VLAN designation function as if they are all part of the same LAN. VLAN-configured bridges are specifically configured to prevent message exchanges between parts of the network having different VLAN designations in order to preserve the boundaries of each VLAN. Nonetheless, intermediate network devices operating above L2, such as routers, can relay messages between different VLAN segments.
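The port-based association described above may be illustrated with a short sketch. The table contents and the function names `classify` and `egress_ports` are hypothetical and chosen only for the example; a real switch would also consult learned MAC addresses and trunk ports.

```python
# Hypothetical port-to-VLAN table, standing in for the memory portion of the switch.
PORT_VLAN = {1: 10, 2: 10, 3: 20, 4: 20}


def classify(ingress_port: int) -> int:
    """Look up the VLAN designation based on the access port that received the message."""
    return PORT_VLAN[ingress_port]


def egress_ports(ingress_port: int) -> list:
    """Ports eligible to receive the message: same VLAN designation, excluding the ingress port."""
    vlan = classify(ingress_port)
    return sorted(p for p, v in PORT_VLAN.items() if v == vlan and p != ingress_port)
```

Here a message received on port 1 is eligible only for port 2, so the boundary between VLAN 10 and VLAN 20 is preserved.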
In addition to the '402 patent, the IEEE promulgated the 802.1Q specification standard for Virtual Bridged Local Area Networks. To preserve VLAN associations of messages transported across trunks or links in VLAN-aware networks, both Ross and the IEEE Std. 802.1Q-2005 specification standard disclose appending a VLAN identifier (VID) field to the corresponding frames. In addition, U.S. Pat. No. 5,742,604 to Edsall et al. (the “'604 patent”), which is commonly owned with the present application, discloses an Interswitch Link (ISL) encapsulation mechanism for efficiently transporting packets or frames, including VLAN-modified frames, between switches while maintaining the VLAN association of the frames. In particular, an ISL link, which may utilize the Fast Ethernet standard, connects ISL interface circuitry disposed at each switch. The transmitting ISL circuitry encapsulates the frame being transported within an ISL header and ISL error detection information, while the ISL receiving circuitry strips off this information and recovers the original frame.
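By way of illustration, the sketch below adds an IEEE 802.1Q style tag to an untagged Ethernet frame. It assumes the commonly documented tag layout (a 16-bit tag protocol identifier of 0x8100 followed by a 16-bit field carrying a 3-bit priority, one further bit left at zero here, and a 12-bit VID), placed after the destination and source MAC addresses. The function names are hypothetical, and the ISL encapsulation of the '604 patent is not shown.

```python
import struct

TPID_8021Q = 0x8100  # tag protocol identifier for 802.1Q tags


def add_vlan_tag(frame: bytes, vid: int, priority: int = 0) -> bytes:
    """Insert a 4-byte VLAN tag after the two 6-byte MAC addresses."""
    if not 0 <= vid <= 0xFFF:
        raise ValueError("VID must fit in 12 bits")
    if not 0 <= priority <= 7:
        raise ValueError("priority must fit in 3 bits")
    tci = (priority << 13) | vid  # the bit between priority and VID stays 0
    tag = struct.pack("!HH", TPID_8021Q, tci)
    return frame[:12] + tag + frame[12:]


def read_vlan_tag(frame: bytes):
    """Return (priority, vid) if the frame carries an 802.1Q tag, else None."""
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_8021Q:
        return None
    return tci >> 13, tci & 0xFFF


untagged = bytes(12) + b"\x08\x00" + b"payload"
tagged = add_vlan_tag(untagged, vid=100, priority=5)
```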
Multiple Spanning Tree Protocol
Within the IEEE Std. 802.1Q-2005, the IEEE also included a specification standard for a Spanning Tree Protocol that is specifically designed for use with networks that support VLANs. The Multiple Spanning Tree Protocol (MSTP), which is described in the IEEE Std. 802.1Q-2005, organizes a bridged network into regions. Within each region, MSTP establishes an Internal Spanning Tree (IST) which provides connectivity to all bridges within the respective region and to the ISTs established within other regions. The IST established within each MSTP Region also provides connectivity to the one Common Spanning Tree (CST) established outside of the MSTP regions by IEEE Std. 802.1Q-2005 compatible bridges running STP or RSTP. The IST of a given MST Region receives and sends BPDUs to the CST. Accordingly, all bridges of the bridged network are connected by a single Common and Internal Spanning Tree (CIST). From the point of view of the legacy or IEEE Std. 802.1Q-2005 bridges, moreover, each MST Region appears as a single virtual bridge on the CST.
Within each MST Region, the MSTP compatible bridges establish a plurality of active topologies, each of which is called a Multiple Spanning Tree Instance (MSTI). The MSTP bridges also assign or map each VLAN to one and only one of the MSTIs. Because VLANs may be assigned to different MSTIs, frames associated with different VLANs can take different paths through an MSTP Region. The bridges may, but typically do not, compute a separate topology for every single VLAN, thereby conserving processor and memory resources. Each MSTI is basically a simple RSTP instance that exists only inside the respective Region, and the MSTIs do not interact outside of the Region.
MSTP, like the other spanning tree protocols, uses BPDUs to establish the ISTs and MSTIs as well as to define the boundaries of the different MSTP Regions. The bridges do not send separate BPDUs for each MSTI. Instead, every MSTP BPDU carries the information needed to compute the active topology for all of the MSTIs defined within the respective Region. Each MSTI, moreover, has a corresponding Identifier (ID) and the MSTI IDs are encoded into the bridge IDs. That is, each bridge has a unique ID, as described above, and this ID is made up of a fixed portion and a settable portion. With MSTP, the settable portion of a bridge's ID is further organized to include both a settable priority component and a system ID extension. The system ID extension corresponds to the CIST or one of the MSTI IDs. The MSTP compatible bridges within a given Region will thus have a different bridge ID for the CIST and each MSTI. For a given MSTI, the bridge having the lowest bridge ID for that instance is elected the root. Thus, an MSTP compatible bridge may be the root for one MSTI but not another within a given MSTP Region.
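The per-instance bridge ID described above can be sketched as follows. The field widths used here (a 4-bit priority expressed in steps of 4096, a 12-bit system ID extension, and a 48-bit MAC address as the fixed portion) reflect the commonly documented layout and should be read as an assumption of the sketch; the bridge names, MAC addresses, and priorities are hypothetical.

```python
def bridge_id(priority: int, system_id_ext: int, mac: bytes) -> int:
    """Compose a 64-bit bridge ID: priority, system ID extension, then MAC address."""
    if priority % 4096 != 0 or not 0 <= priority <= 61440:
        raise ValueError("priority must be a multiple of 4096 in 0..61440")
    if not 0 <= system_id_ext <= 0xFFF:
        raise ValueError("system ID extension must fit in 12 bits")
    if len(mac) != 6:
        raise ValueError("MAC address must be 6 bytes")
    return ((priority | system_id_ext) << 48) | int.from_bytes(mac, "big")


# Hypothetical bridges in one Region.
MACS = {"A": bytes.fromhex("00000000000a"), "B": bytes.fromhex("00000000000b")}


def elect_root(instance_id: int, priorities: dict) -> str:
    """The bridge with the lowest bridge ID for the given instance is elected the root."""
    return min(priorities, key=lambda name: bridge_id(priorities[name], instance_id, MACS[name]))


root_msti_1 = elect_root(1, {"A": 32768, "B": 4096})
root_msti_2 = elect_root(2, {"A": 4096, "B": 32768})
```

With these settings, bridge B is the root for MSTI 1 while bridge A is the root for MSTI 2, illustrating that a bridge may be the root for one instance but not another.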
Each bridge running MSTP also has a single MST Configuration Identifier (ID) that consists of three attributes: an alphanumeric configuration name, a revision level and a VLAN mapping table that associates each of the potential 4096 VLANs to a corresponding MSTI. Each bridge, moreover, loads its MST Configuration ID into the BPDUs sourced by the bridge. Because bridges only need to know whether or not they are in the same MST Region, they do not propagate the actual VLAN to MSTI tables in their BPDUs. Instead, the MST BPDUs carry only a digest of the VLAN to MSTI table or mappings. The digest is generated by applying the well-known MD5 algorithm to the VLAN to MSTI table. When a bridge receives an MST BPDU, it extracts the MST Configuration ID contained therein, including the digest, and compares it with its own MST Configuration ID to determine whether it is in the same MST Region as the bridge that sent the MST BPDU. If the two MST Configuration IDs are the same, then the two bridges are in the same MST Region. If, however, the two MST Configuration IDs have at least one non-matching attribute, i.e., either different configuration names, different revision levels and/or different computed digests, then the bridge that received the BPDU concludes that it is in a different MST Region than the bridge that sourced the BPDU. A port of an MST bridge, moreover, is considered to be at the boundary of an MST Region if the Designated Bridge is in a different MST Region or if the port receives legacy BPDUs.
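The region membership test described above may be sketched as follows. The table encoding (4096 two-byte entries, with unmapped VLANs defaulting to instance 0) and the use of a plain MD5 hash are assumptions made for illustration; the exact digest computation specified by the standard is not reproduced here, and the names `MstConfigId`, `make_config_id`, and `same_region` are hypothetical.

```python
import hashlib
from typing import Mapping, NamedTuple


class MstConfigId(NamedTuple):
    name: str      # alphanumeric configuration name
    revision: int  # revision level
    digest: bytes  # digest of the VLAN-to-MSTI table


def make_config_id(name: str, revision: int, vlan_to_msti: Mapping[int, int]) -> MstConfigId:
    """Build a configuration ID carrying only a digest of the mapping table."""
    # Illustrative encoding: one 2-byte instance number per potential VLAN.
    table = b"".join(vlan_to_msti.get(vid, 0).to_bytes(2, "big") for vid in range(4096))
    return MstConfigId(name, revision, hashlib.md5(table).digest())


def same_region(local: MstConfigId, received: MstConfigId) -> bool:
    """Two bridges are in the same MST Region only if all three attributes match."""
    return local == received


local_id = make_config_id("region1", 1, {10: 1, 20: 2})
peer_same = make_config_id("region1", 1, {10: 1, 20: 2})
peer_other_map = make_config_id("region1", 1, {10: 1, 20: 1})
peer_other_name = make_config_id("region2", 1, {10: 1, 20: 2})
```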
Registration Protocols
IEEE Std. 802.1p (now incorporated within IEEE 802.1D-2004) outlines the implementation of the Generic Attribute Registration Protocol (GARP) and related GARP applications which allow end stations and bridges to exchange membership information in a generic manner. In particular, GARP, as defined by IEEE 802.1p, “provides a generic attribute dissemination capability that is used by participants in GARP Applications (GARP Participants) to register and de-register attribute values with other GARP Participants within a Bridged LAN.” One application of GARP defined in IEEE 802.1p is the GARP Multicast Registration Protocol (GMRP), which allows GARP participants to join and leave multicast MAC (Media Access Control) address groups. The participant (e.g., an end station) who wishes to join a particular group registers with another GARP participant (e.g., a bridge) that is accepting registrations. This GARP participant (bridge) then applies for membership on behalf of the original participant (end station), which is propagated throughout the network. The information propagated by GMRP generally comprises the multicast MAC address. Another GARP application defined in IEEE 802.1p is the GARP VLAN Registration Protocol (GVRP). GVRP allows a participant to join and leave particular VLANs in a similar manner as GMRP, but involving VLAN membership information, e.g., VLAN IDs (VIDs), as defined in IEEE 802.1Q.
Generally, a GARP participant is responsible for handling GARP state machines and BPDU distribution. A participant in a multiport device (e.g., bridge/switch) that receives a registration for a particular attribute on a port declares (advertises) the attribute through the applicants on all of the other ports participating in GARP. The mechanism for propagating this information from one GARP participant to another within the same device is called GARP Information Propagation (GIP). A GIP context refers to the group of GARP participants belonging to a GIP. For each GIP context, there exists one GARP participant for each GARP application that is enabled on that port (e.g., one participant for each VLAN on that port in GMRP, and one participant for each port in GVRP). Each GARP participant may have both application-specific behavior and the GARP Information Declaration (GID) component, which may comprise, inter alia, one or more attribute values. An attribute is the application-specific information that is being propagated by GARP; e.g., group MAC addresses and service requirements for GMRP, VIDs for GVRP, etc.
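The propagation of a registration from one port to the other ports of the same device can be sketched as follows. This is a deliberately simplified model under stated assumptions: the `GarpDevice` class and its methods are hypothetical, a single GIP context covering all ports is assumed, and the GARP state machines and timers are omitted.

```python
class GarpDevice:
    """Toy model of attribute propagation between participants within one device."""

    def __init__(self, ports):
        self.registered = {p: set() for p in ports}  # attributes registered per port
        self.declared = {p: set() for p in ports}    # attributes declared per port

    def register(self, port, attribute):
        self.registered[port].add(attribute)
        self._propagate()

    def deregister(self, port, attribute):
        self.registered[port].discard(attribute)
        self._propagate()

    def _propagate(self):
        # Each port declares every attribute registered on any of the other ports.
        for port in self.declared:
            others = (attrs for p, attrs in self.registered.items() if p != port)
            self.declared[port] = set().union(*others)


group = "01:00:5e:00:00:01"  # example multicast MAC address, as used by GMRP
device = GarpDevice(ports=[1, 2, 3])
device.register(1, group)
after_register = {p: set(a) for p, a in device.declared.items()}
device.deregister(1, group)
after_deregister = {p: set(a) for p, a in device.declared.items()}
```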
Notably, in addition to the GARP application protocols, IEEE 802.1p also explains how to utilize a tagging scheme to allow frames to be tagged with priority information and an optional VID. The prioritization operates at the MAC layer of the traffic, and classifies (groups) traffic into separate traffic classes. Eight classes are defined by IEEE 802.1p, which are to be configured manually by network administrators (the IEEE has made broad recommendations), and registered throughout the network. Illustratively, the highest priority is seven, which, for example, may be assigned to network-critical traffic, such as Routing Information Protocol (RIP) and Open Shortest Path First (OSPF) updates. Values five and six may be used for delay-sensitive applications such as interactive video and voice, while data classes four through one range from controlled-load applications such as streaming multimedia and business-critical traffic down to “loss eligible” traffic. The zero value is used as a best-effort default, which may be invoked automatically when no other value has been set.
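The grouping of the eight priority values described above may be summarized by a small lookup. The category labels are paraphrased from the preceding paragraph, and the function name `traffic_category` is hypothetical; actual assignments are configured by network administrators.

```python
def traffic_category(priority: int) -> str:
    """Map a priority value (0 through 7) to the illustrative grouping described above."""
    if not 0 <= priority <= 7:
        raise ValueError("priority must be in 0..7")
    if priority == 7:
        return "network-critical"
    if priority in (5, 6):
        return "delay-sensitive"
    if 1 <= priority <= 4:
        return "data (controlled-load down to loss eligible)"
    return "best-effort default"


categories = {p: traffic_category(p) for p in range(8)}
```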
A new IEEE project, P802.1ak (Draft 5.1), identifies the Multiple Registration Protocol (MRP) standard for use with registrations (officially entitled the “Standard for Local and Metropolitan Area Networks Virtual Bridged Local Area Networks - Amendment 07: Multiple Registration Protocol”). MRP, an update (or replacement) to GARP, allows participants in an MRP Application to register attributes with other participants in a bridged LAN. A Multiple VLAN Registration Protocol (MVRP) is defined within IEEE P802.1ak to communicate topology changes for each VLAN independently of the spanning tree supporting the VLAN (e.g., an update to GVRP). This allows multiple VLANs to use a single spanning tree without requiring a bridge to relearn addresses for a given VLAN when a topology change does not change the bridge ports used to reach end stations receiving frames for that VLAN, as will also be understood by those skilled in the art. A Multiple Multicast Registration Protocol (MMRP) is also defined that updates GMRP in a similar manner. Those skilled in the art will understand that the MRP update allows for reduced fault recovery time (convergence time) and reduced disruption of traffic in a very large network due to a topology change in a small portion of that network.
Multipoint-to-Multipoint Service Bandwidth Considerations
Customers (users) often desire to send traffic across a provider network (e.g., a bridged network) to other customers. These traffic or data “flows” enter the provider network from a source customer, e.g., at a User-Network Interface (UNI), and traverse nodes (e.g., bridges) of the provider network to reach the destination customer of the flow, e.g., at a remote UNI. Notably, if one provider network is attached to another provider network, the networks may be attached by Network Node Interfaces (NNIs). These customer-to-customer or “point-to-point” (P2P) transmissions (services) may require the use of a certain amount of bandwidth (BW) to transmit the data. In some instances, it is desirable to guarantee or reserve the BW required for the transmission along the path of the data flow between points (a “conversation”), e.g., according to a particular spanning tree, to ensure that the traffic flowing between the points has enough BW. Otherwise, traffic may be dropped or suspended due to excess traffic along the path, e.g., due to other flows or conversations. The BW required for P2P services is relatively straightforward to define. For instance, committed BW and burst error rates (as will be understood by those skilled in the art) may be defined at each end point of the P2P service, such as by a service level agreement (SLA) between the customer(s) and the provider network(s). Once these BW values are defined, the load at each port within the provider network along the single path (spanning tree) between points will have a maximum value corresponding to the BW values defined for each end point.
“Multipoint-to-Multipoint” (MP2MP) services, on the other hand, are services in which any number of multiple points (e.g., customers) can transmit and receive data flows across the network to/from any number of other multiple points (i.e., more than two UNIs). The difficulty associated with creating and enforcing an MP2MP SLA is that the flow of data on an MP2MP service depends on a mixture of source and destination customers (i.e., MAC addresses) at any given moment in time. Currently, MP2MP SLAs for BW are difficult to define, for example, resulting in SLAs such as a “10 Mb/s 20 UNI service.” Enforcing such an SLA is even more difficult. For instance, the ambiguity of where the 10 Mb/s limits should be applied/enforced may create a number of problems within the network. For example, limiting the total amount of BW for the entire MP2MP service to 10 Mb/s is difficult to enforce without knowing what traffic is being transmitted at all times. Alternatively, each UNI (e.g., of the 20 UNIs) may be limited to transmit 10 Mb/s each. However, this may result in 19 of the 20 UNIs each sending a 10 Mb/s data flow to a single UNI, whose 10 Mb/s limit would then restrict the combined 190 Mb/s of flows to 10 Mb/s. This may be particularly wasteful of BW on the provider network, for example, where the 19 UNIs are concentrated in one location (e.g., New York City), while the single receiving UNI is located in a remote location far across the network (e.g., Los Angeles). The 190 Mb/s of data flows would traverse the entire United States only to have 180 Mb/s removed at the end point.
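The arithmetic of the per-UNI limit example above can be checked with a short calculation. The function name `per_uni_limit_waste` and its parameters are hypothetical; the sketch assumes every sending UNI transmits at its full limit toward a single receiving UNI.

```python
def per_uni_limit_waste(num_unis: int, limit_mbps: int, senders: int):
    """Return (offered, delivered, discarded) in Mb/s at a single receiving UNI."""
    if not 0 <= senders < num_unis:
        raise ValueError("at least one UNI must remain as the receiver")
    offered = senders * limit_mbps         # traffic carried across the provider network
    delivered = min(offered, limit_mbps)   # the receiving UNI is held to the same limit
    return offered, delivered, offered - delivered


offered, delivered, discarded = per_uni_limit_waste(num_unis=20, limit_mbps=10, senders=19)
```

For the 20 UNI service with a 10 Mb/s limit per UNI, this yields 190 Mb/s offered, 10 Mb/s delivered, and 180 Mb/s discarded at the end point.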
In addition to surpassing end point limits, nodes (e.g., bridges) within the provider network may not be able to support all of the MP2MP service data flows (conversations) if each data flow is utilizing the maximum amount of BW allowed, e.g., depending upon connectivity internal to the provider network and BW allocation. In order to prevent this situation, the network may police (e.g., mark frames as “red,” “yellow,” and “green”) and enforce (e.g., dropping red frames immediately, and dropping yellow frames before green frames) traffic at certain points (e.g., ports) within the provider network. Those skilled in the art will understand that policing/enforcing of frames may be specific to a certain service, a certain priority level within the service, a certain color (e.g., red/yellow/green), and in a particular direction (e.g., input to the port or output from the port). As used herein, these parameters are signified by a “{service, priority, color, direction}” tuple, as will also be understood by those skilled in the art.
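The color-based enforcement described above may be sketched as follows. The sketch is a simplified assumption: the `Frame` type (including its `seq` field), the `enforce` function, and the notion of a fixed per-interval frame capacity are hypothetical, and an actual policer would meter rates rather than count frames.

```python
from enum import IntEnum
from typing import List, NamedTuple


class Color(IntEnum):
    GREEN = 0
    YELLOW = 1
    RED = 2


class Frame(NamedTuple):
    seq: int        # arrival order (hypothetical field, used only to identify frames)
    service: str    # the {service, priority, color, direction} tuple follows
    priority: int
    color: Color
    direction: str  # "input" or "output" relative to the port


def enforce(frames: List[Frame], capacity: int) -> List[Frame]:
    """Drop red frames immediately; under congestion, drop yellow frames before green."""
    eligible = [f for f in frames if f.color != Color.RED]
    # Stable sort keeps arrival order within each color, with green ahead of yellow.
    kept = sorted(eligible, key=lambda f: f.color)[:capacity]
    return sorted(kept, key=lambda f: f.seq)  # restore arrival order for forwarding


arrivals = [
    Frame(0, "svc-A", 3, Color.YELLOW, "input"),
    Frame(1, "svc-A", 3, Color.GREEN, "input"),
    Frame(2, "svc-A", 3, Color.RED, "input"),
    Frame(3, "svc-A", 3, Color.YELLOW, "input"),
    Frame(4, "svc-A", 3, Color.GREEN, "input"),
]
forwarded = enforce(arrivals, capacity=3)
```

With room for three frames, the red frame is dropped outright, both green frames are forwarded, and only the first yellow frame is forwarded.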
Generally, it is very difficult to determine which ports to police within the network for MP2MP services, and to determine what the BW limits on each port should be. There remains a need, therefore, for a technique that efficiently defines an MP2MP SLA, and efficiently enforces that MP2MP SLA within the network. In particular, there remains a need to “push back” the input and/or output BW limits imposed at the ports implementing an SLA in order to prevent wasting excess BW throughout the interior of the network, i.e., to prevent transmission of traffic that will eventually be discarded.