1. Field of the Invention
The present invention relates generally to streaming traffic over a network, and relates more specifically to restoring multicast traffic upon failure through an IP network.
2. Introduction
Distribution of real-time multimedia over an IP backbone has been gaining momentum with content and service providers. However, unlike traditional cable-based broadcast infrastructures that provide “broadcast” analog-based video (such as TV), using an IP backbone for real-time broadcast video distribution imposes stringent requirements for protection and restoration upon a failure. Broadcast TV distribution over an IP network usually adopts multicast forwarding, and is characterized by high bandwidth requirement and tight latency and loss constraints even under failure conditions. The network should be able to restore service rapidly and achieve very high availability.
The current default architecture for the IP backbone consists of routers connected by links with the topological state maintained by an interior gateway protocol (IGP) such as Open Shortest Path First (OSPF). Each link is assigned a suitable weight and each router is configured with multicast capability. A multicast tree is generated such that each multimedia destination (MD) receives only one copy of the IP packet from the server at the multimedia source (MS, or Head End). The MD consists of backbone routers and servers that receive multimedia programs and serve customers in a specific serving area. For router protection, assume that each MD has two backbone routers that connect to each other and connect to multimedia servers. The multimedia server receives all programs from the multimedia source. The backbone routers are connected via bi-directional router links to form a multimedia IP-layer backbone. Then the whole multimedia network can be divided into one backbone and multiple locales.
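By way of illustration only, the following sketch shows how a multicast tree in which each MD receives exactly one copy of a packet can be derived from IGP link weights. It assumes symmetric link weights and a plain shortest-path computation rooted at the MS. The function names, node names, and weights are hypothetical and are not taken from any router implementation.

```python
import heapq

def shortest_path_tree(links, source):
    """Return {node: parent} for the shortest-path tree rooted at source.

    links maps an unordered router pair (u, v) to its IGP weight; each
    entry is treated as a bi-directional link (an assumption of this sketch).
    """
    adj = {}
    for (u, v), w in links.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))

    dist = {source: 0}
    parent = {source: None}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return parent

def multicast_tree_links(parent):
    """Directed links (parent -> child) that each carry one copy of a packet."""
    return {(p, c) for c, p in parent.items() if p is not None}

# Hypothetical four-router backbone: MS is the head end, A, B and C are MDs.
links = {("MS", "A"): 1, ("A", "B"): 1, ("MS", "B"): 3, ("B", "C"): 1}
parent = shortest_path_tree(links, "MS")
tree = multicast_tree_links(parent)
```

Because every node other than the source has exactly one parent, every MD has exactly one incoming tree link, which corresponds to the single-copy property described above.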
Some network restoration alternatives include IGP re-convergence only, link-layer fast reroute, and fast reroute plus hitless multicast tree switching, but each has its drawbacks.
IGP re-convergence relies completely on IGP routing and Protocol-Independent Multicast (PIM) forwarding. Whenever a network failure occurs, link state advertisements (LSAs) are broadcast. The IGP re-computes its paths and next-hop routing tables. Upon completion, PIM re-computes its multicast tree. To avoid route flapping and to improve network stability, IGP routing protocols are often designed with various timers calculated to limit the frequency of the shortest path calculation and the dissemination of LSAs. In general, the IGP attempts to propagate information about failures quickly, but waits to propagate less disruptive configuration changes (when links or routers come up). Although those timers may be set aggressively to achieve sub-second convergence times, service providers in practice tend to set them conservatively for stable network operation. Actual IGP convergence times may vary, but transient path outages of 10 seconds or more may not be uncommon. The frequency of outages, even from single link or router failures, in a long distance IP backbone that depends solely on IGP re-convergence and PIM multicast tree reconfiguration is likely to yield unacceptable quality of service (QoS) for broadcast video applications.
Link-layer fast reroute (FRR) overcomes the QoS impact caused by single IP-layer link failures. For each link in the IP-layer topology, a bundle link is pre-computed that consists of two paths: the primary path with high priority (usually the direct one-hop path between the routers) and a diverse backup path with low priority. The backup path is usually diverse at Layer 1 and at intermediate routers. From the standpoint of the Layer 3 IGP, the bundle link is down if and only if both the primary and backup paths fail. In normal operation, the traffic is forwarded via the primary path. Upon primary path failure, all traffic routed over the primary path is rerouted onto the backup path, provided the backup path is operational during the failure. Traffic remains on the backup path until the failure is repaired and the primary path comes back up.
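By way of illustration only, the bundle link behavior described above can be sketched as a small state model. The class and method names are hypothetical. The sketch captures only two rules from the text: the Layer 3 IGP sees the bundle as down if and only if both paths have failed, and traffic uses the primary path whenever it is up.

```python
from dataclasses import dataclass

@dataclass
class BundleLink:
    """Hypothetical model of a bundle link with a primary and a backup path."""
    primary_up: bool = True
    backup_up: bool = True

    def igp_state(self):
        # The IGP sees the bundle as down only when both paths have failed.
        return "up" if (self.primary_up or self.backup_up) else "down"

    def active_path(self):
        # The primary path has priority; the backup carries traffic only
        # while the primary is down and the backup itself is operational.
        if self.primary_up:
            return "primary"
        if self.backup_up:
            return "backup"
        return None

link = BundleLink()
link.primary_up = False          # single failure: reroute, IGP sees no change
after_single = (link.igp_state(), link.active_path())
link.backup_up = False           # dual failure: the bundle goes down
after_dual = (link.igp_state(), link.active_path())
link.primary_up = True           # repair: traffic returns to the primary path
after_repair = (link.igp_state(), link.active_path())
```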
Some vendors have demonstrated a fast reroute mechanism to switch to the backup path within 50 milliseconds. Since the primary path and its backup path are disjoint at the IP and physical layers, no bundle link will fail from any single fiber or WDM lower layer link failure if hold-down timers are specified appropriately. Thus, for any single link failure, IGP will detect no change to the IP topology and, therefore, no routing changes will occur, nor will there be any changes to the multicast tree. The failure impact will thus be reduced from the ten or more seconds possible under IGP re-convergence time to 50 milliseconds for any single link failure.
One potential problem is traffic overlap during restoration. An advantage of IP multicast is the low cost for forwarding, i.e., often much lower link capacity is required when compared with unicast forwarding. Furthermore, given the large number of streams or channels carried in today's video broadcast networks, significant multiplexing gain can be achieved. Thus, network capacity can be engineered economically and with high link utilization during the normal (no-failure) state. However, during network failures, the use of link backup paths in the multicast network can potentially cause overlap of streams on the same link, which may lead to congestion. Here traffic overlap means that the same multicast packets travel the same link along the same direction more than once. Even slight performance degradation in video networks can cause loss of service to particular destinations. To resolve this issue, IGP link weights may be tuned such that at least one backup path exists for each link on the IP multicast tree and backup path traffic does not overlap with the multicast tree upon any single link failure.
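By way of illustration only, the definition of traffic overlap given above (the same multicast packets traveling the same link in the same direction more than once) can be checked with a short sketch. The function name, the representation of links as directed router pairs, and the example topology are assumptions made for this sketch.

```python
from collections import Counter

def overlapping_links(tree_links, failed_link, backup_path):
    """Return the directed links that carry the same packets more than once.

    tree_links:  set of directed (upstream, downstream) multicast tree links
    failed_link: the directed tree link whose primary path has failed
    backup_path: list of routers from the upstream end of failed_link to its
                 downstream end, as used by fast reroute
    """
    # Every surviving tree link still carries one copy of each packet.
    load = Counter(link for link in tree_links if link != failed_link)
    # Rerouted traffic adds one more copy on each hop of the backup path.
    if failed_link in tree_links:
        load.update(zip(backup_path, backup_path[1:]))
    return {link for link, copies in load.items() if copies > 1}

# Hypothetical tree: MS -> A, A -> B, A -> C, C -> D.
tree = {("MS", "A"), ("A", "B"), ("A", "C"), ("C", "D")}

# A backup path for A -> B that reuses the tree link A -> C overlaps.
bad = overlapping_links(tree, ("A", "B"), ["A", "C", "B"])
# A backup path through a router E that is not on the tree does not.
good = overlapping_links(tree, ("A", "B"), ["A", "E", "B"])
# Direction matters: A -> MS is not the same directed link as MS -> A.
reverse = overlapping_links(tree, ("A", "B"), ["A", "MS", "B"])
```

Weight tuning of the kind described above would aim for the result to be empty for every single link failure on the tree.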
Another problem with FRR is that a backup path for a given link is typically pre-calculated independent of the backup paths of other links. Since the backup paths are pre-calculated, there is no real-time, dynamic accommodation for different combinations of multiple failures and, consequently, traffic overlap may still occur during such failure states. In long distance networks, multiple physical layer failures are not extremely rare events and have a non-trivial effect on network availability and on the resulting video QoS. From the standpoint of combinatorial optimization, pre-defining backup paths that are failure-state-independent, covering a significant set of multiple failure states, and preventing traffic overlapping is usually infeasible. Since repair times may be on the order of hours, the impact on network availability is non-negligible considering the more stringent QoS requirements from broadcast video.
The third alternative, fast reroute plus hitless multicast tree switching, overcomes the congestion issues in link-layer fast reroute. The basic idea is to apply the fast reroute mechanism but effectively limit its use only to the period during IGP routing and PIM protocol re-convergence. After the primary path of a bundle link fails, the traffic is switched over to the backup path. However, in contrast to link-layer fast reroute, either a link-down LSA is generated or the router costs-out the bundle link (setting the weight to a very high value) whenever an IP link fails such that the new IGP topology and the resulting multicast tree do not use the backup path. Once IGP routing converges, a new PIM tree is rebuilt automatically. This achieves the benefit of rapid restoration from single link failures, yet allows the multicast tree to dynamically adapt to multiple failures. Only during this small, transient period is the network exposed to potential path overlap on the same link along the same direction.
A key component of fast reroute plus hitless multicast tree switching is make-before-break switching of the multicast tree, i.e., the requirement to switch traffic from the old multicast tree to the new multicast tree with minimal loss of traffic. This technique minimizes the potential traffic “hit” that would be incurred after every single failure when the new tree is generated. The sequence of operations upon a link failure for fast reroute plus hitless multicast tree switching is as follows:
(a) When a primary path failure occurs, invoke the FRR mechanism to reroute the traffic to the backup path, provided the backup is operational.
(b) Send out an LSA with “link-down” or one that advertises a high weight (cost-out) for the associated IP-layer link. During the time of IGP re-convergence, traffic is forwarded along the backup path.
(c) After IGP re-convergence is completed, PIM rebuilds its multicast tree.
(d) Once the multicast tree is rebuilt, implement the make-before-break method, wherein joining of nodes to the tree is sequenced so that downstream nodes are not joined to the new tree until traffic flows to their parent nodes.
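By way of illustration only, the sequencing constraint of step (d) can be sketched as an ordering of joins over the new tree. The sketch assumes the new tree is available as a child-to-parent map and uses a breadth-first traversal from the source. This is one possible ordering that satisfies the constraint, not necessarily the ordering a router implementation would use, and the function and node names are hypothetical.

```python
from collections import deque

def join_order(parent):
    """Order nodes so that each node joins only after its parent has joined.

    parent maps each node of the new multicast tree to its upstream node;
    the source maps to None.
    """
    children = {}
    root = None
    for node, upstream in parent.items():
        if upstream is None:
            root = node
        else:
            children.setdefault(upstream, []).append(node)

    order = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)  # traffic reaches this node before its children join
        queue.extend(sorted(children.get(node, [])))
    return order

# Hypothetical new tree after re-convergence.
new_tree = {"MS": None, "A": "MS", "B": "A", "C": "A", "D": "C"}
order = join_order(new_tree)
```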
After the failures are repaired, LSAs are advertised either to announce that the links are back up or to reset their weights to their original values, depending on the option chosen in step (b). This normalization phase is another motivation for the “hitless” method, to minimize downtime after a failure is repaired.
Fast reroute plus hitless multicast tree switching increases service availability significantly. However, it still has some obvious drawbacks. It increases network operation complexity significantly: the network operator has to configure the backup-path tunnels manually or run an additional signaling protocol to create them; the method still applies static backup path selection, so a dual failure may take out both a primary link and its backup path; the IGP and PIM protocols have to work in coordination; and the software in routers has to implement extra mechanisms, such as “make-before-break”. Accordingly, what is needed in the art is an efficient and simple way to avoid traffic overlap and to create dynamic backup paths when streaming media over IP networks, even in the event of one or more network failures.