1. Field of the Invention
The present invention generally relates to a transparent non-disruptable Asynchronous Transfer Mode (ATM) network, and more specifically to a method for making real time intelligent decisions for handling non-availability of links and nodes in a packet-based network without disrupting associated end-users.
2. Related Art
Non-availability of links and nodes (hereafter called resources) in a packet-based network such as an ATM network can occur due to several reasons including resource congestion, resource failure and mobility of the resource. The non-availability of resources is a particularly acute problem when dealing with multimedia.
Multimedia service provisioning requires two important aspects to be addressed: end-to-end Quality of Service (QoS), and seamless transport of information across heterogeneous networks. Traffic characteristics of multimedia services vary dynamically and maintaining QoS assurance with a high probability is extremely critical for global service provisioning. It is also important to recognize that end-user information must be delivered across multiple heterogeneous networks, often with different ownerships. A third important issue is the multi-casting capability of the network.
Transporting multimedia services offers specific challenges for the networked environment because the very nature of multimedia does not allow for gaps in the content provided, i.e., neither lost or missing packets nor transmission gaps are acceptable to end users. Successful delivery of multimedia content requires user-defined end-to-end QoS. However, QoS is impacted by network behavior of both the local and remote Local Area Networks (LANs) and ATM Wide Area Networks (WANs). Additionally, the multi-casting requirement adds complexity. A global network must provide the infrastructure that delivers end-to-end QoS with multi-casting by taking real-time "local" actions in different parts of the network to curb any abnormal behavior. These actions must be transparent to the end-user.
From a user perspective, QoS for multimedia services can be specified in terms of bandwidth, delay, and error rates. Bandwidth is fixed for certain types of applications and varies with time in some others. Delay requirements are specified for each application based on user-to-user interaction with a typical value and an upper bound. The acceptable number of errors is specified probabilistically. Network service provisioning requires that these QoS values be allocated for the local premise network, WAN, and the remote premise network. Once allocated, each network segment must guarantee the QoS with a high probability of assurance under normal network conditions. Local network segments can usually guarantee QoS with a high probability of assurance since control and service provisioning for such segments are more homogeneous. That is not the case for WANs, since they are accessed by several heterogeneous local segments.
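As an illustration of how the per-segment allocation described above might compose into an end-to-end figure, the following minimal Python sketch assumes that delays add across segments, that end-to-end bandwidth is limited by the narrowest segment, and that errors on different segments are independent. The class `SegmentQoS`, the function `end_to_end`, and all numeric values are hypothetical and are not taken from the source.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SegmentQoS:
    """QoS allocated to one network segment (hypothetical structure)."""
    name: str
    bandwidth_mbps: float  # bandwidth the segment can sustain
    delay_ms: float        # upper bound on delay contributed by the segment
    error_rate: float      # probability that a packet is lost or corrupted


def end_to_end(segments: Iterable[SegmentQoS]) -> SegmentQoS:
    """Compose per-segment allocations into an end-to-end estimate.

    Assumptions (illustrative only): delays add, bandwidth is limited by
    the narrowest segment, and segment errors are independent.
    """
    segs = list(segments)
    if not segs:
        raise ValueError("at least one segment is required")
    success = 1.0
    for seg in segs:
        success *= 1.0 - seg.error_rate
    return SegmentQoS(
        name="end-to-end",
        bandwidth_mbps=min(seg.bandwidth_mbps for seg in segs),
        delay_ms=sum(seg.delay_ms for seg in segs),
        error_rate=1.0 - success,
    )


if __name__ == "__main__":
    # Hypothetical allocation for the three segments named in the text.
    path = [
        SegmentQoS("local premise network", 100.0, 2.0, 1e-6),
        SegmentQoS("WAN", 45.0, 30.0, 1e-5),
        SegmentQoS("remote premise network", 100.0, 3.0, 1e-6),
    ]
    total = end_to_end(path)
    print(total.bandwidth_mbps, total.delay_ms, total.error_rate)
```

Under these assumptions the WAN segment dominates both the delay budget and the bandwidth limit, which is consistent with the observation that the WAN is the hardest segment on which to guarantee QoS.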
Resource failures are common in operational networks. These are typically handled in telephone networks by disconnecting the calls passing through a failed area, and in data networks by pausing the data transfers until alternate paths are found. These mechanisms increase the delay in service provisioning and thereby impact the QoS parameters. Multimedia service provisioning with multi-casting requires the handling of failures in a localized fashion. The self-healing architecture must provide real-time solutions that help maintain QoS within specified limits even under adverse conditions, and with end-user transparency.
Physical link failures have been handled in fiber-based networks (e.g. FDDI networks) by enabling real-time loop back at the interface where the failure has occurred. In FDDI networks, two unidirectional rings are combined into one ring for continued information transport. Similarly, if there is a node failure, the rings can be combined into one. This is an adequate solution when information is transported asynchronously based on packets for data services, where the traffic at the input to the network does not dynamically vary. The case of multimedia services mandates managing varying bandwidth demands dynamically.
FDDI is a LAN backbone technology that has a fallback interface built in, either through a dual ring or through dual homing of concentrators. Tolerance to single failures is a system design feature. Dual counter-rotating rings wrap around in the event of failure to provide reliable connectivity. It takes a second failure (if the first one is not fixed) to segment the ring. Dual attached trees of concentrators can support a large number of single-attach end stations with an increased level of reliability. However, FDDI requires that stations and concentrators on the backbone work properly. If optical bypass switches are not used, one station down is all that the system can tolerate. In the event of recoverable failures, end-system applications do not need to take any explicit connection establishment action since it is a packet switched network. If there is a failure, the network effectively becomes one ring and the usable bandwidth is cut in half.
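The wrap-around behavior described above can be sketched with a simplified model. The following Python example is only an illustration, not an FDDI implementation: it assumes stations numbered around the ring, treats link i as joining station i to the next station, and ignores optical bypass switches and dual homing. The function names `ring_segments` and `ring_state` are hypothetical.

```python
from typing import List, Set


def ring_segments(n_stations: int, failed_links: Set[int]) -> List[List[int]]:
    """Return the groups of stations that can still reach each other.

    Link i joins station i and station (i + 1) % n_stations. The stations
    on either side of a failed link wrap the two rings together, so the
    stations lying between two failures form one isolated segment.
    """
    if n_stations < 2:
        raise ValueError("a ring needs at least two stations")
    bad = sorted({link % n_stations for link in failed_links})
    if not bad:
        return [list(range(n_stations))]
    segments = []
    for idx, link in enumerate(bad):
        start = (link + 1) % n_stations   # first station after this break
        end = bad[(idx + 1) % len(bad)]   # last station before the next break
        seg = [start]
        while seg[-1] != end:
            seg.append((seg[-1] + 1) % n_stations)
        segments.append(seg)
    return segments


def ring_state(n_stations: int, failed_links: Set[int]) -> str:
    """Classify the ring: intact dual ring, single wrapped ring, or segmented."""
    n_failed = len({link % n_stations for link in failed_links})
    if n_failed == 0:
        return "dual"
    if n_failed == 1:
        return "wrapped"
    return "segmented"


if __name__ == "__main__":
    print(ring_state(5, {2}), ring_segments(5, {2}))
    print(ring_state(5, {1, 3}), ring_segments(5, {1, 3}))
```

In this model a single failure leaves every station in one wrapped ring, while a second failure splits the stations into separate segments, matching the single-failure tolerance described in the text.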
Another way of handling network resource problems during the delivery of multimedia is to rely on the server to provide continuous data by running multiple copies of the transmission and by retransmitting one or more copies if a resource problem develops. This can lead to the unsynchronized transmission of numerous copies of the data by the server. Accordingly, this is a very cumbersome and expensive solution.
Avoiding disproportionate loss of service from a single point of failure is a serious challenge in ATM networks. In most current implementations, a cut or breakage in a virtual circuit or path requires the end station to establish another path.
Previous efforts at providing self-healing networks and/or networks having the ability to take alternate paths because of resource problems have not been successful in providing solutions to all of the problems associated with multimedia transmissions. These previous efforts include the following:
Vatunone, U.S. Pat. No. 5,621,721, discloses a communication network with a database consistency mechanism. A sequence number and a set of routing information for each of a set of virtual circuits in the communication network are maintained in the main database and an auxiliary database in each set of communication nodes in the communication network. A new sequence number is assigned to a virtual circuit each time the virtual circuit is routed. The sequence of numbers in the main database or the auxiliary database in each of the communication nodes are internally and externally verified if one of the communication nodes switches between the main database and the auxiliary database. Virtual circuits are rerouted through the communication network when necessary.
Russ, et al., U.S. Pat. No. 5,623,481, discloses a system for verification of an alternate route found subsequent to a restored process based on the self healing network restoration of a telecommunications network due to failure or disruption in the network. To determine whether a link has been restored by means of an alternate route, a path verification method and system is utilized to provide a continuity check. The Operations Support System of the network retrieves messages from the end nodes and compares the previous stored path verification message for each of the end nodes, to determine whether the communications path is continuous and valid.
Takano, et al., U.S. Pat. No. 5,600,630, discloses an ATM path changing system and method that can set an alternate route in the event of a failure occurring in a transmission line or in a virtual path. This is accomplished by providing a header converter, a plurality of routing tables, a register to set failure internal routing information, a comparator for comparing the contents of the register with the internal routing information of a system, and a selector for selecting the contents of one of the first and second routing tables.
Nederlof, U.S. Pat. No. 5,590,118, discloses a method for rerouting a data stream comprising the steps of detecting a failure between switching nodes, transmitting from one switching node a request with first and second address fields of the first and second switching nodes, and each switching node retransmitting the signal until an alternative route for the data stream is found.
Foglar, U.S. Pat. No. 5,559,959, discloses a method for transmitting message cells via redundant virtual path pairs of an ATM communication network. Each cell has an internal cell header for each of the paths of a path pair. The network has a plurality of multi-stage switching networks, whereby the message cells can be transmitted via a virtual path pair duplicated by a switching network at the beginning of the respective path pair. Based on the header parts, the associated message cell is either forwarded, or duplicated and then forwarded, resulting in two message cells.
Matthews, U.S. Pat. No. 5,521,910, discloses a method for determining a best path from a source node to a destination node. The method includes a first recursive search in parallel which is initiated at the source node and proceeds outwardly to discover neighboring nodes and calculates traversal paths until reaching the destination node.
Ohara, U.S. Pat. No. 5,495,472, discloses a method and apparatus for allowing cross connections to be set up in network elements for automatic healing of a signal path by rerouting the signal on a failed working path to a protection path.
Niestegge, et al., U.S. Pat. No. 5,490,138, discloses an ATM communication system comprising a plurality of concentrator equipment units connected to an ATM communication equipment unit in a ring line system. Each of the concentrator equipment units is connected via a separate virtual path to the corresponding ATM equipment unit and to the remaining concentrator unit, to form a plurality of virtual connections.
Kakuma et al., U.S. Pat. No. 5,488,606 discloses a procedure for switching-over systems for use in a duplexed ATM exchange operating in its asynchronous transfer mode. An ATM exchange is electronic hardware for repeating and exchanging the cells in an ATM network. The ATM exchange has an internal switching unit to process the routing of the cells. The hardware of the internal switching unit processes the routing control by referring to the tag attached to the head end of each ATM cell. Since ATM exchanges are often configured as an active (master) system and a parallel backup (slave) system, a cell transmission timing difference arises between the two systems. Sometimes, during the operation of the exchange, this cell transmission timing difference causes the loss of a cell or the needless duplication of a cell. Rather than synchronize the master and slave systems and destroy the merit of the ATM network, it is preferable to duplex the exchange's switching units to improve reliability. The procedure for switching-over the parallel systems involves assigning a master or slave system indicating mark to each cell upon detection and transmitting the cells to a respective output highway based on the mark. Cells having a mark designating the master system are stored after detecting at least one cell having a mark which corresponds to a system switch-over.
Kondo et al., U.S. Pat. No. 5,475,675 discloses an apparatus and method for non-stop switching in which a transmission line for transmitting statistically multiplexed cells can be switched from a current transmission line to a spare transmission line without causing momentary interruption. The apparatus comprises a current statistical multiplexer for producing a first sequence of information cells along a first transmission line and a spare statistical multiplexer for producing a second or re-channeled sequence of information cells along a second transmission line. The apparatus also includes a means for detecting empty information cells in the first and second sequences and a means for using the detected empty cells as a trigger to measure the phase shift between channels. Once the phase shift is determined, the information cells subsequent to the first sequence are re-routed to the second transmission line in accordance with the timing requirements of the new path. In this way, no cells or parts of cells are lost or needlessly duplicated.
Takatori et al., U.S. Pat. No. 5,473,598 discloses a cell routing method and apparatus comprising two or more pathway routing tables formed in accordance with data received from address filters of an ATM switch to store routing information for indicating the destination of a cell output. Also provided are a plurality of conversion tables formed from the data provided by Virtual Path Identifier conversion circuits. An input interface circuit determines which routing table to pair with which conversion table. This information is passed to a switch circuit which effects the re-routing. This reference does not discuss optimizing bandwidth.
Miyagi et al., U.S. Pat. No. 5,461,607 discloses an ATM communication apparatus and a failure detection and notification circuit comprising an ATM exchange, a plurality of transmission lines and a management section at the line/connection end point for each transmission line. The management section outputs a channel failure signal upon detection of a connection failure. The channel failure signal is transmitted to a failure detection and notification circuit through a signal line separate from the transmission line for transmitting the information cells. An Alarm Indication Signal (AIS) generation circuit extracts the AIS cell, determines the correct failure state, generates a far-end-receive-failure cell which corresponds to the failure state and inserts the far-end-receive-failure cell into the stream of the information cells. There is discussion of how re-routing is done, how pathways are created, and how bandwidth is utilized.
Chujo et al., U.S. Pat. No. 5,412,376 discloses a method for structuring an ATM network in which information cells are transferred between a pair of nodes that are connected by a working route and a plurality of alternate routes that include intermediate nodes. When a failure occurs in the working route, the second node detects the failure and transmits an alarm to the first node. The first node, upon receipt of the alarm, transmits a switching command cell to switch the first Virtual Path Identifier (VPI) conversion table to the second VPI conversion table. The second VPI conversion table is programmed in advance to correspond to every failure pattern, and converts the path of the input cell in accordance with the data stored within the switched second VPI conversion table and sets up an alternate virtual path by which the information cells may be transmitted. This reference does not discuss the utilization of bandwidth.
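The table switch-over described for this reference can be pictured with a small sketch. The Python example below is a simplified illustration under stated assumptions, not the patented mechanism: it assumes a conversion table maps an (input port, incoming VPI) pair to an (output port, outgoing VPI) pair, and that one alternate table is stored per named failure pattern. The class `SwitchNode`, its methods, and the failure pattern label are hypothetical.

```python
from typing import Dict, Tuple

# (input port, incoming VPI) -> (output port, outgoing VPI)
VpiTable = Dict[Tuple[int, int], Tuple[int, int]]


class SwitchNode:
    """Node holding a working VPI table and pre-programmed alternates."""

    def __init__(self, working: VpiTable, alternates: Dict[str, VpiTable]):
        self._alternates = alternates  # failure pattern -> alternate table
        self._active = working         # table currently used for forwarding

    def on_switch_command(self, failure_pattern: str) -> None:
        """Activate the table programmed in advance for this failure."""
        if failure_pattern not in self._alternates:
            raise KeyError(f"no alternate table for {failure_pattern!r}")
        self._active = self._alternates[failure_pattern]

    def forward(self, in_port: int, vpi: int) -> Tuple[int, int]:
        """Look up the output port and outgoing VPI for an incoming cell."""
        return self._active[(in_port, vpi)]


if __name__ == "__main__":
    node = SwitchNode(
        working={(1, 10): (2, 20)},
        alternates={"working-route-down": {(1, 10): (3, 30)}},
    )
    print(node.forward(1, 10))   # uses the working table
    node.on_switch_command("working-route-down")
    print(node.forward(1, 10))   # uses the pre-programmed alternate table
```

Because the alternate tables are prepared before any failure occurs, the switch-over in this sketch is a single table swap. It contains no step that accounts for the bandwidth available on the alternate path.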
Hemmandy et al., U.S. Pat. No. 5,398,236 discloses an inter-node communications link failure recovery system for ATM nodes in which connections are quickly switched from a faulty link to one or more existing links. The Network Management System, which includes a CPU, controls the overall operation of the system via program commands to the nodes. When a fault detector at each node detects a failure in a link between a pair of nodes, an alarm signal is sent to the Network Management System. Within a node, alternate connection routes are predetermined using header translator tables for every connection originating at or terminating at a circuit connected to a link of interest. Because each interface card has an incoming header translator table and an outgoing header translator table, an ATM switch may serve to route incoming cells to an outgoing link in accordance with the routing information contained in the cell header.
Weissmann et al., U.S. Pat. No. 5,333,130 discloses a drop and insert multiplexer network that is self healing in case of a break or failure. The network comprises two end stations, known as field nodes, connected by two lines and a chain of intermediate time divisible multiplexer stations. A central node is also provided. In the case of a failure within the network, the field nodes that detected the failure receive a message from the central node. Upon receiving the message, the field nodes and the interconnecting links do not form a new connection at the aggregate level but instead the channel interfaces change the direction of the connection as necessary to recover the full operation of the network.
Nardin et al., U.S. Pat. No. 5,317,562 discloses a method and apparatus for initially routing and rerouting a multiplicity of connections to a slave node based upon a search which determines and selects the best route with regard to available bandwidth, loading considerations and maximum allowable delays.
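A route search constrained by available bandwidth and maximum allowable delay can be sketched as follows. This Python example is a generic illustration, not the method of the cited reference: it assumes a shortest-path (Dijkstra) search on link delay that skips links lacking the required bandwidth, and it does not model loading considerations. The function `select_route`, the graph layout, and all values are hypothetical.

```python
import heapq
from typing import Dict, List, Optional, Tuple

# node -> list of (neighbour, available bandwidth, link delay)
Graph = Dict[str, List[Tuple[str, float, float]]]


def select_route(graph: Graph, src: str, dst: str,
                 need_bw: float, max_delay: float) -> Optional[List[str]]:
    """Return the lowest-delay route meeting both constraints, or None."""
    best = {src: 0.0}            # lowest known delay to each node
    prev: Dict[str, str] = {}    # predecessor on the best route
    heap = [(0.0, src)]
    while heap:
        delay, node = heapq.heappop(heap)
        if delay > best.get(node, float("inf")):
            continue             # stale heap entry
        if node == dst:
            break
        for nbr, bw, link_delay in graph.get(node, []):
            if bw < need_bw:
                continue         # link cannot carry the connection
            cand = delay + link_delay
            if cand <= max_delay and cand < best.get(nbr, float("inf")):
                best[nbr] = cand
                prev[nbr] = node
                heapq.heappush(heap, (cand, nbr))
    if dst not in best:
        return None
    route = [dst]
    while route[-1] != src:
        route.append(prev[route[-1]])
    route.reverse()
    return route


if __name__ == "__main__":
    net: Graph = {
        "A": [("B", 10.0, 5.0), ("C", 100.0, 10.0)],
        "B": [("D", 10.0, 5.0)],
        "C": [("D", 100.0, 10.0)],
        "D": [],
    }
    print(select_route(net, "A", "D", need_bw=50.0, max_delay=30.0))
```

In the sample network the faster route through B is rejected for a connection needing 50 units of bandwidth, so the search falls back to the slower route through C.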
Uriu et al., U.S. Pat. No. 5,301,184 discloses a control system for switching between the active system and the standby system of a duplicated structure in an ATM exchange. The structures of the switch units are identical and each contains its own self routing module. Each self routing module has the switching function of directing each ATM cell to one of a plurality of outputs depending upon the route indication information that is defined for each of the switching stages.
Spencer et al., U.S. Pat. No. 5,278,977 discloses a distributed, multi-node system for communicating financial transactions between a plurality of distributed and dispersed point of service terminals and one or more central computers. Inherent testing and correction delays are overcome by providing means for initiating and running loop tests. The system analyzes the loop test to see if there is in fact a failure. Should a failure be detected, the system reconfigures itself based on pre-programmed information to work around the failure.
Omuro et al., U.S. Pat. No. 5,241,534 discloses a change-back system for a multi node ATM network in which a rerouting path is set to replace an original path when a fault is generated in the original path within the network. The change-back system uses three separate and task dedicated circuits to respectively detect a fault, reroute the virtual path, and change the path from the rerouting path to the original path after the fault has been corrected. To effect a rerouting without losing cells, the cell reception times of all cells are recorded within the header of the cell. Using this information, the third circuit calculates "guard time." The guard time is the difference between the first and second reception times and is used to delay the transmission of a cell from the alternate path to the original path.
Sakauchi, U.S. Pat. No. 5,239,537 discloses a broadband integrated services digital network system comprising a plurality of switching nodes that are interconnected by transmission lines having communication links and service links. Each switching node comprises an ATM self-routing network for routing cells from the inputs and outputs of network transmission links according to a virtual path identifier contained in the cells. A virtual path memory is provided for storing data indicating link-to-link connections associated with normal virtual paths and data indicating link-to-link connections associated with alternate virtual paths.
Nishimura et al., U.S. Pat. No. 5,235,599 discloses a self healing network with distributed failure restoration capability. This network comprises a first and second node and a plurality of intermediate nodes therebetween. In response to a failure in a channel or transmission line which terminates at the first node, the first node transmits as many specialized control packets as there are adjacent transmission lines. Once received, the third nodes broadcast copies of the received control packet to each adjacent node. Shortly thereafter, the second node transmits a specialized return packet to a given third node. In response, the given third node determines whether there is a spare channel or transmission line to an adjacent node located on a route leading to the first node. If such an adjacent node is found, the third node transmits a positive acknowledgment and transmits the received return packet to the adjacent node. Upon receiving this packet, the adjacent node becomes part of the alternate virtual path by which the information is rerouted around the failure.
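The flooding style of restoration described for this reference can be approximated with a breadth-first search. The Python sketch below is a simplification made for illustration and is not the patented procedure: it checks for a spare channel while the control packet is flooded rather than when the return packet travels back, and it records only the first copy of the control packet to reach each node. The function `flood_restore` and the data layout are hypothetical.

```python
from collections import deque
from typing import Dict, FrozenSet, List, Optional


def flood_restore(neighbours: Dict[str, List[str]],
                  spare: Dict[FrozenSet[str], int],
                  first: str, second: str) -> Optional[List[str]]:
    """Flood from `first`; return an alternate route to `second` or None.

    `spare` maps an undirected link (a frozenset of its two end nodes)
    to the number of spare channels on that link.
    """
    came_from: Dict[str, Optional[str]] = {first: None}
    queue = deque([first])
    while queue:
        node = queue.popleft()
        for nbr in neighbours.get(node, []):
            if nbr in came_from:
                continue  # a copy of the control packet already arrived
            if spare.get(frozenset((node, nbr)), 0) < 1:
                continue  # no spare channel on this transmission line
            came_from[nbr] = node
            queue.append(nbr)
    if second not in came_from:
        return None
    # The return packet retraces the recorded hops back to the first node.
    route = [second]
    while came_from[route[-1]] is not None:
        route.append(came_from[route[-1]])
    route.reverse()
    return route


if __name__ == "__main__":
    topo = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
    spare_channels = {
        frozenset(("A", "B")): 0,
        frozenset(("A", "C")): 1,
        frozenset(("C", "D")): 2,
        frozenset(("B", "D")): 1,
    }
    print(flood_restore(topo, spare_channels, "A", "D"))
```

Note that the search starts only after a failure has been detected, so in this sketch traffic on the failed channel is already interrupted by the time an alternate route is found.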
Howes, U.S. Pat. No. 5,113,398 discloses a self-healing data network and network node controller. Data transmission of data cells forms a message permitting self-clocking operation of each node. Elastic buffering is implemented to allow receipt of messages without regard to the timing considerations of the phase that is created by the asynchronous operation of each node relative to other nodes. The nodes of this network are able to independently detect faults regardless of the operating status of the other, surrounding nodes.
Fite, Jr., U.S. Pat. No. 5,016,243 discloses an arrangement of transmission facilities forming virtual circuits for transmitting packets of information in a network. Faults are detected in transmission paths associated with a network node and an individual fault indication message is generated for each network facility that has at least one virtual circuit affected by the fault. To effect automatic fault recovery, any node receiving a fault indication message determines which virtual circuits identified in the message are terminated in the node and which virtual paths pass through the node. The virtual circuits which are terminated in the node are switched to alternate virtual circuits.
None of these previous efforts, taken either alone or in combination, teaches or suggests all of the elements of the present invention. Particularly, none of the previous efforts takes preventative action by optimizing bandwidth, establishing primary and secondary paths and monitoring the connections through management actions to prevent resource failure from disrupting data transmission. Rather, these previous efforts react to resource failures by taking remedial actions after data transmission has been disrupted by resource failures.