At present, two forces drive the evolution of optical networks: the bandwidth demand created by the rapid growth of Internet Protocol (IP) services, and the new modes of bandwidth utilization enabled by wavelength division multiplexing technology. Because IP traffic is bursty and unpredictable, IP services require dynamic allocation of network bandwidth. A conventional static optical transmission network has difficulty satisfying this requirement, and the intelligent optical network has emerged as a result. The intelligent optical network introduces IP-based intelligent control technology directly into the optical network, thereby efficiently supporting dynamic establishment and removal of connections, allocating network resources on demand based on traffic engineering, and providing good network protection/recovery performance.
The intelligent optical network introduces a generalized multi-protocol label switching (GMPLS) control plane, thereby giving the network strong survivability in the event of failure, realizing dynamic request, release, and reconfiguration of bandwidth, simplifying network management, and providing new value-added services. The most serious challenges are the stability and security problems that arise when an IP-based GMPLS signaling protocol is applied in telecommunication and optical transmission networks. To protect services from interruption to the greatest extent possible, a failure occurring in the control plane should not affect or interrupt services that have already been established in the transmission plane. In practical applications, regardless of whether one control node or several consecutive control nodes fail, the network must be able to isolate and recover from the control plane failures. After one or several consecutive control nodes fail and are recovered, the signaling state on those control nodes for the services established before the failures should be recoverable.
Regarding the processing of node communication failures, Request for Comments (RFC) 3473 of the Internet Engineering Task Force (IETF) defines Resource Reservation Protocol-Traffic Engineering (RSVP-TE) procedures for recovering from the restart of a node in the control plane.
FIG. 1 is a flow chart of an existing method for processing node restart. As shown in FIG. 1, the method includes the following steps.
In Step 101, a node A cannot receive a HELLO message from a node B when the node B powers off.
The nodes A and B are two nodes in a GMPLS control plane. The nodes A and B may inform each other of the operating state of the control plane software by sending a HELLO message to each other when they are both in a normal state, and refresh control state information in the two nodes by periodically sending a refresh message to each other. The node B which powers off cannot send the HELLO message to the node A.
In Step 102, the node A starts a Restart_Timer and performs a self-refresh.
When a label switched path (LSP) passing through both the nodes A and B exists, the node A starts its own Restart_Timer after determining that it cannot receive the HELLO message from the node B. The node A then stops periodically sending the refresh message corresponding to the LSP to the node B, but performs a self-refresh process by keeping the LSP-related control state information. In other words, although the node A does not receive the periodic refresh message from the adjacent node B, it still keeps the control state information corresponding to the LSP while the Restart_Timer is running. If the node A has still not received the refresh message from the adjacent node when the Restart_Timer times out, the node A deletes the unrefreshed LSP.
More specifically, every node on a normally operating LSP receives a Path message from an upstream node and a RESV message from a downstream node. Every node establishes a path state block (PSB) and a reservation state block (RSB) for the LSP, which preserve the control state information carried in the Path message and the RESV message, respectively, such as label values, bandwidth values, and LSP routing information. According to the information in its own PSB, a node sends the Path message to its downstream adjacent node, and according to the information in its RSB, it sends the RESV message to its upstream adjacent node. Since the node which powers off can neither send the RESV message to its upstream adjacent node nor send the Path message to its downstream adjacent node, the RSB in the upstream adjacent node cannot be refreshed periodically. Therefore, the node A performs the self-refresh process on its own RSB relating to the node B.
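The self-refresh behavior of Step 102 can be illustrated with a minimal sketch. The class names, the field names, and the two durations (`restart_time`, `refresh_timeout`) are assumptions introduced only for illustration; RFC 3473 does not define them in this form, and the sketch tracks only the RSB of the upstream node.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class StateBlock:
    """Control state kept for one LSP (used here as an RSB)."""
    label: int
    bandwidth: int
    last_refresh: float  # time at which the last refresh message arrived


@dataclass
class UpstreamNode:
    restart_time: float     # hypothetical Restart_Timer duration
    refresh_timeout: float  # hypothetical lifetime of unrefreshed state
    rsb: Dict[str, StateBlock] = field(default_factory=dict)
    restart_started: Optional[float] = None

    def on_hello_lost(self, now: float) -> None:
        """Step 102: start the Restart_Timer when HELLO messages stop."""
        self.restart_started = now

    def expire(self, now: float) -> List[str]:
        """Delete stale RSBs, except while the Restart_Timer is running."""
        timer_running = (
            self.restart_started is not None
            and now - self.restart_started < self.restart_time
        )
        if timer_running:
            return []  # self-refresh: keep every RSB despite missing refreshes
        stale = [lsp for lsp, sb in self.rsb.items()
                 if now - sb.last_refresh > self.refresh_timeout]
        for lsp in stale:
            del self.rsb[lsp]
        return stale
```

In this sketch, an RSB that has not been refreshed survives for as long as the Restart_Timer runs and is removed by the first `expire` call after the timer has elapsed.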
In Step 103, the node A continually sends the HELLO message to the node B, and requests the node B to reply.
In Steps 104-105, the node B powers up and restarts, starts a Recovery_Timer, and sends the HELLO message to the node A to indicate that the node B has restarted.
The Recovery_Timer in the node B has the following functions. The node B requires its adjacent node to finish the recovery of the control state information of the LSPs passing through the nodes B and A before timeout of the Recovery_Timer. After the timeout of the Recovery_Timer, the node B deletes the LSPs which are not recovered.
The HELLO message usually includes fields such as a src-instance and a dst-instance. The src-instance carries a constant value that belongs to the node sending the HELLO message and remains unchanged while that node operates normally. The value is preserved when the node powers off, and after the node restarts, the preserved value is incremented by 1. The dst-instance carries the src-instance value included in the latest HELLO message received from the opposite node. If no HELLO message has been received from the opposite node, or if the restarted node sends the HELLO message for the first time, the value in this field is 0.
After the node powers off and restarts, the src-instance value in the HELLO message from the node is the value before the restart plus 1, and the dst-instance value is 0. When the node itself operates normally but a communication link between the nodes breaks down, the src-instance value in the HELLO message sent by the node is equal to the value before the communication link broke down, and the dst-instance value is 0. According to the combination of the src-instance value and the dst-instance value carried in the message, the node receiving the HELLO message determines whether the adjacent node has restarted or whether merely a communication link breakdown has occurred.
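The decision made by the receiving node can be sketched as follows. This is a simplified illustration of the rule as described above, not the complete RFC 3473 procedure; the function and enumeration names are hypothetical, and the sketch assumes that at least one earlier HELLO message from the neighbor was recorded in `last_src_instance`.

```python
from enum import Enum


class NeighborEvent(Enum):
    NORMAL = "normal"
    RESTARTED = "restarted"
    LINK_RECOVERED = "link_recovered"


def next_src_instance(preserved_value: int) -> int:
    """Value a node places in src-instance after it restarts."""
    return preserved_value + 1


def classify_hello(src_instance: int, dst_instance: int,
                   last_src_instance: int) -> NeighborEvent:
    """Classify a received HELLO against the last src-instance seen."""
    if dst_instance != 0:
        # The neighbor still remembers our instance: nothing was lost.
        return NeighborEvent.NORMAL
    if src_instance == last_src_instance:
        # Same neighbor instance, but it lost track of us: link breakdown.
        return NeighborEvent.LINK_RECOVERED
    # The instance value changed (preserved value plus 1): node restart.
    return NeighborEvent.RESTARTED
```

For example, if the node A last saw a src-instance of 7 from the node B and then receives a HELLO message with src-instance 8 and dst-instance 0, the sketch classifies the event as a restart of the node B.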
In Steps 106-107, the node A stops the Restart_Timer, stops self-refreshing the control state information relating to the LSPs passing through the nodes A and B, and sends a Path message having a Recovery_label to the node B.
After the node A determines, according to the HELLO message from the node B, that the node B has restarted, the node A begins the recovery operation on the LSPs passing through the nodes A and B by sending Path messages according to the control state information in its own PSB. Each LSP corresponds to one Path message.
In Steps 108-109, the node B returns an ACK message to the node A in reply to the Path message, which indicates that the node B has received the Path message from the node A. Furthermore, the node B performs the recovery operation on the control state information relating to the LSP, according to the received Path message.
Since the control state information in the PSB was lost when the node B powered off, the node B creates a corresponding PSB after receiving the Path message from the node A, and records in the PSB the upstream control state information contained in the Path message. In addition, if the LSP passing through the node B still has a downstream node, the node B sends the Path message to the downstream node, and the downstream node in turn sends a RESV message to the node B. After receiving the RESV message, the node B creates a corresponding RSB, and records the control state information in the RESV message. The establishment of the PSB and the RSB in the node B represents the end of recovery of the corresponding LSP.
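The recovery steps performed by the restarted node B can be sketched as a small state holder. The class, its methods, and the action strings are hypothetical names chosen for illustration, and the control state is represented as a plain dictionary. The sketch treats an LSP as recovered only when both the PSB and the RSB exist, as described above; the case of a node with no downstream neighbor is not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RestartedNode:
    psb: Dict[str, dict] = field(default_factory=dict)  # state from Path messages
    rsb: Dict[str, dict] = field(default_factory=dict)  # state from RESV messages

    def on_path(self, lsp_id: str, upstream_state: dict,
                has_downstream: bool) -> List[str]:
        """Steps 108-109: rebuild the PSB and list the messages to send."""
        self.psb[lsp_id] = dict(upstream_state)
        actions = ["ACK"]  # acknowledge the Path message to the upstream node
        if has_downstream:
            actions.append("PATH_DOWNSTREAM")  # forward Path to the next node
        return actions

    def on_resv(self, lsp_id: str, downstream_state: dict) -> None:
        """Rebuild the RSB from the RESV message of the downstream node."""
        self.rsb[lsp_id] = dict(downstream_state)

    def is_recovered(self, lsp_id: str) -> bool:
        """An LSP is recovered once both its PSB and its RSB exist."""
        return lsp_id in self.psb and lsp_id in self.rsb
```

Note that `is_recovered` stays false until a RESV message arrives from the downstream node, which is the dependency that matters when the downstream node has not yet restarted.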
In Step 110, the Recovery_Timer times out, and the node B deletes the LSPs that are not recovered.
After the Recovery_Timer in the node B times out, if the node B still has LSPs whose control state information has not been recovered, those LSPs are deleted.
This completes the flow for processing node restart.
The aforementioned flow is designed for the restart of a single node. When several consecutive nodes on one LSP restart, the LSP will be deleted if the aforementioned flow is adopted.
Specifically, assume that an LSP passes through the nodes A, B, C, and D in sequence, that the nodes B and C power off, that the upstream node B restarts first, and that the downstream node C needs a long time to restart. After the node B restarts, the node A determines, according to the HELLO message from the node B, that the node B has restarted, and sends a Path message to the node B. The node B recovers a corresponding PSB according to the Path message from the node A. However, since the node C has not restarted yet, the node B cannot receive a RESV message from the node C and cannot recover the RSB, and therefore cannot send the RESV message to the node A to refresh the RSB in the node A. Meanwhile, the node A has already stopped the self-refresh process upon receiving the HELLO message from the node B again. If the node A does not receive the RESV refresh message from the node B for a long time, the node A deletes its own corresponding RSB. The node A then sends a Path_Tear message to the node B to notify the node B to delete the local PSB, thereby causing the deletion of the LSP on the node B.
Similarly, if the nodes B and C on the LSP passing through the nodes A, B, C, and D in sequence power off, the downstream node C restarts first, and the upstream node B needs a long time to restart, the node C, after restarting, cannot receive the Path message from the node B. Therefore, the PSB on the node C cannot be recovered. The node D stops the self-refresh process on its PSB after receiving the HELLO message from the node C, and the PSB on the node D is then deleted upon timeout because the node D does not receive the Path message from the node C for a long time. Furthermore, after the timer corresponding to the RSB of the node D times out, the node D sends a Resv_Tear message to the node C to notify the node C to delete the local RSB, thereby causing the deletion of the LSP on the node C.
Therefore, the existing method for processing node restart cannot reliably recover the LSP when several consecutive nodes on the LSP suffer communication failure. As a result, failures of the control plane affect services of the transmission plane.
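The two failure cases above can be condensed into a deliberately simplified timing model. The function name, its parameters, and the single `state_timeout` value are assumptions made for illustration only. The model ignores message delays, the Restart_Timer, and the Recovery_Timer, and considers only how long a neighbor waits for a refresh after it has stopped self-refreshing.

```python
from typing import Optional


def teardown_cause(t_restart_b: float, t_restart_c: float,
                   state_timeout: float) -> Optional[str]:
    """Return which neighbor tears down the LSP A-B-C-D, or None if it survives.

    A neighbor stops self-refreshing as soon as the first of B and C
    restarts, but it can be refreshed only after the second one restarts.
    """
    gap = abs(t_restart_b - t_restart_c)
    if gap <= state_timeout:
        return None  # the second node restarts in time; refreshes resume
    if t_restart_b < t_restart_c:
        # Upstream node B restarts first: B gets no RESV from C to relay to A.
        return "A times out its RSB and sends Path_Tear to B"
    # Downstream node C restarts first: C gets no Path from B to relay to D.
    return "D times out its state and sends Resv_Tear to C"
```

Under these assumptions, the LSP is lost whenever the interval between the two restarts exceeds the time a neighbor tolerates without a refresh, regardless of which of the two nodes restarts first.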