1. Field
Embodiments of the invention relate to the field of computer networking; and more specifically, to upgrading network elements utilizing metadata state translation.
2. Background
A network element in a service provider's network typically handles high volumes of data traffic (“traffic”) from users accessing several different services and/or communicating with other users. For example, a network element can handle services for up to tens of thousands of users. An interruption in the operation of the network element can cause a disruption of service to these tens of thousands of users.
FIG. 1 illustrates one embodiment of a service provider network 100 with a network element forwarding traffic between end stations and content servers through a network. While in one embodiment, the network element is a router, in alternate embodiments, the network element can be other networking equipment known in the art (switch, hub, firewall, server, etc.). Furthermore, while in one embodiment the network element forwards traffic, in alternate embodiments, the network element can perform the same and/or different traffic processing (switch traffic, shape traffic, apply access controls, apply firewall policies, provide file storage or database access, serve web pages, etc.). In FIG. 1, network 100 comprises end stations 102A-C, network element 104, core network 106, and content servers 108A-B. End stations 102A-C couple to network element 104, and network element 104 couples to content servers 108A-B through core network 106. While in one embodiment, end stations 102A-C are home personal computers, in alternate embodiments, the end stations can be the same or different types of machine (e.g., business computer, personal digital assistant, cell phone, game console, handheld game system, laptop computer, etc.). End stations 102A-C can couple to network element 104 through any one of the means known in the art (e.g., Ethernet, wireless, digital subscriber line (DSL), cable modem, fiber, etc.). Network element 104 provides an entry point into core network 106 by forwarding traffic from end stations 102A-C to content servers 108A-B, from content servers 108A-B to end stations 102A-C, and traffic going between end stations 102A-C. While in one embodiment, network element 104 is an edge network element that forwards traffic between the edge network servicing end stations 102A-C and core network 106, in alternate embodiments, network element 104 can be a network element or switch positioned differently in the service provider's edge and/or core networks.
Core network 106 is the backbone network of the service provider and typically has high capacity to handle the high volume of traffic traveling through network 100. Content servers 108A-B serve content and/or control information for services offered to end stations 102A-C.
As network element 104 handles the traffic for this large number of users, network element 104 accumulates state information that controls the handling of the traffic. While in one embodiment the state information accumulated by network element 104 is a traffic forwarding table, in alternate embodiments, the state accumulated has the same and/or different information (configuration data, user session information, firewall information, access control lists, quality of service information, statistics, etc.). This state is typically run-time information that does not survive a reboot of network element 104.
Periodically, a network element receives a software upgrade to its services. Typically, a software upgrade requires a reboot of the network element so that the software upgrade can take effect. However, a reboot disrupts the service and wipes out the accumulated state, because the state does not survive a reboot. Even though a reboot of a network element can occur quickly, the rebuilding of the state typically takes longer. Rebuilding the state involves reconnecting subscribers, rebuilding forwarding tables, restoring subscriber session information, etc.
An improved software upgrade method, termed an “in-service” upgrade, is used when the network element has one or more redundant peers. A peer could be another instance of the same type of network element occupying an equivalent position in the network topology or a redundant component of the network element itself. For example, a network element, such as network element 104, that has two or more controller cards can utilize an in-service upgrade. An in-service upgrade involves first installing and initializing the new software on a backup or standby controller, synchronizing the network element state to the backup controller, switching control to the backup controller, and then driving the software upgrade and state restoration to the other network element components. In this algorithm, the backup controller becomes the active controller for the network element and the former active controller becomes the backup controller.
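The sequence of steps in the in-service upgrade can be illustrated with a minimal sketch. The `Controller` class, its fields, and the `in_service_upgrade` function are hypothetical names introduced only for illustration; the state is modeled as a plain dictionary, and the "other network element components" are reduced to the former active controller for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    """Hypothetical model of a controller card (illustration only)."""
    name: str
    software: str
    role: str  # "active" or "backup"
    state: dict = field(default_factory=dict)


def in_service_upgrade(active, backup, new_software):
    """Sketch of the in-service upgrade sequence described above.

    Returns the pair (new_active, new_backup).
    """
    # Step 1: install and initialize the new software on the backup controller.
    backup.software = new_software

    # Step 2: synchronize the network element state to the backup controller.
    # A plain copy is assumed here; it ignores any format differences
    # between the old and the new software.
    backup.state = dict(active.state)

    # Step 3: switch control to the backup controller.
    active.role, backup.role = "backup", "active"

    # Step 4: drive the upgrade to the remaining components. In this
    # simplified sketch, that is only the former active controller.
    active.software = new_software
    active.state = dict(backup.state)

    return backup, active


card_a = Controller("A", "v1", "active", {"route:10.0.0.0/8": "port1"})
card_b = Controller("B", "v1", "backup")
new_active, new_backup = in_service_upgrade(card_a, card_b, "v2")
```

After the call, card B is the active controller running the new software with the synchronized state, and card A has taken over the backup role.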
FIG. 2 is a block diagram of a network element 200 that includes redundant controller cards. In FIG. 2, backplane 206 couples to line cards 202A-N and controller cards 204A-B. While in one embodiment, controller cards 204A-B control the processing of the traffic by line cards 202A-N, in alternate embodiments, controller cards 204A-B perform the same and/or different functions (upgrade software, handle operator requests, collect statistics, etc.). Line cards 202A-N process and forward traffic according to the policies received from controller cards 204A-B. In an alternative embodiment, network element 200 can have one controller card or more than two controller cards.
FIG. 3 is a block diagram of an active controller card passing active state information to a backup controller card. An active controller card is the card that controls the functions of a network element. The backup controller card is a standby card that could take over control of the network element. For example, the backup controller card would take over controlling the network element in cases of active controller card failure, upgrade of the network element, etc. While in one embodiment, active state information is the state accumulated by network element 104 as described with reference to FIG. 1, in alternate embodiments, active state information is the same and/or different information used to control network element 104. In FIG. 3, active controller card 302A sends active state information 304 to backup controller card 302B. The in-service upgrade algorithm uses the active state information to perform the upgrade of the network element without a disruption of traffic processing.
A drawback of the in-service upgrade is that the upgrade needs to account for the differences between the old and the new software in the expected format and semantic content of the state. The in-service upgrade algorithm requires either that the new software contain explicit knowledge of the precise format and content of the state data synchronized from the controller running the older software, or that the state data be transported in a version-independent format such as tag-length-value. However, the values conveyed by a version-independent format are not defined in any fashion that supports the translation process. There would still need to be knowledge embedded in the software images about the relationship between the version-dependent internal format and the version-independent external format. Using the version-independent format for communication requires the active controller to convert the state data from its native form to an intermediate version-independent form and send that form to the backup controller, which must then convert it back to its own native form. Encoding such knowledge in the software by writing special-purpose software routines is time-consuming and prone to error.
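The round trip through a tag-length-value format can be sketched as follows. This is an illustrative example under stated assumptions, not a format defined by this document: the tag numbers, the 2-byte tag and 2-byte length header, the field names, and all function names are hypothetical. The sketch shows that the hand-written routines on both sides must still agree on what each tag means and how it maps to each software version's native layout.

```python
import struct

# Hypothetical tag numbers shared by the old and the new software.
TAG_SESSION_ID = 1
TAG_NEXT_HOP = 2


def encode_tlv(fields):
    """Encode (tag, value_bytes) pairs with a 2-byte tag and 2-byte length."""
    out = b""
    for tag, value in fields:
        out += struct.pack("!HH", tag, len(value)) + value
    return out


def decode_tlv(data):
    """Decode a TLV byte string back into (tag, value_bytes) pairs."""
    fields = []
    offset = 0
    while offset < len(data):
        tag, length = struct.unpack_from("!HH", data, offset)
        offset += 4
        fields.append((tag, data[offset:offset + length]))
        offset += length
    return fields


def old_native_to_tlv(state):
    """Active controller (old software): native form to TLV."""
    return encode_tlv([
        (TAG_SESSION_ID, struct.pack("!I", state["session_id"])),
        (TAG_NEXT_HOP, state["next_hop"].encode("ascii")),
    ])


def tlv_to_new_native(data):
    """Backup controller (new software): TLV to its own native form.

    The new software is assumed to have renamed both fields, so this
    special-purpose routine carries the knowledge of what each tag means.
    """
    native = {}
    for tag, value in decode_tlv(data):
        if tag == TAG_SESSION_ID:
            native["session"] = struct.unpack("!I", value)[0]
        elif tag == TAG_NEXT_HOP:
            native["next_hop_addr"] = value.decode("ascii")
    return native


wire = old_native_to_tlv({"session_id": 42, "next_hop": "10.0.0.1"})
restored = tlv_to_new_native(wire)
```

The TLV container itself is version-independent, but `old_native_to_tlv` and `tlv_to_new_native` are exactly the kind of special-purpose routines that must be written and maintained for every state element that changes between software versions.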