A data center is a facility that houses computing systems for a particular business, industry, governmental entity, or other organization. Such computing systems may include, for example, one or more server farms that perform various functions for the organization. Examples of such functions include hosting web sites, storing information, and providing processing for computing applications, among others. Other computing systems may be housed in a data center for performing other functions.
Security of information and application processing associated with a data center may be critical to particular organizations. Various efforts have been made to enhance the security of data centers. For example, some data centers are provided with physical security such as housing the data center in an inconspicuous location, providing restricted access to the data center, providing the data center with environmental isolation and control, and providing electrical power supply redundancy to the data center. Another element of security that has been added to data center design is to provide an organization with more than one physical data center, e.g., providing multiple data centers at different locations.
Providing “redundant” or “backup” data centers may provide an organization with the ability to protect data center functionality against harmful factors that extend beyond the scope of the organization's control over a single data center. For example, a single data center may be vulnerable to physical failure, e.g., from terrorist activity, fire, earthquake, etc. A single data center may be vulnerable to electronic failure, e.g., from “hacker” activity such as viruses, broadcast storms, denial of service attacks, and the like. A single data center may be vulnerable to electric and/or telecommunications failure of such a magnitude that systems provided internal to the data center are unable to mitigate the failure. Other failures reducing or eliminating the functionality of a single data center are possible. In such instances, having additional data centers at separate geographic locations may provide the organization with the ability to maintain data center functionality after the loss of a single data center.
An organization may desire to provide “always-on” service from data centers such that a client using the functionality of the data centers perceives continuous service during a failover from one data center to another and during simultaneous operation of multiple active data centers. Some methods have been proposed to provide such “always-on” service to clients connecting via the Internet. For example, U.S. patent application Ser. No. 11/065,871, “Disaster Recovery for Active-Standby Data Center Using Route Health and BGP”; Ser. No. 11/066,955, “Application Based Active-Active Data Center Network Using Route Health Injection and IGP”; and Ser. No. 11/067,037, “Active-Active Data Center Using RHI, BGP, and IGP Anycast for Disaster Recovery and Load Distribution”, all to Naseh et al., describe the use of border gateway protocol (BGP) and advertisement of a block of IP addresses, e.g., 24.24.24.0/24, on a subnet basis for the respective data centers.
The above-mentioned efforts to enhance the security of data centers may themselves create issues. For example, a networking issue for organizations that maintain multiple active data centers is session persistence. If route maps change during a client session, e.g., due to changes in network usage causing changes in a shortest network path, traffic from one client for one session may be routed to more than one data center. For example, if two active data centers advertise the same block of IP addresses, a client may generally be routed via the shortest topological path, as determined using one of a number of routing metrics, to one of the data centers. However, the “shortest” path may change during the pendency of the session, e.g., as network traffic at various points throughout the network changes. In some circumstances, such changes could cause a route to a different data center to become “shorter” than the route initially taken by client traffic. This can be particularly problematic for lengthy client sessions, e.g., sessions associated with financial transactions performed over a network.
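By way of illustration only, the session persistence issue described above may be sketched as a small simulation. The sketch assumes a single scalar metric per route and lowest-metric selection; the data center names (dc_east, dc_west), the metric values, and the function names are hypothetical and are not drawn from any referenced application or routing protocol implementation.

```python
from dataclasses import dataclass


@dataclass
class Route:
    data_center: str  # hypothetical data center name
    metric: int       # assumed scalar cost; lower means "shorter"


def best_route(routes):
    """Select the lowest-metric route among those advertising the same block."""
    return min(routes, key=lambda r: r.metric)


def deliver_session(metric_timeline):
    """Return which data center receives each packet of one client session.

    metric_timeline holds one dict per packet, mapping a data center
    name to the route metric in effect when that packet is forwarded.
    """
    destinations = []
    for metrics in metric_timeline:
        routes = [Route(dc, m) for dc, m in metrics.items()]
        destinations.append(best_route(routes).data_center)
    return destinations


# Both data centers advertise the same address block. Congestion raises
# the metric toward dc_east before the third packet of the session.
timeline = [
    {"dc_east": 10, "dc_west": 20},
    {"dc_east": 10, "dc_west": 20},
    {"dc_east": 30, "dc_west": 20},
]
destinations = deliver_session(timeline)
session_split = len(set(destinations)) > 1
print(destinations, session_split)
```

In this sketch the first two packets of the session reach dc_east and the third reaches dc_west, so session state held only at dc_east would be unavailable for the remainder of the session.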
Route convergence is an example of a networking issue for organizations that maintain an active data center with a passive backup data center that may become active upon failover. When a network topology changes, e.g., due to a failure, some routers on the network may receive updated network information and use the updated information to recompute routes and/or rebuild routing tables. On a large-scale network, e.g., the Internet, route convergence can take a significant amount of time relative to the duration of some client sessions, possibly allowing a client to become aware of a network problem, e.g., by receiving a failure dialog on a network interface. A client may store domain name system (DNS) records locally, e.g., a cache of IP addresses corresponding to websites. Such DNS records may come with a particular time to live (TTL) that, if not expired, may prevent such DNS records from being refreshed, which may slow the route convergence process and/or allow the client to receive a failure dialog on a network interface.
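By way of illustration only, the effect of an unexpired TTL on a locally cached DNS record may be sketched as follows. The DnsCache class, the resolve callback, the hostname, the 300 second TTL, and the addresses (taken from ranges reserved for documentation) are all hypothetical; time is modeled as a plain integer number of seconds rather than a real clock.

```python
class DnsCache:
    """Hypothetical client-side cache mapping hostname to (address, expiry)."""

    def __init__(self):
        self._entries = {}

    def lookup(self, hostname, now, resolve):
        """Return a cached address while its TTL is unexpired, else re-resolve."""
        entry = self._entries.get(hostname)
        if entry is not None and now < entry[1]:
            return entry[0]  # TTL not expired: the record is not refreshed
        address, ttl = resolve(hostname)
        self._entries[hostname] = (address, now + ttl)
        return address


# Assumed authoritative data: hostname -> (address, TTL in seconds).
authoritative = {"www.example.com": ("192.0.2.10", 300)}


def resolve(hostname):
    """Stand-in for a query to an authoritative name server."""
    return authoritative[hostname]


cache = DnsCache()
first = cache.lookup("www.example.com", now=0, resolve=resolve)

# Failover at t=100: the name now points at the backup data center.
authoritative["www.example.com"] = ("198.51.100.10", 300)

during = cache.lookup("www.example.com", now=200, resolve=resolve)
after = cache.lookup("www.example.com", now=300, resolve=resolve)
print(first, during, after)
```

In this sketch the client continues to use the address of the failed data center at t=200 and obtains the backup address only once the cached record expires at t=300.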