A primary goal for many businesses today is the delivery of increased availability of computing devices and their associated functions, improved network services, and dependable redundancy capabilities in the event of hardware and software failures. To address these goals, software that performs one or more business functions is frequently designed as a multi-tier software product and is often deployed in distributed locations. Multi-tier software products, as discussed herein, include software products that separate essential functionality (that is, functionality crucial for reliable operation of the software product) into separate components, where these components often operate discretely and often on different hardware. For example, voluminous data access and storage may be delegated to a Relational Database Management System (hereinafter, “RDMS”), which is often better equipped to handle complex data operations.
In addition, with the advent of fast networks and the global Internet, multi-tier software products are often deployed in distributed locations. For example, one solution that achieves the goals of increased availability, improved network services and dependable redundancy capabilities is a cluster of computing devices, or a “server cluster”. FIG. 1 illustrates a geographically dispersed cluster arrangement, also known as a geospan cluster or, more simply, a geocluster. Geoclusters may span a distance ranging from a few hundred to a few thousand kilometers. Each server in the cluster is termed a “node.” A geographically dispersed cluster is a cluster that may have the following attributes: multiple storage arrays, at least one deployed at each site; nodes connected to storage in such a way that, in the event of a failure of a site or the communication links between sites, the nodes on a given site can access the storage on that site; and host-based software that provides a way to mirror or replicate data between the sites so that each site has a copy of the data.
In the example of FIG. 1, nodes 110 and 120 are located at a first site and connected to each other via a network 150. Nodes 130 and 140 are located at a second site and are likewise locally connected via a network 160. The two sites are geographically dispersed. For example, the first site may be located in the Los Angeles area and the second in the New York area. The nodes and storage of the two sites are further coupled together by an appropriate network, schematically illustrated at reference numeral 170. Typically, the private and public network connections between cluster nodes must appear as a single, non-routed LAN. It is necessary when implementing geoclusters, therefore, to use virtual network technologies (e.g., VLANs) to ensure that all nodes in the cluster appear on the same IP subnets.
Continuing the example of FIG. 1, nodes 110 and 120 are connected to a storage controller array 180, and nodes 130 and 140 are connected to a second storage controller array 190. The storage arrays communicate with each other and present a single view of the disks spanning both arrays. Disks 182, 184, 192, and 194 are thus combined into a single logical device. Individual data stores, such as disks 182 and 192, are mirrored across the cluster, as indicated by cloud 172. Likewise, disks 184 and 194 are mirrored across the cluster, as indicated by cloud 174. The cluster may thus fail over between any of the nodes 110, 120, 130 and 140, and any of the data stores 182, 184, 192, and 194. As a consequence of the opaqueness of the geographic distribution of the computing devices in the cluster, the cluster illustrated in FIG. 1 is unaware of the geographic distance between its nodes, and is implemented at the network and storage levels within the infrastructure architecture.
To enforce the opaqueness of the geographic distribution of the computing devices and to facilitate communication between the different components of a multi-tier application with distributed resources (i.e., operating within a server cluster), a type of software technology described as “middleware” is frequently used to assist software architects and developers in building multi-tier software products. Middleware is computer software often used to connect software components or applications residing on different physical machines. The software consists of a set of enabling services that allow multiple processes running on one or more machines to interact across a network. Middleware may include, but is not limited to, web servers, application servers, and similar tools that support application development and delivery. Middleware is especially integral to modern information technology based on, for example, XML, SOAP, Web services, and service-oriented architecture. The technology evolved to provide for interoperability in support of the move to coherent distributed architectures, which are used most often to support and simplify complex, distributed applications.
Consequently, middleware sits “in the middle” of a multi-tier software product and is between the different components of the software working on different operating systems. It is similar to the middle layer of a multi-tier single system architecture, except that it is stretched across multiple systems or applications. Examples include database systems, telecommunications software, transaction monitors, and messaging-and-queuing software.
Due to the geographic distribution of various computing devices within a cluster, propagation delay, even for signals travelling at the speed of light, can affect the stability of the cluster. Theoretically, a packet travelling at the speed of light in the most direct manner possible between two locations, e.g., Los Angeles and New York (approximately 4000 km), through a single dedicated fibre optic channel will experience a 13.3 millisecond propagation delay in each direction and a round-trip propagation delay of at least 26.6 milliseconds. This theoretical minimum, however, is unachievable due to the presence of multiple switches between such locations, each of which introduces substantial additional delay. In practice, typical latencies in a geocluster implementation range from 4 milliseconds for a 100 kilometer separation to 150 milliseconds for a 3700 kilometer separation.
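By way of illustration only, the theoretical propagation delay figures quoted above may be reproduced with a short calculation. The following Python sketch is not part of any described system; the function names `propagation_delay_ms` and `round_trip_delay_ms` are hypothetical, and the sketch assumes a signal travelling at the speed of light in a vacuum over a direct path with no switching delay.

```python
# Speed of light in a vacuum, in kilometers per second (physical constant).
SPEED_OF_LIGHT_KM_PER_S = 299_792.458


def propagation_delay_ms(
    distance_km: float,
    speed_km_per_s: float = SPEED_OF_LIGHT_KM_PER_S,
) -> float:
    """Return the one-way propagation delay in milliseconds.

    Hypothetical helper: assumes a direct path and ignores any delay
    added by switches or other intermediate equipment.
    """
    if distance_km < 0 or speed_km_per_s <= 0:
        raise ValueError("distance must be non-negative and speed positive")
    return distance_km / speed_km_per_s * 1000.0


def round_trip_delay_ms(
    distance_km: float,
    speed_km_per_s: float = SPEED_OF_LIGHT_KM_PER_S,
) -> float:
    """Return the round-trip propagation delay in milliseconds."""
    return 2.0 * propagation_delay_ms(distance_km, speed_km_per_s)


if __name__ == "__main__":
    # Approximate Los Angeles to New York separation used in the text.
    distance_km = 4000.0
    print(f"one-way:    {propagation_delay_ms(distance_km):.1f} ms")
    print(f"round trip: {round_trip_delay_ms(distance_km):.1f} ms")
```

For a 4000 km separation, this yields roughly 13.3 milliseconds in each direction and just under 26.7 milliseconds round trip, in line with the theoretical minimum discussed above. Passing a lower value for `speed_km_per_s` would model a slower medium, which only lengthens the delay.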
It is therefore necessary for developers of multi-tier software products, which often reside on a server cluster, to ensure that the latency of various operations, including input/output (“I/O”) storage operations, is within the bounds required to support applications. In other words, it is necessary to be able to verify that the time to accomplish a certain operation between geographically distant servers, when added to the communication time to propagate a response, does not exceed a given latency threshold, such as 500 milliseconds.
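The verification described above amounts to a simple comparison of a summed latency against a threshold. A minimal Python sketch follows, under stated assumptions: the function names `total_latency_ms` and `within_latency_threshold` are hypothetical, the operation time and the response propagation time are assumed to have been measured separately, and the 500 millisecond default is merely the example threshold given above.

```python
def total_latency_ms(operation_ms: float, response_propagation_ms: float) -> float:
    """Sum the time to perform an operation and the time to propagate its response."""
    return operation_ms + response_propagation_ms


def within_latency_threshold(
    operation_ms: float,
    response_propagation_ms: float,
    threshold_ms: float = 500.0,  # example threshold from the text, not a fixed requirement
) -> bool:
    """Return True if the combined latency does not exceed the threshold."""
    return total_latency_ms(operation_ms, response_propagation_ms) <= threshold_ms


if __name__ == "__main__":
    # Hypothetical measurements: a 350 ms operation with a 150 ms response delay.
    print(within_latency_threshold(350.0, 150.0))
    # The same response delay with a slower, 400 ms operation.
    print(within_latency_threshold(400.0, 150.0))
```

In this sketch, a combined latency exactly equal to the threshold is treated as acceptable, since the requirement stated above is only that the threshold not be exceeded.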
It is further necessary for those who implement multi-tier software products to test a server cluster in a single location before physically deploying the software product, potentially across vast geographical distances. By testing the configuration in a single location prior to dispersing the nodes across different locations, the multi-tier software product team may be able to more efficiently identify and resolve system problems than if such problems were first identified in different (and physically distant) locations. This is because the expertise and resources to identify and resolve such problems can be concentrated in a single location. Once the configuration has been proven to work in a single location, the components of the multi-tier software product may then be separated.
The scenario illustrated in FIG. 1, however, inevitably leads to some degree of latency on network traffic between the two locations, and the amount of latency will affect the performance of the software products. There may also be constraints on the available bandwidth between the different locations, which could affect performance. Few testing departments (and the associated testing personnel) are able to maintain this kind of distributed environment. Instead, they may often conduct their testing on a single, local machine. This can make it difficult to determine how the products will perform with the amount of latency seen by customers.