List of Related Applications
The following applications are related to the present disclosure and are incorporated by reference herein.
Application entitled "Data Mining Aggregator Architecture with Intelligent Selector" (Attorney Docket No. BHUBP001), filed on even date by inventor Roy P. D'Souza; application entitled "Data Mining with Dynamic Events" (Attorney Docket No. BHUBP002), filed on even date by inventor Roy P. D'Souza; and application entitled "Data Mining with Decoupled Policy From Business Application" (Attorney Docket No. BHUBP003), filed on even date by inventor Roy P. D'Souza.
The present invention relates to an improved computer architecture. More particularly, the present invention relates to techniques for improving the reliability and response time of a scalable computer system of the type employed in e-commerce applications through the Internet.
E-commerce, or electronic commerce through the Internet, places stringent requirements on the planning and implementation of the computer infrastructure that supports the service. As the e-commerce service is in its infancy, it is important for economic reasons to minimize the cost of the computing infrastructure employed to service the few initial users or early adopters. As the use of the service becomes widespread among many users, which in the e-commerce age could be in a matter of days or weeks, the initial computing infrastructure must grow correspondingly to offer reliable and fast service to users or risk losing users to competing services.
To facilitate scaling of computing capabilities to meet a potentially explosive growing demand while minimizing upfront costs, many scalable architectures have been proposed. In one approach, the processing load is borne by a single centrally located computer and as the processing load increases, that computer may be upgraded to have a more powerful processor or, in the case with parallel processors, be endowed with additional processors to handle a higher processing load.
However, there are limits to the level of processing power that can be provided by a single machine. This is typically due to limitations in the processing capability of the single processor or in the upper limit on the number of parallel processors that can be provisioned in the computer. Further, limitations in memory access, bus speed, I/O speed and/or the like also tend to place an upper limit on the ultimate processing capability this approach can offer. Even if the ultimate upper limit is not reached, there are economic disincentives to adopting this approach for e-commerce usage due to the fact that marginal increases in computing power for these high-end machines tend to come at great financial cost. For example, a two-fold increase in processing power of such a computer typically requires substantially more than a two-fold increase in cost.
Clustering represents another computer architecture that readily scales to adapt to changing processing loads. In clustering, multiple inexpensive and/or low power computers are clustered together to service the processing load. Typically, the individual computers are interconnected using some type of network connection, such as Ethernet. Each time a machine is connected to the cluster, it publishes its presence to the cluster to signal its ability to share the processing load. Thus, as the processing load increases or decreases, the number of computers in the cluster may be correspondingly increased or decreased to meet the need of the changing processing load.
To facilitate discussion, FIG. 1 illustrates a prior art computer architecture wherein the computers are clustered in various stages to service the processing needs of the stages. With reference to FIG. 1, there is shown a computer system 102, representing a typical prior art clustered computer system employed to service Internet-based transaction requests. Computer system 102, which is typically connected to a larger network such as the Internet or a portion thereof, includes a webserver stage 104, application server stage 106, and a data repository stage 108. As can be seen in FIG. 1, each stage is implemented by a group or cluster of servers.
In general, a user may access computer system 102 by typing in a URL (Uniform Resource Locator) and obtaining a page from a webserver of webserver stage 104. In the typical situation, the first few pages returned may include general introductory information as well as an authentication facility to allow the user to sign in. Once the user is properly authenticated (by entering user name and password, for example), a menu of contents and/or applications may then be served up to the user. If the user chooses an application, the request is serviced by one of the application servers in application server stage 106, which acts in concert with one or more databases in the data repository stage 108, to respond to the user's request.
Due to the use of clustering technology, however, many other intervening steps occur in between. Beginning with the user's access request 110 (by, for example, typing in the URL at the user's web browser), the request is forwarded to a webserver router 112, which arbitrates among the webservers 114(a)-114(e), to decide which of these webservers should service this user's request. As a threshold determination, webserver router 112 may ascertain whether the user had recently accessed the service through a particular webserver of webserver stage 104. If he did, there is usually data pertaining to this user that is cached at the webserver that last serviced him, and it may be more efficient to continue assigning this user to the webserver that serviced him earlier.
On the other hand, if it is determined that this user has not recently accessed the service or if there is no cached data pertaining to this user on any of the webservers, webserver router 112 may assign the user to one of webservers 114(a)-114(e). The decision of which webserver to assign is typically made based on the current load levels on the respective webservers, the information pertaining to which is periodically received by webserver router 112 from the webservers through path 116. Once the user is assigned one of the webservers, subsequent traffic may be directly transmitted between the user's terminal and the assigned webserver without going through the router.
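The routing decision described in the two preceding paragraphs can be sketched as follows. This is a minimal illustration only, not code from any prior art system: the class name `WebserverRouter`, its methods, the server labels, and the numeric load values are hypothetical, and load is assumed to be a single number per webserver reported over path 116.

```python
from typing import Dict, Optional


class WebserverRouter:
    """Hypothetical sketch: prefer the webserver that last served a user,
    otherwise pick the webserver with the lowest reported load."""

    def __init__(self) -> None:
        self.loads: Dict[str, float] = {}      # latest load reported by each webserver
        self.last_server: Dict[str, str] = {}  # user -> webserver that last served the user

    def report_load(self, server: str, load: float) -> None:
        # Periodic feedback from a webserver (path 116 in FIG. 1).
        self.loads[server] = load

    def assign(self, user: str) -> str:
        if not self.loads:
            raise RuntimeError("no webservers have reported load")
        previous: Optional[str] = self.last_server.get(user)
        if previous is not None and previous in self.loads:
            # Cached data for this user is assumed to still be on that webserver.
            return previous
        server = min(self.loads, key=self.loads.__getitem__)
        self.last_server[user] = server
        return server


router = WebserverRouter()
for name, load in [("114a", 0.70), ("114b", 0.20), ("114c", 0.55)]:
    router.report_load(name, load)

first = router.assign("alice")     # new user: least loaded webserver
router.report_load("114b", 0.95)   # 114b later becomes busy
second = router.assign("alice")    # returning user stays on the same webserver
third = router.assign("bob")       # new user: least loaded webserver at this point
```

In this sketch the returning user remains on the original webserver even after it becomes the most heavily loaded, which mirrors the preference for cached user data described above.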
After authentication, if the user subsequently indicates that he wishes to employ a particular application, the webserver assigned to him then accesses another router, which is shown in FIG. 1 as application server router 118. Like webserver router 112, application server router 118 picks among application servers 120(a)-120(d) of application server stage 106 based on the current load levels on the application servers. The information pertaining to the current load levels on the application servers is periodically received by application server router 118 through path 122 as shown. At any rate, one of application servers 120(a)-120(d) will be assigned to the user to service the user's request. As in the case with the webservers, once the user is assigned one of the application servers, subsequent traffic may be directly transmitted between the webserver that services the user and the assigned application server without going through the router that performed the assignment.
If the application employed by the user requires data from data repository stage 108, the application server may consult yet another router (shown in FIG. 1 as database router 130), which may pick the most suitable of database servers 132(a)-132(c) for serving up the data. Again, database router 130 has information pertaining to the level of load on each database server since it periodically receives feedback from the database servers (via path 134).
Since the processing load at each stage is shared by multiple computers or servers, scalability is achieved. Further, the overall cost is kept low since the system employs multiple low power computers to achieve a high processing capacity, and only brings new computers to the cluster if needed.
Although the computer cluster architecture of prior art FIG. 1 solves many problems associated with scaling, it is recognized that there are areas where improvements are needed. By way of example, improved reliability is one area where continuous improvement is desired. In the context of highly demanding applications such as e-commerce, it is important that the computer system that services the user's transaction requests operates without interruption at all times. This is because the Internet is a global network, and at any time, transaction requests may be sent by users and need to be serviced. It is also recognized that one of the more vulnerable times for computer system failure occurs during or shortly after software upgrades, i.e., when the version of the software programs running on the servers (such as those running on application servers 120a-120d) is changed or when new software packages are loaded.
In the prior art, software upgrades are typically performed on a system-wide basis, using a new software package that is believed to be compatible with the computer system being upgraded. To minimize any impact on service, the upgrade operation typically occurs at a time when usage is relatively low. During a software upgrade operation, the whole computer system is typically taken offline momentarily, the new software is then loaded onto the servers, and the whole computer system is then quickly brought back into service to allow the new software to handle the incoming transaction requests.
If the new software to be loaded had been tested extensively in advance for quality and compatibility, one can expect that the majority of the software upgrade operations could be accomplished with only minor and temporary inconvenience to the users. For some software upgrade operations, however, catastrophic crashes could and did occur. The catastrophic system-wide failures can occur despite the best quality assurance testing since modern software programs are complicated constructs, and their behavior when exposed for the first time to a computer and/or network that had other software, plug-ins, drivers, and the like already installed is not always predictable. In a critical application such as e-commerce, the consequence of such a system-wide failure can be extremely serious as it may result in lost sales, erode users' confidence, and lead to the loss of customers to competitors. With regard to maintaining reliability during and after software upgrades, an improved approach is clearly needed.
Even in day-to-day operation, reliability is a big concern since users in the e-commerce age expect continuous uninterrupted service and will not hesitate to switch to competing services if their expectation is not met. One way to improve reliability is to employ dedicated software/hardware to watch over the entire computer system in order to ensure that there exists a sufficiently high level of fault tolerance so that if there is failure in one of the servers, there remains adequate processing power to provide an acceptable level of service to customers, e.g., by handling their requests in an uninterrupted manner and without unduly long delays. If the fault tolerance level falls below some acceptable level in a cluster, the fault tolerance mechanism will alert the operator to permit the operator to bring the fault tolerance back up, e.g., by adding additional servers to the cluster. This situation typically occurs after one of the servers in the cluster fails and the number of redundant servers remaining is unacceptably low.
In prior art, fault tolerance is achieved at the server level, i.e., by maintaining a sufficiently large number of servers per cluster to ensure that if there is a failure in one of the servers, there still remains sufficient processing capability in the surviving servers to allow the computer system as a whole to continue handling the transaction requests. Furthermore, prior art fault tolerance solutions are typically offered on homogeneous clusters and are specifically tied to specific computers from specific vendors. With reference to FIG. 1, for example, the prior art technique of fault tolerance typically requires that all servers in a cluster (i.e., all servers serviced by a router such as servers 120a-120d of FIG. 1) be homogeneous.
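As a rough illustration of server-level fault tolerance in a homogeneous cluster, the check that triggers an operator alert might look like the sketch below. The function names, the `min_spare` threshold, and all numbers are hypothetical assumptions; the source does not define how the fault tolerance level is computed.

```python
import math


def spare_servers(live_servers: int, demand: float, capacity_per_server: float) -> int:
    """Number of servers that could fail while the remaining servers still
    carry the demand. Assumes identical servers (a homogeneous cluster)."""
    needed = math.ceil(demand / capacity_per_server)
    return live_servers - needed


def needs_operator_alert(live_servers: int, demand: float,
                         capacity_per_server: float, min_spare: int = 1) -> bool:
    # Alert when fewer than min_spare redundant servers remain.
    return spare_servers(live_servers, demand, capacity_per_server) < min_spare


# Four servers of 100 requests/s each serving 250 requests/s: one server is spare.
before_failure = needs_operator_alert(4, 250.0, 100.0)
# After one server fails, no redundancy remains, so the operator is alerted.
after_failure = needs_operator_alert(3, 250.0, 100.0)
```

The calculation relies on every server having the same capacity, which is one way to see why this style of fault tolerance is tied to homogeneous clusters.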
There are, however, disadvantages to the prior art approach to implementing fault tolerance. For many businesses, it is sometimes more efficient to employ pre-existing software programs and modules in servicing their customers' needs than to develop their own software programs. Furthermore, it is sometimes more efficient to aggregate different software modules from different vendors to offer a complete package of service to customers than to employ software modules from a single vendor since different vendors may offer different advantages. By picking and choosing among the modules offered by competing vendors, a business may be able to gain competitive advantages by offering a better aggregate service to their customers.
In these cases, the software modules that are employed, as well as the hardware platforms on which they are implemented, are often highly diverse. Since prior art techniques of fault tolerance require homogeneity of hardware in a cluster, the diverse mix of software and hardware of such businesses renders it difficult to implement fault tolerance. One possible solution is to implement a homogeneous cluster for each software module so that fault tolerance can be achieved with respect to that software module (e.g., by providing multiple redundant servers per software module). This solution is, however, practical only when the number of different sets of software modules employed is relatively small. If the number of different sets of modules employed is fairly large, the solution becomes extremely costly as there needs to be one cluster per set of software modules to implement the prior art technique of fault tolerance.
Another area that system engineers always strive to improve relates to reducing transaction request processing time. Because of scaling and the desire to implement fault tolerance, it is typically the case that there exist multiple copies of any given application program per cluster. With reference to FIG. 1, for example, there typically exist multiple copies of an application program, distributed among two or more of servers 120a-120d. Because there are multiple copies present in the cluster to service incoming transaction requests, it is important to appropriately distribute the processing requirements of the multiple users across the servers so that transaction requests may be more efficiently serviced, with no single server being overtaxed while others are idle.
If all servers of a cluster are homogeneous, the decision regarding which server in the cluster should service a new user can be made by simply examining the relative load levels among the servers that have the appropriate software to handle the incoming transaction request of that user, and by assigning the new user to the server that is least heavily loaded. By distributing the users among various servers according to the relative load levels experienced by the servers, the average processing time for transaction requests is, in theory, minimized. In fact, most modern routers have the capability to receive relative load level data for the servers they service, and can make decisions pertaining to user routing based on the relative load level data.
However, it has been found that when the servers of a cluster are heterogeneous and differ in their processing capabilities, such simple routing strategies sometimes do not provide users with the best possible processing time. This is because a more powerful server may appear slightly more heavily loaded yet may be able to process incoming transaction requests more rapidly than a less powerful server in the cluster that happens to be more lightly loaded. Yet, a simple routing strategy based on relative load levels among servers would have picked the more lightly loaded (and less powerful) server, with a concomitantly longer processing time for transaction requests that are so routed.
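The mismatch described above can be made concrete with a small sketch. The model here is an assumption introduced purely for illustration: each server has a hypothetical `capacity` in requests per second and a `load` expressed as the fraction of that capacity in use, and the remaining throughput is taken to be `capacity * (1 - load)`. Neither the model nor the names come from the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Server:
    name: str
    capacity: float  # requests per second when idle (hypothetical unit)
    load: float      # fraction of capacity in use, 0.0 <= load < 1.0

    def spare_rate(self) -> float:
        # Throughput still available under the assumed linear model.
        return self.capacity * (1.0 - self.load)


def pick_least_loaded(servers: List[Server]) -> Server:
    # Simple strategy: compare relative load levels only.
    return min(servers, key=lambda s: s.load)


def pick_most_spare(servers: List[Server]) -> Server:
    # Capability-aware alternative: compare remaining throughput.
    return max(servers, key=lambda s: s.spare_rate())


cluster = [
    Server("powerful", capacity=100.0, load=0.60),
    Server("modest", capacity=20.0, load=0.40),
]
by_load = pick_least_loaded(cluster)
by_spare = pick_most_spare(cluster)
```

Under these assumed numbers the load-only strategy selects the modest server, although the powerful server has more than three times the remaining throughput.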
Further, there may exist reasons for keeping a particular server relatively lightly loaded (e.g., due to the fact that the lightly loaded server is being stress-tested and not yet certified to handle a full load, or due to the fact that the lightly loaded server also implements another application program, which is of the type that is subject to sudden, rapidly fluctuating processing demands and therefore needs a large reserve processing capacity). For the heterogeneous cluster situation and other preferential routing situations, the prior art method of routing incoming transaction requests leaves a lot to be desired.
Other areas for improvement also exist in the prior art cluster architecture. By way of example, in a typical clustered computer system, some of the servers thereon may be underutilized while other servers are overloaded despite efforts to equitably distribute transaction requests among the servers of the cluster. This is due to the fact that not every server in the cluster may be provided with the same set of application programs. Accordingly, while some servers are severely stressed, other servers, which do not have thereon the application programs that are in heavy demand, may sit idle.
In the prior art, whenever the load level on a particular server of the cluster is unacceptably high, the relative load level information among the cluster triggers an alert. To reduce the load level, the response is typically to add additional servers to the cluster to increase the number of copies of the application program that is in heavy demand, thereby increasing the ability of the computer system as a whole to handle transaction requests that require the attention of that application program.
As can be appreciated by those skilled in the art, the addition of a server to a cluster is typically an expensive option and usually involves a substantial delay and investment in time since it requires the acquisition, installation, and configuration of new hardware in the existing cluster. Unfortunately, while the new server is acquired and/or installed, user responsiveness suffers as the overloaded servers struggle to keep up with incoming transaction requests. Moreover, such an approach to handling temporary increases in traffic makes inefficient use of the existing server processing resource of the cluster because at the same time that the new servers are added to handle the increased demand that is experienced by some servers of the cluster, other servers of the cluster may sit relatively idle. If this approach is taken, the number of servers required to handle peak demand for every application program implemented in the cluster may be disproportionately large relative to the average processing requirement placed on the cluster. This is because demands on different application programs may fluctuate at different times, and an application program that may be idle at one point in time may be heavily used at other times, and vice versa.
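A short worked example may help show why sizing each application program for its own peak can be disproportionate to the average requirement. All figures are hypothetical: two application programs whose peaks occur at different times, and servers that each handle `CAPACITY` requests per interval.

```python
import math

CAPACITY = 2  # requests per interval that one server can handle (hypothetical)

# Demand per interval for two application programs that peak at different times.
demand = {
    "app_a": [8, 2, 2],
    "app_b": [2, 8, 2],
}

# Servers needed when each application program is sized for its own peak.
peak_sized = sum(math.ceil(max(series) / CAPACITY) for series in demand.values())

# Servers needed to carry the average combined demand.
combined = [sum(values) for values in zip(*demand.values())]
average_sized = math.ceil((sum(combined) / len(combined)) / CAPACITY)
```

With these numbers, peak sizing calls for eight servers while the average combined demand corresponds to four, so half of the peak-sized capacity sits idle on average.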
Up to now, the discussion has revolved around reactive approaches (i.e., after-the-fact approaches) to ensuring that there is always sufficient processing capability to handle the transaction requests in an appropriate manner. In many cases, a reactive approach may not be sufficient to ensure that service disruption and/or delays associated with transaction request processing will be kept within acceptable parameters. By way of example, by the time it is discovered that a particular server is overloaded, it may be too late to begin the process of adding another server to share the processing load. This is because, as mentioned earlier, such a process is typically time-consuming and thus it may be some time before additional processing resources become available to the cluster. During that time, the servers that implement the software program in demand may be overloaded one-by-one and that overload may lead to a situation wherein none of the users' transaction requests are serviced in a timely manner. Thus, there are desired proactive approaches to load balancing that can ready the cluster for handling the increased processing load before it occurs.
In some areas of the world, outside influences, such as natural and manmade disasters, may pose a serious threat to the reliability of the e-commerce service. By way of example, some regions of the United States are exposed to seasonal storms or to earthquakes. As such it is sometimes desirable to implement the servers in each of the stages of the clustered computer system in different geographic locations. As one example, the application server stage 106 of FIG. 1 may be implemented by two clusters of servers, with one being located in San Francisco while the other is located in New York. When such remote implementation is employed, the presence of the redundant servers further complicates the earlier mentioned challenges regarding maintaining reliability during and after software upgrades, efficient routing of transaction requests, maintaining an acceptable fault tolerance level in a heterogeneous cluster, and handling increases in the number of transaction requests both reactively and prospectively.
In view of the foregoing, there are desired novel and improved computer architectures and techniques for increasing the reliability and reducing the response time of a clustered computer system.
The invention relates, in one embodiment, to techniques for maintaining an adequate level of fault tolerance for a software program implemented on computers of a cluster in a clustered computer system. In one embodiment, the invention includes the use of intelligent director agents that are coupled to the computers of the cluster and the computer-specific as well as the software module-specific information stored therein. These pieces of information permit the intelligent detection of a deficiency in fault tolerance, the intelligent selection of a computer capable of running another copy of the software program, the loading of another copy of the software program on the identified computer, and the registration of the identified computer for servicing transaction requests pertaining to the software program after another copy of the software program is loaded thereon. Techniques are described to permit both computers in a local cluster and computers in a remote cluster to serve the fault tolerance relief role.
In one embodiment, the invention relates to a method for maintaining a predefined acceptable fault tolerance level for a plurality of software modules implementing a software program running on a first plurality of computers coupled together in a cluster configuration in a first cluster in a clustered computer system. The first plurality of computers is coupled to a first intelligent director agent. The method includes tracking, using the first intelligent director agent, status of the software modules running on the first plurality of computers. The method also includes ascertaining a fault tolerance level associated with the software program, with the ascertaining being performed by examining the status of the software modules running on the first plurality of computers. If the fault tolerance level is below the predefined acceptable fault tolerance level, the method also includes searching for a first suitable computer among the first plurality of computers to load another module of the software program thereon. The first suitable computer represents a computer of the first plurality of computers that does not have a module of the software program running thereon. The first suitable computer is compatible to execute the another copy of the computer program. If the first suitable computer is available, the method further includes loading the another module of the software program on the first suitable computer, registering the first suitable computer as a computer capable of servicing transaction requests pertaining to the software program after the another module of the software program is loaded onto the first suitable computer, and routing the transaction requests pertaining to the software program to the first suitable computer after the registering.
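The sequence of steps in this embodiment (tracking, ascertaining, searching, loading, registering, routing) can be sketched as below. This is an illustrative sketch under stated assumptions, not the disclosed implementation: the names `Computer`, `DirectorAgent`, `maintain`, and `servers_for` are hypothetical, the fault tolerance level is assumed to be the count of running modules of the program, compatibility is reduced to a single platform tag, and loading is simulated by updating a set.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Computer:
    name: str
    platform: str                                   # hypothetical compatibility tag
    running: Set[str] = field(default_factory=set)  # programs with a module running here
    up: bool = True


class DirectorAgent:
    def __init__(self, computers: List[Computer],
                 platform_needed: Dict[str, str], min_level: int) -> None:
        self.computers = computers
        self.platform_needed = platform_needed  # program -> platform it requires
        self.min_level = min_level              # predefined acceptable level
        self.registered: Dict[str, List[str]] = {}
        for c in computers:
            for program in c.running:
                self.registered.setdefault(program, []).append(c.name)

    def fault_tolerance_level(self, program: str) -> int:
        # Assumed metric: number of modules of the program currently running.
        return sum(1 for c in self.computers if c.up and program in c.running)

    def find_suitable(self, program: str) -> Optional[Computer]:
        for c in self.computers:
            if (c.up and program not in c.running
                    and c.platform == self.platform_needed[program]):
                return c
        return None

    def maintain(self, program: str) -> Optional[str]:
        if self.fault_tolerance_level(program) >= self.min_level:
            return None
        target = self.find_suitable(program)
        if target is None:
            return None  # no suitable computer in this cluster
        target.running.add(program)  # stands in for loading another module
        self.registered.setdefault(program, []).append(target.name)
        return target.name

    def servers_for(self, program: str) -> List[str]:
        # Computers to which transaction requests for the program may be routed.
        by_name = {c.name: c for c in self.computers}
        return [n for n in self.registered.get(program, []) if by_name[n].up]


computers = [
    Computer("a", "linux", {"shop"}),
    Computer("b", "linux", {"shop"}),
    Computer("d", "solaris"),
    Computer("c", "linux"),
]
agent = DirectorAgent(computers, {"shop": "linux"}, min_level=2)
computers[1].up = False            # computer "b" fails
loaded_on = agent.maintain("shop")
```

In this run, computer "d" is skipped because its platform tag does not match, and computer "c" receives the additional module and is registered for routing.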
These and other advantages of the present invention will become apparent upon reading the following detailed descriptions and studying the various figures of the drawings.