A computer, and in particular a server, may be used to make various types of electronic content available to requesting client devices. While using a single computer to send electronic content to client devices allows for a simple configuration, it also results in disadvantages. For example, a single computer may be limited in the number of client devices it can communicate with at any given time. Furthermore, a hardware or software failure in the computer may result in electronic content being unavailable to client devices. A failure may also result in valuable data being lost or destroyed. Thus, a single computer does not provide the scalability, performance, high-availability, and failure recovery mechanisms often required when providing electronic content to requesting client devices.
In an attempt to address some of these concerns, computers can be linked together to form a cluster. In a cluster environment, the group of computers works closely together and in many respects performs as a single computer, but without many of the limitations of a single computer. For example, a cluster can communicate with a larger number of client devices at any given time than a single computer can. In addition, because of redundancies between computers in the cluster, a hardware or software failure in a single computer in the cluster may not result in the loss or destruction of data. Even if data is lost or destroyed in the cluster, the loss or destruction is often minimized. Furthermore, if a single computer fails, other computers in the cluster may be able to provide various types of electronic content to requesting client devices. Thus, a cluster environment may provide the scalability, performance, high-availability, and failure recovery mechanisms needed when providing electronic content to requesting client devices.
Clustering computers, however, adds numerous layers of complexity compared with configuring a single computer to make various types of electronic content available to requesting client devices. For example, a cluster environment needs a mechanism that determines which of the computers in the cluster will respond to a particular request from a client device. Furthermore, each of the computers in the cluster needs access to up-to-date information so that the correct response to a particular request is provided to the client device.
Generally, there are two predominant architectures used for clustering computers: shared-disk and shared-nothing. In a shared-disk cluster, an array of disks, usually a Storage Area Network (SAN) or Network Attached Storage (NAS), stores all of the data associated with the cluster. Each computer, or node, in the cluster has access to all of the data and can request and store data in real time to and from the SAN or NAS. Because each node can update the data, a shared-disk cluster has a master-master architecture. If one node in the cluster fails, the other nodes can still handle requests from client devices and can communicate with the SAN or NAS. However, if the SAN or NAS fails or if communication between the SAN or NAS and the nodes is severed, then none of the nodes in the cluster may be able to access the data associated with the cluster. Thus, while a shared-disk cluster provides scalability and load-balancing, a shared-disk cluster may have a single point of failure in the SAN or NAS. Furthermore, because each computer in the cluster communicates with the same SAN or NAS, the scalability of the cluster may be limited by the number of requests the SAN or NAS can handle over a given period of time.
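The shared-disk behavior described above can be illustrated with a minimal sketch. The class names SharedStorage and Node, the StorageUnavailable exception, and the in-memory dictionary standing in for the SAN or NAS are all assumptions made for illustration; they do not correspond to any particular product.

```python
class StorageUnavailable(Exception):
    """Raised when the shared storage cannot be reached."""


class SharedStorage:
    """Hypothetical stand-in for the SAN or NAS that holds all cluster data."""

    def __init__(self):
        self.online = True
        self._blocks = {}

    def _check(self):
        if not self.online:
            raise StorageUnavailable("shared storage is unreachable")

    def put(self, key, value):
        self._check()
        self._blocks[key] = value

    def get(self, key):
        self._check()
        return self._blocks.get(key)


class Node:
    """A cluster node; every node reads and writes the same shared storage."""

    def __init__(self, name, storage):
        self.name = name
        self.storage = storage

    def handle_write(self, key, value):
        self.storage.put(key, value)

    def handle_read(self, key):
        return self.storage.get(key)


san = SharedStorage()
nodes = [Node(f"node-{i}", san) for i in range(3)]

# Master-master: any node may update the data, and every node sees the result.
nodes[0].handle_write("page", "v1")
nodes[1].handle_write("page", "v2")
seen_by_node_2 = nodes[2].handle_read("page")

# Losing one node leaves the remaining nodes able to serve requests.
survivors = nodes[1:]
after_node_failure = survivors[0].handle_read("page")

# Losing the shared storage blocks every remaining node at once.
san.online = False
blocked = []
for node in survivors:
    try:
        node.handle_read("page")
    except StorageUnavailable:
        blocked.append(node.name)
```

In this sketch the failure of node-0 is invisible to the other nodes, while taking the shared storage offline makes every surviving node fail, which is the single point of failure noted above.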
The other predominant cluster architecture is a shared-nothing architecture. In a traditional shared-nothing cluster, each node has sole ownership of the data on that node and does not share the data with any other node in the cluster. Data is typically divided across multiple nodes. For example, one node may contain data regarding users, a second node may contain data regarding orders, and a third node may contain data regarding products. Thus, data is partitioned across the nodes in the shared-nothing cluster, and each of these nodes is called a master node. When a request from a client device is received, a routing table determines which master node in the cluster has the data needed for the request and routes the request to that node. In order to provide scalability and higher reliability, the data on a master node is mirrored to one or more slave nodes. Only the master node can update the data associated with that master node, but any of the slave nodes associated with the master node can read the data.
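The routing and mirroring just described might be sketched as follows. The names Node, Partition, and Cluster are hypothetical, and two simplifications are assumed: every write is mirrored to the slave nodes synchronously, and reads rotate across the master node and its slave nodes in round-robin order.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Node:
    """One computer in the cluster, holding its own private copy of the data."""
    name: str
    data: dict = field(default_factory=dict)


class Partition:
    """A master node together with the slave nodes that mirror it."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = slaves
        # Assumption: reads rotate across the master and every slave.
        self._readers = cycle([master, *slaves])

    def write(self, key, value):
        # Only the master accepts the update; it is then copied to each slave.
        self.master.data[key] = value
        for slave in self.slaves:
            slave.data[key] = value

    def read(self, key):
        node = next(self._readers)
        return node.name, node.data.get(key)


class Cluster:
    """Routes each request to the partition that owns the requested data."""

    def __init__(self, routing_table):
        self.routing_table = routing_table

    def write(self, kind, key, value):
        self.routing_table[kind].write(key, value)

    def read(self, kind, key):
        return self.routing_table[kind].read(key)


cluster = Cluster({
    "users": Partition(Node("users-master"),
                       [Node("users-slave-1"), Node("users-slave-2")]),
    "orders": Partition(Node("orders-master"), [Node("orders-slave-1")]),
    "products": Partition(Node("products-master"), []),
})

cluster.write("users", 42, "alice")
first_reader, first_value = cluster.read("users", 42)
second_reader, second_value = cluster.read("users", 42)
```

Here the routing table is a plain dictionary keyed by the kind of data. A write to "users" touches only the users partition, and successive reads are answered by different nodes in that partition.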
Traditionally, installing a shared-nothing cluster required manually configuring each of the nodes in the cluster. For example, a node may need to be manually configured in order for the node to recognize that it is a master node. Other nodes associated with a particular master node may need to be manually configured to be slave nodes so that they can read, but cannot write, data associated with the master node. If a particular master node fails, then one of the slave nodes that was associated with the failed master node may need to be manually reconfigured so that it becomes the master node. Until the manual reconfiguration takes place, it may not be possible to write data associated with the failed master node.
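The consequence of manual failover can be illustrated with a short sketch. The Partition class, its promote method, and the WriteUnavailable exception are hypothetical names chosen for illustration, and a single dictionary stands in for the mirrored data held by the master node and its slave nodes.

```python
class WriteUnavailable(Exception):
    """Raised when a partition has no working master to accept a write."""


class Partition:
    def __init__(self, master, slaves):
        self.master = master          # name of the node allowed to write
        self.slaves = list(slaves)    # names of the read-only mirrors
        self.failed = set()           # nodes currently known to be down
        self.data = {}                # stands in for the mirrored data set

    def write(self, key, value):
        if self.master in self.failed:
            raise WriteUnavailable(f"master {self.master} is down")
        self.data[key] = value

    def read(self, key):
        # Any surviving slave can still serve reads.
        if not any(s not in self.failed for s in self.slaves):
            raise LookupError("no slave available")
        return self.data.get(key)

    def promote(self, slave):
        """The manual reconfiguration step: make a surviving slave the master."""
        if slave not in self.slaves or slave in self.failed:
            raise ValueError(f"{slave} cannot be promoted")
        self.slaves.remove(slave)
        self.master = slave


orders = Partition("node-a", ["node-b", "node-c"])
orders.write("order-1", "pending")
orders.failed.add("node-a")            # the master node fails

try:
    orders.write("order-2", "pending")
    write_blocked = False
except WriteUnavailable:
    write_blocked = True               # writes fail until someone intervenes

value_during_outage = orders.read("order-1")   # reads continue via the slaves
orders.promote("node-b")               # an operator reconfigures node-b
orders.write("order-2", "pending")     # writes are possible again
```

Between the failure of node-a and the call to promote, reads succeed but every write is rejected, which mirrors the window of write unavailability described above.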
Existing shared-disk and shared-nothing clusters address some aspects of scalability, performance, high-availability, and failure recovery mechanisms often required when providing electronic content to requesting client devices. However, each of the currently existing clusters has numerous disadvantages and suffers from various deficiencies. Systems and methods that address at least some of these disadvantages and deficiencies are needed.