In recent years, software engineers have focused on building global-scale Internet applications. These Internet applications often require large-scale distributed systems as back ends. As shown in FIG. 1, large-scale distributed systems provide networked online storage and allow multiple computing devices to store, access, and share data in the online storage. Distributed systems may use a client/server architecture in which one or more central servers store data and provide data access to network clients. Data may be stored in multiple datacenters and in multiple datacenter shards which each contain multiple clusters.
FIG. 2 illustrates a block diagram of an exemplary distributed system 200 for providing data in a large-scale distributed system. The system 200 includes a plurality of user terminals 202 (e.g., 202-1 . . . 202-n), each of which includes one or more applications 204 (e.g., 204-1 . . . 204-n), such as an Internet browser. The user terminals 202 are connected to a server 206 and a plurality of computer clusters 210 (e.g., 210-1 . . . 210-m) through a network 208 such as the Internet, a local area network (LAN), a wide area network (WAN), a wireless network, or a combination of networks. The server 206 may include one or more load balancing engines 212, one or more memory devices 214, and one or more CPUs 216.
Each of the user terminals 202 may be a computer or similar device through which a user can submit requests to and receive results or services from the server 206. Examples of the user terminals 202 include, without limitation, desktop computers, notebook computers, tablets, mobile devices such as mobile phones, smartphones, personal digital assistants, set-top boxes, or any combination of such devices.
In order to support global-scale access to data, distributed systems should have high availability and low latency when providing data to users. Distributed systems may also require service resiliency, which is the ability of a distributed system to quickly react to a series of system failures or disruptions to data access and to smoothly maintain data availability at an acceptable level. In the case of failure or disruption, a system should be able to avoid cascading failure and maintain the availability of a service. A system should also be able to quickly rebalance the service loads across clusters so that the system is prepared for future failures.
Data replication is a widely used conventional technique in distributed computing for improving data reliability. Replication entails creating copies (or replicas) of data entities and storing the entity copies in multiple datacenters. By maintaining several copies of the same entity, an application can ensure that its back end can tolerate datacenter failures. If a datacenter is unavailable, replicated entities can be accessed from alternate datacenters.
A conventional coarse-grained replication scheme for better data reliability includes duplicating an entire database in a single replica set, which is usually determined prior to duplication. Coarse-grained replication schemes may handle service resiliency by providing data from replicas in other data clusters when clusters in datacenters fail. However, the conventional coarse-grained replication scheme may not be able to provide fast data access. In addition, due to legal requirements, not every data entity can be replicated everywhere.
To address these issues, an alternative replication scheme, referred to as fine-grained replication (FGR), has been used in global-scale applications. In the FGR scheme, an entire database is partitioned into a number of sub-databases, each of which has its own replica set. The partition criterion may vary depending on application logic. A common practice is to partition the database by individual service entity. FGR allows flexible replica placement in order to satisfy various performance, load, cost, legal, and other requirements for a single service.
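As a rough illustration of per-entity replica sets, the following minimal Python sketch partitions by individual entity and gives each entity its own replica set, restricted to clusters in regions where that entity may legally be stored. The cluster names, region labels, entity names, and the `replica_set` helper are all hypothetical, and the "first permitted clusters" rule is a naive stand-in for a real placement policy, which would also weigh performance, load, and cost.

```python
# Hypothetical cluster-to-region map; not taken from the system described above.
CLUSTER_REGION = {"us-1": "US", "us-2": "US", "eu-1": "EU", "eu-2": "EU"}


def replica_set(allowed_regions, count=2):
    """Return `count` clusters located in the regions permitted for one entity.

    Naive rule (an assumption for illustration): take the first permitted
    clusters in sorted order.
    """
    candidates = sorted(
        cluster for cluster, region in CLUSTER_REGION.items()
        if region in allowed_regions
    )
    if len(candidates) < count:
        raise ValueError("not enough permitted clusters for this entity")
    return candidates[:count]


# Each entity is its own partition and carries its own legal constraint.
entity_regions = {
    "alice": {"US"},
    "bob": {"EU"},
    "carol": {"US", "EU"},
}

# Fine-grained: one replica set per entity rather than one for the whole database.
placement = {name: replica_set(regions) for name, regions in entity_regions.items()}
```

Under a coarse-grained scheme, by contrast, all three entities would share one predetermined replica set, so an entity restricted to a single region could constrain placement for the entire database.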
In FGR configurations, there is a serving replica for every data entity stored in the system. A serving replica is the primary replica of a data entity providing service on the data entity in normal operation. A failover replica is the secondary replica of an entity providing failover service in case of primary replica failure. The roles of a data entity's replicas can be changed online with very low operation cost. In a case of cluster failure, entities with their serving replicas in the failed cluster will have the service load shifted to their corresponding failover replicas automatically. Data consistency among replicas of an entity is guaranteed by various distributed system algorithms. Some replicas in the replica set of an entity can temporarily fall behind due to reasons such as a datacenter being unreachable or unresponsive. An FGR-based application can tolerate temporary inconsistency between replicas of an entity.
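The load shift on cluster failure can be sketched as follows. This is a minimal Python illustration under simplifying assumptions: each entity has exactly one serving and one failover replica, load is counted only at the serving replica, and the `Entity`, `fail_cluster`, and `cluster_loads` names are hypothetical rather than part of any described system.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    """One data entity with a serving and a failover replica (hypothetical model)."""
    name: str
    serving: str   # cluster holding the primary (serving) replica
    failover: str  # cluster holding the secondary (failover) replica
    load: float    # service load attributed to this entity


def fail_cluster(entities, failed):
    """Swap replica roles for every entity whose serving replica is in `failed`."""
    for e in entities:
        if e.serving == failed:
            # Role change only: no data is moved or copied.
            e.serving, e.failover = e.failover, e.serving


def cluster_loads(entities):
    """Total service load per cluster, counted at each entity's serving replica."""
    loads = {}
    for e in entities:
        loads[e.serving] = loads.get(e.serving, 0.0) + e.load
    return loads


entities = [
    Entity("e1", serving="A", failover="B", load=3.0),
    Entity("e2", serving="A", failover="C", load=2.0),
    Entity("e3", serving="B", failover="C", load=1.0),
]

before = cluster_loads(entities)   # load per cluster in normal operation
fail_cluster(entities, "A")        # cluster A fails
after = cluster_loads(entities)    # load per cluster after failover
```

In this example, the load that cluster A carried is split between clusters B and C according to where each affected entity keeps its failover replica, which is why the choice of failover placement determines how evenly load spreads after a failure.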
Although FGR provides advantages in flexibility, creating a resilient FGR configuration scheme is extremely challenging because it entails: (1) prohibitively high computational complexity in solving for a replication configuration (including replica placement and service scheduling) that achieves service resiliency and (2) expensive and time-consuming execution of replication configuration solutions, which involves moving and copying data partitions.