A distributed store is a storage system in which data is stored on multiple machines (e.g., computers or other processing systems). The machines may include respective nodes among which multiple instances of the data may be stored to provide “high availability” of the data. For example, a distributed store may be a distributed cache, a distributed database (e.g., a distributed SQL database), or another suitable type of distributed storage system.
Data operations with respect to data in a distributed store are usually initiated at or initially directed to one instance of the data, which is referred to as the primary instance of the data. Examples of data operations include but are not limited to a read operation, a write operation, an eviction operation, and a notification operation. For example, an instance of data to which a read (or write) operation is initially directed is the primary instance of the data with respect to that read (or write) operation. In another example, an instance of data at which an eviction (or notification) operation is initiated is the primary instance of the data with respect to that eviction (or notification) operation. Instances of the data that are not the primary instance with respect to a data operation are referred to as secondary instances with respect to that data operation. Placement of the various instances of data among the nodes of the distributed store can sometimes result in the primary instance of the data and one or more of the secondary instances of the data being included in the same “scale unit” (a.k.a. unit of failure).
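The primary/secondary distinction and the co-location problem described above can be illustrated with a minimal Python sketch. The names used here (`Instance`, `classify`, `colocated_secondaries`, and the node and scale unit labels) are hypothetical and chosen only for illustration; they do not correspond to any particular distributed store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """One stored instance of a piece of data (hypothetical model)."""
    node: str        # node that holds this instance
    scale_unit: str  # scale unit value assigned to that node


def classify(instances, target_node):
    """Split instances into (primary, secondaries) for one data operation.

    The primary instance is the one on the node at which the operation is
    initiated or to which it is initially directed; all others are secondary.
    """
    primary = next(i for i in instances if i.node == target_node)
    secondaries = [i for i in instances if i is not primary]
    return primary, secondaries


def colocated_secondaries(primary, secondaries):
    """Return the secondary instances that share the primary's scale unit."""
    return [s for s in secondaries if s.scale_unit == primary.scale_unit]


# Three instances of the same data; two of them share a scale unit.
instances = [
    Instance("node-a", "rack-1"),
    Instance("node-b", "rack-1"),
    Instance("node-c", "rack-2"),
]

# A write initially directed to node-a makes that instance the primary.
primary, secondaries = classify(instances, "node-a")
shared = colocated_secondaries(primary, secondaries)
```

Because the primary instance is defined per operation, the same instance on `node-a` would be a secondary instance with respect to a read that is initially directed to `node-c`.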
A scale unit is an entity in an information technology (IT) infrastructure with respect to which data failures may be determined, upgrades may be performed, latency issues may be addressed, etc. A data failure may be a loss of an instance of data, an inability to access an instance of data, etc. A scale unit traditionally is defined at a machine, pod, or rack boundary by an administrator who manages the infrastructure. A pod is a physical structure on which machines may be stored. A rack is a grouping of pods within a data center, for example. Nodes that are included in the same scale unit traditionally are assigned a common scale unit value. Scale units may be defined (and respective values may be assigned) using a configuration file, an automated process, or other suitable technique. Conventional techniques for assigning scale unit values are relatively inflexible, and conventional data storing techniques may provide relatively little protection against loss of access to data even when multiple instances of the data are included in the distributed store. For example, if all instances of the data are included in a single entity within the IT infrastructure, a data failure with respect to that entity may result in loss of access to the data.
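The following Python sketch illustrates, under stated assumptions, how scale unit values might be assigned at a chosen boundary and why placing every instance in one scale unit risks loss of access. The `NODES` layout, the boundary names, and the helper functions are hypothetical; the containment order (machine within pod within rack) follows the definitions given above.

```python
def assign_scale_units(nodes, boundary):
    """Assign each node a scale unit value at the given boundary.

    `nodes` maps a node name to its location, and `boundary` is one of
    "machine", "pod", or "rack". Nodes that share the same location at
    that boundary receive a common scale unit value.
    """
    return {name: location[boundary] for name, location in nodes.items()}


def surviving_instances(instance_nodes, scale_units, failed_unit):
    """Return the nodes whose instances remain after a scale unit fails."""
    return [n for n in instance_nodes if scale_units[n] != failed_unit]


# Hypothetical infrastructure: three nodes on separate machines.
NODES = {
    "node-a": {"machine": "m1", "pod": "p1", "rack": "r1"},
    "node-b": {"machine": "m2", "pod": "p1", "rack": "r1"},
    "node-c": {"machine": "m3", "pod": "p2", "rack": "r1"},
}

by_machine = assign_scale_units(NODES, "machine")
by_pod = assign_scale_units(NODES, "pod")

# Both instances of the data are stored in pod p1.
replicas = ["node-a", "node-b"]

after_machine_failure = surviving_instances(replicas, by_machine, "m1")
after_pod_failure = surviving_instances(replicas, by_pod, "p1")
```

In this sketch, a failure of machine `m1` leaves the instance on `node-b` available, whereas a failure of pod `p1` leaves no instance available even though two instances of the data were stored.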