The present invention relates generally to the field of data storage, and more particularly to reducing the probability of losing data while in storage.
A storage system is a collection of one or more servers that are interconnected by a network using a variety of connectivity protocols or media. The network connecting the storage system may be flat or hierarchical in design, and the servers may be physically located in the same computer rack or distributed among different racks at different locations. Each server may be connected to one or more storage devices, such as hard disk drives (HDDs), solid-state drives (SSDs), flash cards, or any other media that can be used for persistent storage of data.
Data reliability is crucial for distributed storage systems. Distributed storage systems typically use replication and erasure coding schemes to increase their resiliency to failures. Replication stores replicas (i.e., copies) of data across different failure domains. Erasure coding divides data into data chunks, computes parity chunks from them, and distributes the data and parity chunks across different failure domains. Based on the reliability protocols of a distributed storage system, any replica or erasure-coded chunk that is unavailable or corrupted may be recovered from the remaining replicas or chunks. The different failure domains can be defined by different storage devices, servers, racks, and even data centers. In a distributed storage system, all the components are connected by a network, and any one server can be accessed by any other server.
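By way of illustration only, the recovery property of erasure coding described above may be sketched as follows. The sketch assumes the simplest possible scheme, a single XOR parity chunk over k data chunks, which tolerates the loss of exactly one chunk; practical systems commonly use stronger codes, and this description does not specify a particular code. The function names `encode` and `recover` and the zero-padding convention are hypothetical and chosen for the example.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length chunks."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-size data chunks and append one XOR parity chunk."""
    chunk_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(chunk_len * k, b"\0")  # zero-pad the tail (assumed convention)
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(k)]
    parity = reduce(xor_bytes, chunks)
    return chunks + [parity]


def recover(chunks: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing chunk (marked None) by XOR-ing the survivors."""
    missing = [i for i, c in enumerate(chunks) if c is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one lost chunk")
    repaired = list(chunks)
    if missing:
        survivors = [c for c in chunks if c is not None]
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired


# Example: four data chunks plus one parity chunk, each of which would be
# placed in a different failure domain (device, server, rack, or data center).
stripe = encode(b"distributed storage", 4)
damaged = list(stripe)
damaged[2] = None  # simulate the loss of one failure domain
restored = recover(damaged)
```

In this sketch, losing any single one of the five chunks, including the parity chunk itself, leaves enough information to rebuild it, whereas losing two chunks at once exceeds what a single parity chunk can repair.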