1. Technical Field
The present disclosure generally relates to data storage management and, in particular, to data storage management within distributed storage systems.
2. Description of the Related Art
Large-scale storage systems (also referred to as “Big Data”) currently face a number of critical challenges. These challenges include (a) unchecked growth in data volumes, leading to storage cost overruns, (b) the immaturity and complexity of Big Data platforms, and (c) the need to quickly and efficiently obtain insights from all of the stored data. Storage costs are increasing for companies engaging in Big Data analytics initiatives. Even though the cost of storage hardware has been declining each year, these cost declines do not keep pace with the rate of data growth. Several approaches are currently used to address this storage space problem. For example, some companies choose to store all of their data on low-cost tape. Other companies use advanced data compression techniques so that more data can be stored in less space. Still other companies remove or “prune” old data and keep only the newer and more relevant data in order to manage space. However, these companies must still address challenges associated with the storage of documents across distributed storage systems and/or search clusters in which documents are stored on different search nodes and/or processing and storage systems.