Service providers (e.g., wireless, cellular, etc.) and device manufacturers are continually challenged to deliver value and convenience to consumers by, for example, providing compelling network services. These services generate vast amounts of data (structured and binary) that need to be managed, stored, searched, analyzed, etc. Over the last decade, internet services have accumulated data in the range of exabytes (10^18 bytes). Although most of this data is not structured, it must be stored, searched, and analyzed appropriately before any real-time information can be drawn from it to provide services to users.
In order to provide high availability of such a huge amount of data, most systems back up their entire dataset regularly and under specific conditions. In such systems, if adverse conditions occur, the whole dataset can be recovered (e.g., restored from the backup). Backing up the total dataset is the most conservative way of ensuring data security: in any event leading to data loss, the system can always recover at least to the last state prior to the loss. However, with large amounts of data, backing up exabytes or petabytes is no longer feasible from either a cost or a recovery perspective.