It has become increasingly commonplace to use grids of multiple node devices to perform widely varied analyses of large data sets (e.g., what is commonly referred to as “big data”) of widely varied types. Such grids of node devices are often used to speed the performance of an analysis of a large data set by independently processing multiple partitions of the data set in parallel through the parallel execution of identical and/or otherwise related analysis routines.
In storing such large data sets, it has become commonplace to divide a data set (e.g., a data cube or data “hypercube”) into partitions that are distributed among multiple node devices for storage. Such distributed storage enables distributed access to and/or use of such data in analyses that may be performed at least partially in parallel among the node devices that each store at least one of the partitions.
It has also become commonplace to additionally generate and store, in a similarly distributed manner, one or more copies of each partition among the multiple node devices. Such distributed storage of additional copies can provide a degree of fault tolerance against losing any partition of a large data set if one or more of the node devices experiences a failure, and may enable a choice to be made as to which of multiple node devices is to be used to access and/or perform an analysis with each of the partitions.
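By way of illustration only, the distribution of partitions and their copies described above may be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular system: the round-robin placement rule, the function names `place_partitions` and `choose_nodes`, the node names, and the `copies` parameter (taken here to mean the total number of stored instances of each partition) are all hypothetical.

```python
from typing import Dict, List, Set


def place_partitions(num_partitions: int, nodes: List[str], copies: int) -> Dict[int, List[str]]:
    """Assign each partition to `copies` distinct nodes (hypothetical round-robin rule)."""
    if copies > len(nodes):
        raise ValueError("cannot place more copies than there are nodes")
    placement: Dict[int, List[str]] = {}
    for p in range(num_partitions):
        # Consecutive nodes, wrapping around, so the copies of one partition never share a node.
        placement[p] = [nodes[(p + k) % len(nodes)] for k in range(copies)]
    return placement


def choose_nodes(placement: Dict[int, List[str]], failed: Set[str]) -> Dict[int, str]:
    """For each partition, pick the first node holding a copy that has not failed."""
    chosen: Dict[int, str] = {}
    for p, holders in placement.items():
        alive = [n for n in holders if n not in failed]
        if not alive:
            raise RuntimeError(f"partition {p} lost: every copy is on a failed node")
        chosen[p] = alive[0]
    return chosen


# Illustrative values only: four nodes, six partitions, two stored instances of each.
nodes = ["node0", "node1", "node2", "node3"]
placement = place_partitions(6, nodes, copies=2)
assignment = choose_nodes(placement, failed={"node1"})
```

In this sketch, the failure of `node1` leaves every partition reachable because each partition also resides on a second node, and `choose_nodes` stands in for the choice of which node device is used with each partition.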