In parallel computing systems, such as those used for High Performance Computing (HPC) applications, data storage systems must handle ever-increasing amounts of data. As HPC environments grow to exascale (and beyond) by becoming more distributed, sharded storage arrays comprising a very large number of storage devices are expected to be employed. In a sharded storage array, a user stores data on each storage device by first creating horizontally partitioned “shards” on each storage device. To parallelize Input/Output (I/O) operations on the sharded storage array, it is desirable to have shards on a large number (if not all) of the available storage devices. In addition to the user data, metadata should be distributed across the storage devices as well.
A need therefore exists for improved techniques for storing data and metadata on sharded storage arrays.
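By way of illustration only, the horizontal partitioning described above can be sketched as follows. The sketch assumes a simple hash-based placement rule; the names `shard_for_key` and `partition`, the record keys, and the device count of eight are hypothetical and are not drawn from any particular storage system. Each record's metadata is placed in the same shard as its data, so that both are spread across the devices.

```python
import hashlib
from collections import defaultdict


def shard_for_key(key: str, num_devices: int) -> int:
    """Map a record key to a device index with a stable hash.

    Hypothetical placement rule, chosen only for illustration.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_devices


def partition(records: dict, num_devices: int) -> dict:
    """Split records row-wise into one shard per device.

    Each shard holds a "data" map and a "metadata" map, so metadata is
    distributed across devices alongside the user data.
    """
    shards = defaultdict(lambda: {"data": {}, "metadata": {}})
    for key, value in records.items():
        device = shard_for_key(key, num_devices)
        shards[device]["data"][key] = value
        shards[device]["metadata"][key] = {"size": len(value)}
    return dict(shards)


# Assumed example workload: 1000 small records spread over 8 devices.
records = {f"object-{i}": f"payload-{i}" for i in range(1000)}
shards = partition(records, num_devices=8)
```

Because each key maps to exactly one device, reads and writes for different keys may proceed on different devices concurrently, which is the parallelism that motivates placing shards on as many devices as possible.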