Large-scale compute architectures, such as high-performance computing (HPC) supercomputers or cloud-based computing systems, typically have a set of compute nodes dedicated to compute functions and a storage system dedicated to storage functions. Almost universally, however, applications executing on the compute nodes can become blocked, and lose valuable compute time, while waiting for the storage system to preserve written data. The storage system bottleneck may be attributed, for example, to the computationally intensive tasks of creating parity metadata, such as erasure codes, and other metadata, especially for streamed data, as well as to the latency of the storage media itself.
As the computational capacity of the compute nodes in large-scale compute architectures approaches exascale, large amounts of that capacity sit idle while the compute nodes wait for the storage system to complete input/output (IO) operations.
A need therefore exists for improved techniques for computing parity metadata, such as erasure codes, using the computational capacity of the compute nodes. A further need exists for techniques that precompute, before the data is sent to the storage system, a data layout that reorganizes application write data to better match the performance characteristics of the storage system, and that send large data, even data comprising multiple small files, in large pre-packaged byte ranges to avoid subsequent reorganization by the storage system.