Parallel storage systems are widely used in many computing environments. Such systems provide a high degree of concurrency, allowing many distributed processes within a parallel application to access a shared file namespace simultaneously.
Parallel computing techniques are used in many industries and applications for implementing computationally intensive models or simulations. For example, the Department of Energy uses a large number of distributed compute nodes tightly coupled into a supercomputer to model physics experiments. In the oil and gas industry, parallel computing techniques are often used for computing geological models that help predict the location of natural resources.
When a number of parallel processes write data to a shared object, block boundaries, data integrity concerns and serialization of shared resources have prevented fast shared writing. Recent efforts to address this problem have employed log-structured virtual parallel file systems, such as a Parallel Log-Structured File System (PLFS). See, e.g., U.S. patent application Ser. No. 13/536,331, filed Jun. 28, 2012, entitled “Storing Files in a Parallel Computing System Using List-Based Index to Identify Replica Files,” incorporated by reference herein. While such techniques have improved the speed of shared writing, they create a secondary challenge: maintaining the necessary amount of metadata without creating unnecessary overhead, since log-structured file systems are known to create more metadata than traditional flat file systems.
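By way of illustration only, the metadata growth noted above can be sketched with a simplified model of log-structured shared writing. The sketch assumes that each process appends to its own private log and records one index entry per write, mapping a logical offset in the shared object to a position in that log. The names `IndexEntry`, `WriterLog`, `strided_write` and `read_back` are hypothetical and are not taken from PLFS or from any referenced application.

```python
from dataclasses import dataclass, field


@dataclass
class IndexEntry:
    """Hypothetical index record: maps a logical extent to a log position."""
    logical_offset: int  # offset within the shared logical object
    length: int          # number of bytes written
    log_offset: int      # offset within the writer's private log


@dataclass
class WriterLog:
    """Hypothetical per-process append-only data log and its index."""
    data: bytearray = field(default_factory=bytearray)
    index: list = field(default_factory=list)

    def write(self, logical_offset: int, payload: bytes) -> None:
        # Every write appends to the log and adds one index entry,
        # so no process contends for a shared block.
        self.index.append(IndexEntry(logical_offset, len(payload), len(self.data)))
        self.data += payload


def strided_write(num_procs: int, writes_per_proc: int, block: int) -> list:
    """Simulate processes writing interleaved blocks of one shared object."""
    logs = [WriterLog() for _ in range(num_procs)]
    for step in range(writes_per_proc):
        for rank, log in enumerate(logs):
            offset = (step * num_procs + rank) * block
            log.write(offset, bytes([rank]) * block)
    return logs


def read_back(logs: list, total_size: int) -> bytes:
    """Rebuild the logical object by consulting every index entry."""
    out = bytearray(total_size)
    for log in logs:
        for e in log.index:
            chunk = log.data[e.log_offset:e.log_offset + e.length]
            out[e.logical_offset:e.logical_offset + e.length] = chunk
    return bytes(out)


logs = strided_write(num_procs=4, writes_per_proc=8, block=16)
index_entries = sum(len(log.index) for log in logs)
shared_object = read_back(logs, total_size=4 * 8 * 16)
```

In this simplified model the number of index entries equals the total number of writes issued by all processes, independent of the size of the shared object, whereas a flat file would need no such per-write mapping. Reading the object back requires consulting every entry, which is the metadata overhead at issue.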
A need therefore exists for improved techniques for parallel writing of data to a shared object that reduce the amount of file system metadata.