Embodiments of the invention relate to the field of data storage, and in particular, to supporting coordinated access to a clustered file system's shared disk storage subsystem by using dynamic creation of file access layout for different workloads and access patterns.
Data access in cloud computing architectures is beginning to center around scale out storage systems. For example, IBM SONAS™ (Scale Out Network Attached Storage™) is a scale out NAS offering designed to manage vast repositories of information in enterprise cloud computing environments requiring very large capacities (e.g., petabytes), high levels of performance, and high availability. IBM SONAS is built using the IBM General Parallel File System™ (GPFS™), a high-performance clustered file system and enterprise file management platform that supports scalable and parallel cluster computing. Scale out storage systems thereby allow applications to access a single file system, storage device, single portion of data, or single file through multiple file servers in a cluster.
Third-party file access protocols (e.g., FTP and HTTP) are commonly used for remote access to file system data. Most of these protocols are client-server based, with a single client accessing a single storage server. By limiting access to a single server, these protocols restrict the scalability of scale out storage systems and frequently cause data access bottlenecks. Coordinated and parallel file access protocols have been developed to access multiple file servers in a remote cluster simultaneously. These protocols help relieve storage bottlenecks, but their access must be coordinated by the storage system to maintain data access semantics (e.g., POSIX) and avoid data corruption.
Parallel Network File System (pNFS) is a standardized parallel file access protocol that extends the Network File System (NFS) protocol. pNFS is expected to be supported in most scale out storage systems in the future. pNFS, an integral part of NFSv4.1, transforms NFSv4 into a heterogeneous metadata protocol. pNFS clients and servers are responsible for control and file management operations, but delegate I/O functionality to a storage-specific layout driver on the client. By separating control and data flows, pNFS clients can fully saturate the available bandwidth of the parallel file system. Each storage system may support pNFS, or any similar parallel file access protocol, in a different way. For example, to determine a specific file layout for I/O access, each storage system will have its own way of creating an optimal layout to reduce latency and maximize I/O throughput.
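As a purely illustrative sketch of how a client-side layout driver might use a file layout to direct I/O to several servers at once, the following Python fragment splits a byte-range read into per-server requests. It assumes a simple layout of fixed-size stripes placed round-robin across data servers; the names StripedLayout, server_for, and split_read, and the server labels, are hypothetical and are not taken from pNFS, NFSv4.1, or GPFS.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class StripedLayout:
    """Hypothetical file layout: fixed-size stripes placed round-robin on data servers."""
    stripe_size: int            # bytes per stripe (assumed fixed for the whole file)
    servers: Tuple[str, ...]    # data server labels (assumed, for illustration only)

    def server_for(self, offset: int) -> str:
        """Return the data server holding the stripe that contains this byte offset."""
        stripe_index = offset // self.stripe_size
        return self.servers[stripe_index % len(self.servers)]

    def split_read(self, offset: int, length: int) -> List[Tuple[str, int, int]]:
        """Split a byte-range read into (server, offset, length) requests, one per stripe touched."""
        requests = []
        end = offset + length
        while offset < end:
            # End of the stripe that contains the current offset.
            stripe_end = (offset // self.stripe_size + 1) * self.stripe_size
            chunk = min(end, stripe_end) - offset
            requests.append((self.server_for(offset), offset, chunk))
            offset += chunk
        return requests


if __name__ == "__main__":
    layout = StripedLayout(stripe_size=4, servers=("ds0", "ds1", "ds2"))
    for request in layout.split_read(offset=2, length=8):
        print(request)
```

In this sketch the layout itself is the only information the client needs from the control path; each resulting request could then be issued to its data server in parallel. A real storage system would choose the stripe size and server set per file, which is the layout-creation decision described above.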