Field
Embodiments presented herein generally relate to distributed computing. More specifically, embodiments presented herein provide a high-performance distributed file system that supports concurrent access and transaction safety.
Description of the Related Art
A distributed computing network system includes multiple computer systems which interact with one another to achieve a common goal. The computer systems are networked to form a cluster. Computer systems in the cluster may share different resources, such as data, storage, processing power, and the like.
An example of a distributed computing network system is a secondary storage environment. A cluster of secondary storage systems may provide services for primary storage systems. For instance, secondary storage systems may provide backup, copy, and test and development services for data residing in primary storage. The secondary storage cluster can expose data backed up from the primary storage system to client systems, which can read data from or write data to the file system.
A distributed file system needs to support concurrent access to file system objects, e.g., files and directories, while also maintaining a consistent state. Because different nodes in the cluster may access the file system concurrently (e.g., in response to read and write requests sent by clients), it is important that the file system remain consistent. That is, updates to the file system performed by one node must be visible to other nodes of the cluster. Further, consistency requires tolerance for node failures, such that if a node fails while performing an update to the file system, the incomplete file system operations are either completed or aborted.
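By way of illustration only, the requirement that an incomplete operation be either completed or aborted can be sketched with a simple intent log. The sketch below is not the mechanism of any particular embodiment. The names SharedState, rename, recover, and NodeCrash are hypothetical, the shared state is an in-memory stand-in for cluster-wide storage, and the destination directory is assumed not to already contain an entry with the same name.

```python
class NodeCrash(Exception):
    """Simulates a node failing partway through an update."""


class SharedState:
    """Hypothetical cluster-wide state: directory contents plus an intent log."""

    def __init__(self):
        self.dirs = {"/a": {"report.txt"}, "/b": set()}
        self.intents = {}  # txn_id -> recorded intent of an in-flight operation


def rename(state, txn_id, src, dst, name, crash_after_step=None):
    """Move `name` from directory `src` to `dst` as a two-step update."""
    # Record the intent first so a surviving node can finish or undo the work.
    state.intents[txn_id] = {"src": src, "dst": dst, "name": name}
    state.dirs[dst].add(name)        # step 1: link into the destination
    if crash_after_step == 1:
        raise NodeCrash(txn_id)      # simulated failure between the two steps
    state.dirs[src].discard(name)    # step 2: unlink from the source
    del state.intents[txn_id]        # commit: the intent is no longer needed


def recover(state, roll_forward=True):
    """Run by a surviving node: complete or abort every incomplete operation."""
    for txn_id, op in list(state.intents.items()):
        if roll_forward:
            # Both steps are idempotent, so re-applying them is safe.
            state.dirs[op["dst"]].add(op["name"])
            state.dirs[op["src"]].discard(op["name"])
        else:
            # Abort: undo step 1 and ensure the source entry is intact.
            # Assumes dst did not hold `name` before the operation began.
            state.dirs[op["dst"]].discard(op["name"])
            state.dirs[op["src"]].add(op["name"])
        del state.intents[txn_id]
```

In this sketch, a node that fails after step 1 leaves the entry present in both directories. A later call to recover by another node either finishes the move or restores the original layout, so the intermediate state does not persist.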