The present invention generally relates to distributed file systems, and more particularly to management of file namespace in a distributed file system.
Distributed file systems are generally employed for storage of large quantities of data and to reduce input/output (I/O) bottlenecks where many requests are made for file access. In a distributed file system, the file data is spread across multiple data processing systems. File system control and management of file system meta-data are distributed to varying degrees in different systems.
A desirable characteristic of many distributed file systems is scalability. Scalability refers to the ease with which a distributed file system can be expanded to accommodate increased data access needs or increased storage needs. For example, as additional users are granted access to the distributed file system, new storage servers may be introduced, and the requests of the additional users may be spread across the old servers and the new servers. The scalability of any distributed file system is limited or enhanced by the system design. Scaling a distributed file system is complicated because the architecture may possess inherent bottlenecks that limit the extent to which the system can benefit from additional computation and storage capacity.
The namespace service of a distributed file system provides client applications with location information for the various files in the file system. The location information includes, for example, a server identifier, a storage element identifier, and a storage address. Since a distributed file system generally supports multiple client applications, the namespace service includes logic to maintain coherency and consistency of the namespace data. The coherency and consistency logic may present barriers to the scalability of a distributed file system.
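As an illustration only, the location information and the coherency concern described above might be sketched as follows. The names `FileLocation` and `NamespaceService`, the field names, and the use of a single lock are assumptions made for this sketch and are not taken from any described embodiment.

```python
import threading
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class FileLocation:
    """Hypothetical record of where a file's data resides."""
    server_id: str           # identifies the server holding the data
    storage_element_id: str  # identifies a storage element on that server
    storage_address: int     # address of the data within the storage element


class NamespaceService:
    """Toy in-memory namespace shared by multiple client applications."""

    def __init__(self) -> None:
        self._entries: Dict[str, FileLocation] = {}
        # One lock guards every read and update so that all clients see a
        # consistent mapping. It also serializes all namespace traffic.
        self._lock = threading.Lock()

    def bind(self, name: str, location: FileLocation) -> None:
        """Associate a namespace identifier with a file location."""
        with self._lock:
            self._entries[name] = location

    def resolve(self, name: str) -> Optional[FileLocation]:
        """Return the location bound to a name, or None if it is unbound."""
        with self._lock:
            return self._entries.get(name)
```

In this sketch every lookup and update passes through the same lock, which is one simple way to see how coherency and consistency logic can become a barrier to scalability as the number of client applications grows.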
A system and method that address the aforementioned problems, as well as other related problems, are therefore desirable.
Namespace service in a distributed file system using a database management system is provided in various embodiments of the invention. In one embodiment, a namespace database is configured on a namespace server with namespace identifiers and associated file location information. The namespace server is separate from the data servers in the distributed file system. A client proxy arrangement interfaces with client applications and with the namespace server to obtain, from the namespace server, location information associated with files referenced in file access requests, and to submit storage access requests to the appropriate data servers. Separating the namespace server from the data servers enhances scalability of the distributed file system.
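The arrangement summarized above can be sketched, under stated assumptions, as a namespace database, a set of data servers, and a client proxy that connects them. The sketch uses an in-memory SQLite database as a stand-in for the database management system. The class names, table layout, and method names are hypothetical and chosen only for illustration.

```python
import sqlite3
from typing import Dict, Optional, Tuple


class NamespaceServer:
    """Holds only namespace identifiers and file location information."""

    def __init__(self) -> None:
        # Assumption: SQLite stands in for the database management system.
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE namespace ("
            "name TEXT PRIMARY KEY, server_id TEXT NOT NULL, "
            "element_id TEXT NOT NULL, address INTEGER NOT NULL)"
        )

    def register(self, name: str, server_id: str,
                 element_id: str, address: int) -> None:
        # The connection context manager commits the transaction on success.
        with self._db:
            self._db.execute(
                "INSERT OR REPLACE INTO namespace VALUES (?, ?, ?, ?)",
                (name, server_id, element_id, address),
            )

    def lookup(self, name: str) -> Optional[Tuple[str, str, int]]:
        cur = self._db.execute(
            "SELECT server_id, element_id, address "
            "FROM namespace WHERE name = ?",
            (name,),
        )
        return cur.fetchone()


class DataServer:
    """Holds file data only; it has no knowledge of file names."""

    def __init__(self) -> None:
        self._blocks: Dict[Tuple[str, int], bytes] = {}

    def write(self, element_id: str, address: int, data: bytes) -> None:
        self._blocks[(element_id, address)] = data

    def read(self, element_id: str, address: int) -> bytes:
        return self._blocks[(element_id, address)]


class ClientProxy:
    """Resolves names at the namespace server, then contacts a data server."""

    def __init__(self, namespace_server: NamespaceServer,
                 data_servers: Dict[str, DataServer]) -> None:
        self._namespace_server = namespace_server
        self._data_servers = data_servers

    def read_file(self, name: str) -> bytes:
        location = self._namespace_server.lookup(name)
        if location is None:
            raise FileNotFoundError(name)
        server_id, element_id, address = location
        return self._data_servers[server_id].read(element_id, address)
```

Because the data servers in this sketch never consult the namespace, a new data server could be added to the dictionary passed to the client proxy without changing the namespace server, which loosely mirrors the scalability benefit attributed to the separation.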
It will be appreciated that various other embodiments are set forth in the Detailed Description and Claims which follow.