1. Field of the Invention
This invention relates in general to storage system architectures, and more particularly to a method, apparatus and program storage device that provides a distributed file serving architecture with metadata storage virtualization and data access at the data server connection speed.
2. Description of Related Art
The ever-increasing capability of computers in storing and managing information has made them increasingly indispensable to modern businesses. The popularity of these machines has led in turn to the widespread sharing and communication of data, such as electronic mail and documents, over one or more computer networks, including local area networks, wide area networks such as the Internet, and wireless networks.
The computer operating system is a large, complex piece of software which manages hardware and software resources of the computer processing system. On the other hand, storage management software is used in the organization of storage devices, such as disks, into logical groupings to achieve various performance and availability characteristics. For example, the storage devices may be arranged to create individual volumes or concatenations of volumes, mirror sets or stripes of mirror sets, or even redundant arrays of independent disks (RAID). The computer system platform on which the operating system executes to provide such management functions typically includes a host computer coupled to a storage adapter or controller, which in turn manages storage volumes. The operating system functionally organizes this platform by, inter alia, invoking input/output (I/O) operations in support of software processes or applications executing on the computer.
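By way of illustration only, the logical groupings mentioned above may be sketched as simple address mappings. The following Python sketch assumes a plain stripe set without parity and a plain mirror set; the function names `stripe_location` and `mirror_locations` and their parameters are hypothetical and are not drawn from any particular storage management product.

```python
from typing import List, Tuple


def stripe_location(logical_block: int, num_disks: int,
                    stripe_unit_blocks: int) -> Tuple[int, int]:
    """Map a logical block of a striped volume to (disk index, block on that disk).

    Assumes a simple stripe set without parity (hypothetical layout).
    """
    stripe_unit = logical_block // stripe_unit_blocks  # stripe unit holding the block
    within_unit = logical_block % stripe_unit_blocks   # offset inside that stripe unit
    disk = stripe_unit % num_disks                     # units rotate across the disks
    row = stripe_unit // num_disks                     # full rows of units before this one
    return disk, row * stripe_unit_blocks + within_unit


def mirror_locations(logical_block: int, num_copies: int) -> List[Tuple[int, int]]:
    """In a mirror set, every copy holds the block at the same offset."""
    return [(copy, logical_block) for copy in range(num_copies)]


if __name__ == "__main__":
    # Four disks, eight blocks per stripe unit (assumed values).
    print(stripe_location(35, num_disks=4, stripe_unit_blocks=8))
    print(mirror_locations(5, num_copies=2))
```

In such a sketch, a stripe of mirror sets could be modeled by applying `stripe_location` first and then treating each resulting disk index as a mirror set.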
A storage architecture decomposes management of the storage devices into individual components and defines their functional operations with respect to the flow of information and control among them. The individual components include an I/O subsystem and a file system, which are generally independent of one another and interact according to interfaces defined by the architecture. The I/O subsystem provides an efficient mode of communication between the computer and the storage devices that allows programs and data to be entered into the memory of the computer for processing. The subsystem also enables the results obtained from computations of that information to be recorded on the storage devices.
The file system contains general knowledge of the organization of information on the storage devices and provides algorithms that implement properties/performance of the desired storage architecture. To that end, the file system is a high-level software entity comprising a collection of program modules, e.g., software drivers that incorporate a command set for the storage devices.
A storage network may include one or more server computers, which are a source and repository for large blocks of data, and multiple client computers, which communicate with the servers, operate on smaller blocks of data, and transfer the edited data back to the servers. The server computers typically are capable of storing large amounts of data. Such storage can be achieved with a variety of data storage systems, including large magnetic and magneto-optical disk libraries and magnetic tape libraries.
A server may implement a file system, as discussed above, for managing the space of storage media. The file system provides a logical framework to the users of a computer system for accessing data stored in the storage media. The logical framework usually includes a hierarchy of directory structures to locate a collection of files that contain user-named programs or data. The use of directories and files removes the concern from the users of finding the actual physical locations of the stored information in a storage medium.
The logical framework may be stored as “metadata” or control information for the file such as file size and type and pointers to the actual data. The contents of a file may be called file data to distinguish it from metadata. Metadata is “data about data”. Metadata is the file system overhead that is used to keep track of everything about all of the files on a volume. For example, metadata tells what allocation units make up the file data for a given file, what allocation units are free, what allocation units contain bad sectors, and so on.
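The bookkeeping role of metadata described above may be illustrated with a minimal sketch. The Python classes below are hypothetical and assume a fixed allocation unit size and a first-free allocation policy; an actual file system may track the same information in quite different structures.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class FileMetadata:
    """Control information for one file: size, type and pointers to its data."""
    size: int
    file_type: str
    allocation_units: List[int] = field(default_factory=list)


class VolumeMetadata:
    """Tracks which allocation units are free, bad, or owned by a file."""

    def __init__(self, total_units: int, unit_size: int,
                 bad_units: Iterable[int] = ()) -> None:
        self.unit_size = unit_size
        self.bad: Set[int] = set(bad_units)
        self.free: Set[int] = set(range(total_units)) - self.bad
        self.files: Dict[str, FileMetadata] = {}

    def create_file(self, name: str, size: int, file_type: str) -> FileMetadata:
        needed = -(-size // self.unit_size)  # ceiling division
        if needed > len(self.free):
            raise OSError("not enough free allocation units")
        units = sorted(self.free)[:needed]   # assumed policy: lowest free units first
        self.free.difference_update(units)
        self.files[name] = FileMetadata(size, file_type, units)
        return self.files[name]


if __name__ == "__main__":
    volume = VolumeMetadata(total_units=8, unit_size=4096, bad_units=[2])
    meta = volume.create_file("report.txt", size=10000, file_type="text")
    print(meta.allocation_units, sorted(volume.free))
```

Here the file data itself is never touched; only the record of where that data resides is updated, which is the distinction between metadata and file data drawn above.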
I/O processing is typically performed under the auspices of the file system in that applications typically interact with the file system to manipulate (i.e., read or write) the files. I/O subsystems, on the other hand, interact with storage devices at lower software levels by manipulating blocks of data.
The file system and I/O subsystem are composed of many layers of software driver code that are collectively referred to as an I/O stack. A conventional I/O stack may include a file system driver, a logical volume driver, a disk class driver and device-specific drivers, such as small computer system interface (SCSI) port and miniport drivers.
The organization of a file system and I/O subsystem within a hardware platform varies among conventional storage architectures. For example, a traditional storage architecture, as described above, generally includes a file system and I/O subsystem that are organized to execute entirely on a host computer. In response to an I/O transaction request issued by an application, the host processor executes the software code of the file system and I/O subsystem needed to transfer data from storage devices to the host memory. In this architecture, the host processor actually executes the code of the I/O stack twice for the I/O transaction: once as the transaction descends the stack and again as the results of the transaction are returned to the application. Execution of I/O operations for this type of architecture clearly consumes significant computer resources.
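The double traversal of the I/O stack may be illustrated as follows. This Python sketch is a simplified model only: the `Layer` class, the `build_stack` helper and the layer names are hypothetical stand-ins for real driver code, and the trace merely records the order in which each layer would run on the host processor.

```python
from typing import List, Optional


class Layer:
    """One driver layer in a simplified I/O stack (hypothetical model)."""

    def __init__(self, name: str, lower: Optional["Layer"] = None) -> None:
        self.name = name
        self.lower = lower

    def submit(self, request: str, trace: List[str]) -> str:
        trace.append(f"down:{self.name}")       # executed as the request descends
        if self.lower is not None:
            result = self.lower.submit(request, trace)
        else:
            result = f"data-for-{request}"      # lowest layer reaches the device
        trace.append(f"up:{self.name}")         # executed again as results return
        return result


def build_stack(names: List[str]) -> Layer:
    """Build a stack from top to bottom and return the top layer."""
    lower: Optional[Layer] = None
    for name in reversed(names):
        lower = Layer(name, lower)
    assert lower is not None, "at least one layer is required"
    return lower


if __name__ == "__main__":
    top = build_stack(["file system driver", "logical volume driver",
                       "disk class driver", "SCSI port driver"])
    trace: List[str] = []
    top.submit("read-block-7", trace)
    print(len(trace), "layer executions for one transaction")
```

With four layers, a single transaction results in eight layer executions on the host, which is the resource cost referred to above.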
To avoid such consumption of resources, some storage architectures alter the arrangement of their file systems and I/O subsystems. For example, a conventional RAID controller architecture may be provided wherein the file system is contained within the host computer and the I/O subsystem is distributed between the host computer and controller. Most implementations of this architecture are configured to execute RAID-related operations by transferring discrete block-oriented requests between the file system and controller. When these requests complete, however, the host processor is notified by means of interrupts, i.e., events that change the normal flow of instruction execution by the host processor. For this type of architecture, there may be many interrupts associated with a single transaction. Because each interrupt must be serviced by the host processor, this architecture results in inefficient use of the processor.
Other storage architectures provide their file systems and I/O subsystems entirely on the controller. For example, a host computer may interact with the controller in accordance with a conventional client-server computing model wherein the host computer (“client”) forwards each I/O transaction to the controller (“server”) typically across an interconnection such as a network. All transactions are sent to the controller and none are serviced locally at the host computer. The file controller which manages the file system of mass storage devices is coupled to the storage processors. Although this architecture relieves the host processor from I/O processing, it also adversely affects file system latency, i.e., the period of time between the issuance of an I/O transaction request by an application to the file system and the completion of that request by the file system.
More recently, a data server has been interfaced to a data network via at least one metadata server. The metadata server receives data access commands from clients in the data network in accordance with a network file access protocol. The metadata server performs file locking management and mapping of the network files to logical block addresses of storage in the data server, and moves data between the client and the storage in the data server. However, architectures that use a metadata server currently require the client operating system to provide data control and/or fail to provide file access at speeds of the data server connection.
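The metadata server arrangement described above may be sketched as follows. This Python model is illustrative only and rests on several assumptions: an exclusive per-file lock, one logical block address per written chunk, and hypothetical class and method names (`DataServer`, `MetadataServer`, `lock`, `write`, `read`). It does not represent any particular network file access protocol.

```python
from typing import Dict, List


class DataServer:
    """Block storage addressed by logical block address (LBA)."""

    def __init__(self) -> None:
        self.blocks: Dict[int, bytes] = {}

    def write_block(self, lba: int, data: bytes) -> None:
        self.blocks[lba] = data

    def read_block(self, lba: int) -> bytes:
        return self.blocks[lba]


class MetadataServer:
    """Handles file locking, file-to-LBA mapping, and data movement."""

    def __init__(self, data_server: DataServer) -> None:
        self.data_server = data_server
        self.block_map: Dict[str, List[int]] = {}  # file name -> LBAs
        self.locks: Dict[str, str] = {}            # file name -> lock holder
        self.next_lba = 0

    def lock(self, client: str, name: str) -> bool:
        return self.locks.setdefault(name, client) == client

    def unlock(self, client: str, name: str) -> None:
        if self.locks.get(name) == client:
            del self.locks[name]

    def _require_lock(self, client: str, name: str) -> None:
        if self.locks.get(name) != client:
            raise PermissionError(f"{client} does not hold the lock on {name}")

    def write(self, client: str, name: str, chunks: List[bytes]) -> None:
        self._require_lock(client, name)
        lbas = []
        for chunk in chunks:
            lba, self.next_lba = self.next_lba, self.next_lba + 1
            self.data_server.write_block(lba, chunk)  # data passes through this server
            lbas.append(lba)
        self.block_map[name] = lbas

    def read(self, client: str, name: str) -> bytes:
        self._require_lock(client, name)
        return b"".join(self.data_server.read_block(lba)
                        for lba in self.block_map[name])


if __name__ == "__main__":
    mds = MetadataServer(DataServer())
    mds.lock("client-a", "notes.txt")
    mds.write("client-a", "notes.txt", [b"hello", b"world"])
    print(mds.read("client-a", "notes.txt"), mds.block_map["notes.txt"])
```

In this model every byte read or written passes through the metadata server, so client throughput is bounded by that server rather than by the data server connection.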
It can be seen that there is a need for a method, apparatus and program storage device that provides a distributed file serving architecture with metadata storage virtualization and data access at the data server connection speed.