1. Field of the Invention
The present invention relates to data storage networks, and especially networks implementing SAN (Storage Area Network) functionality. More particularly, the invention concerns data storage networks in which heterogeneous network hosts need to access heterogeneous network storage devices using a distributed file system.
2. Description of the Prior Art
By way of background, data storage networks, such as SAN systems, provide an environment in which data storage devices are managed within a network environment. Access to the data storage devices is provided via one or more storage manager servers that handle data storage requests (e.g., copy, backup, etc.) from data client nodes (data clients) via conventional LAN (Local Area Network) or WAN (Wide Area Network) connections.
The storage manager servers are programmed data processing platforms that maintain network interface connections to the client nodes and to the storage devices that define the data storage network's pool of peripheral storage. They commonly serve as database servers, file servers, and application servers, and may fill other roles as well.
The data network storage devices may include any number of interconnected data storage subsystems comprising magnetic disk drive arrays, optical disk drive arrays, magnetic tape libraries, etc. In some cases, the data storage network is superimposed on the LAN or WAN that hosts the client nodes, such that the network carries both data storage traffic (e.g., TCP/IP-encapsulated i-SCSI packets) and conventional client traffic (e.g., TCP/IP). More commonly, the data storage network is a dedicated network that carries only data storage traffic. In all but the smallest dedicated data storage networks, the required inter-connectivity is provided by way of high speed arbitrated loop arrangements or switching fabrics implementing the Fibre Channel protocol, with the latter being more common. Typical inter-connectivity components include copper or fiber optic cables, hubs, bridges, gateways, switches, directors, and other telecommunication equipment designed for high speed data transfer between and among all of the interconnected storage manager servers and storage devices that comprise the data storage network. Dedicated Ethernet data storage networks may also be implemented using conventional Ethernet hardware and the i-SCSI protocol (i.e., wherein block I/O and SCSI commands are encapsulated in TCP/IP packets).
FIG. 1 shows an exemplary dedicated data storage network 2 that includes a pair of storage manager servers 4 and 6 interconnected to a pair of data storage devices 8 and 10 by way of a high-speed SAN connectivity scheme. It will be appreciated that in an actual data storage network, many additional storage devices could be present. There could likewise be additional storage manager servers. It should be further understood that the individual connection components that comprise the network fabric itself, such as switches, directors, hubs, links, etc., are not shown in FIG. 1. The storage manager servers 4 and 6 additionally communicate with a local area network (LAN) 12 (or alternatively a WAN) that comprises one or more data processing clients, two of which are identified as client systems 14 and 16. Data sets associated with the client systems 14 and 16 can be stored in data volumes logically defined on one or both of the storage devices 8 and 10 by way of the storage manager servers 4 and 6.
The data storage devices found in a data storage network such as that shown in FIG. 1 are in many cases comprised of a heterogeneous assortment of equipment from different vendors. Similarly, the storage manager servers can be implemented using heterogeneous data processing platforms running different operating systems. Thus, in FIG. 1, the data storage devices 8 and 10 might be RAID (Redundant Array of Inexpensive Disks) subsystems from two different vendors, or one might be a RAID array subsystem while the other is a JBOD (Just a Bunch Of Disks) subsystem. Similarly, the storage manager server 4 might be implemented using an Intel x86 processor running a Microsoft Windows® operating system, while the storage manager server 6 might comprise a PowerPC® processor running an IBM AIX® operating system.
Historically, the interoperability between data storage equipment from different vendors has been weak. One way to accommodate such incompatibilities in a data storage network is to partition the network storage devices into homogeneous groups or “islands,” and to assign such partitions to compatible storage manager servers. Each storage manager server runs its own file system and is allotted its own set of storage devices for its exclusive use within the data storage network. For example, in FIG. 1, data storage device 8 might be allocated to storage manager server 4 while data storage device 10 is assigned to storage manager server 6.
The foregoing approach can be limiting from a data storage network user's point of view because data is managed on a server-by-server basis instead of at a network-wide level. To address this concern, the concept of a distributed file system has been proposed. The goal of a distributed file system in a data storage network is to provide benefits such as a global namespace for files, shared access from any storage manager server to any network storage device, and centralized, policy-based management. This guarantees that data network clients will be able to access their files using the same filenames regardless of which storage manager server is used to retrieve the data, and no matter which storage device stores the data.
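By way of illustration only, the global namespace property described above can be sketched in Python as follows. The class names, the path, and the volume label are hypothetical and are not taken from any particular product; the sketch merely assumes that every storage manager server consults one shared name table instead of a private, per-server file system.

```python
class GlobalNamespace:
    """Hypothetical shared name table: one entry per file, network-wide."""

    def __init__(self):
        self._entries = {}

    def bind(self, path, device_id, volume):
        # Record which storage device and volume hold the named file.
        self._entries[path] = (device_id, volume)

    def resolve(self, path):
        # Return the (device_id, volume) pair bound to the path.
        return self._entries[path]


class StorageManagerServer:
    """Hypothetical stand-in for a storage manager server such as 4 or 6."""

    def __init__(self, server_id, namespace):
        self.server_id = server_id
        self.namespace = namespace  # shared by all servers, not per-server

    def resolve(self, path):
        # Every server answers from the same table, so the result does not
        # depend on which server the client happens to contact.
        return self.namespace.resolve(path)


namespace = GlobalNamespace()
namespace.bind("/projects/report.txt", device_id=10, volume="vol3")

server_4 = StorageManagerServer(4, namespace)
server_6 = StorageManagerServer(6, namespace)

location_via_4 = server_4.resolve("/projects/report.txt")
location_via_6 = server_6.resolve("/projects/report.txt")
```

In this sketch both servers resolve the same filename to the same device and volume, which is the behavior a client would observe under a global namespace. Under the island approach described earlier, each server would instead hold its own table covering only its own devices.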
A distributed file system also facilitates what is referred to in the data storage art as “out-of-band storage virtualization.” The IBM TotalStorage® SAN File System is an exemplary distributed file system that is based on this approach. The term “out-of-band storage virtualization” means that user data and metadata are stored in different places. In the data storage art, the term “user” data refers to files and their information contents, whereas the term “metadata” (literally “data about data”) refers to the information needed to manage and maintain such files. This includes file names, ownership, access permissions, physical location, and other characteristics. Metadata in an out-of-band storage virtualization environment is handled by a dedicated metadata server that cooperates with the storage manager servers to implement data storage and retrieval operations in the data storage network. In FIG. 1, a metadata server (hereinafter metadata “manager”) is shown by reference numeral 18. The metadata manager 18 communicates with the storage manager servers 4 and 6 via a dedicated control network 20, using TCP/IP packet communication or the like. The connectivity infrastructure of the data storage network 2, or the LAN/WAN 12, could also support this communication. The metadata handled by the metadata manager 18 is stored in data storage devices within the data storage network 2. Thus, in FIG. 1, the data storage devices 8 and 10 could each have one or more data volumes dedicated to the storage of metadata.
The metadata manager 18 processes metadata requests from the storage manager servers 4 and 6, which may be thought of as “clients” of the metadata manager's server functions. When a data file transfer request is made to one of the storage manager servers 4 or 6, that server queries the metadata manager 18 to determine the file's location and to obtain other control information. Once the storage manager server 4 or 6 has obtained access to the file, it performs the required data transfer operation without further intervention by the metadata manager 18.
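The request flow just described can be sketched, under stated assumptions, in Python. All class names, field names, and sample values below are hypothetical and chosen only for illustration; the sketch models the separation between the control path (a single metadata lookup) and the data path (a direct read from the storage device), and it omits locking, permissions, and error handling.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FileLocation:
    """Hypothetical metadata record describing where user data resides."""
    device_id: int
    volume: str
    offset: int
    length: int


@dataclass
class MetadataManager:
    """Stand-in for metadata manager 18: holds metadata, never user data."""
    catalog: dict = field(default_factory=dict)
    lookups: int = 0

    def locate(self, filename):
        self.lookups += 1  # count control-path requests
        return self.catalog[filename]


@dataclass
class StorageDevice:
    """Stand-in for a data storage device such as 8 or 10."""
    volumes: dict = field(default_factory=dict)
    reads: int = 0

    def read(self, volume, offset, length):
        self.reads += 1  # count data-path requests
        return bytes(self.volumes[volume][offset:offset + length])


@dataclass
class StorageManagerServer:
    """Stand-in for a storage manager server such as 4 or 6."""
    metadata: MetadataManager
    devices: dict

    def read_file(self, filename):
        # Control path: ask the metadata manager where the file lives.
        loc = self.metadata.locate(filename)
        # Data path: read directly from the device; the metadata manager
        # takes no further part in the transfer.
        return self.devices[loc.device_id].read(loc.volume, loc.offset, loc.length)


device_8 = StorageDevice(volumes={"vol0": bytearray(b"xxxxhello worldxxxx")})
manager_18 = MetadataManager(
    catalog={"/greeting.txt": FileLocation(device_id=8, volume="vol0", offset=4, length=11)}
)
server_4 = StorageManagerServer(metadata=manager_18, devices={8: device_8})

payload = server_4.read_file("/greeting.txt")
```

Here the metadata manager is consulted exactly once per request, and the file contents travel only between the storage manager server and the storage device.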
One of the implications of a distributed file system with out-of-band storage virtualization is that incompatibilities arising from the use of heterogeneous equipment within a data storage network become an issue that must be addressed. Among other things, the metadata manager 18 requires access to both user data and metadata in order to support various management functions, such as commissioning and decommissioning data volumes, moving files between different volumes or between different storage pools (i.e., volume groups), and performing backup and restore operations. If the user data and the metadata of a data storage network are located on heterogeneous storage devices, as is the case in FIG. 1, the metadata manager 18 must be capable of accessing those devices, and the devices must be capable of handling requests from both the metadata manager 18 and the storage manager servers 4 and 6. Incompatibilities between products from various vendors can severely limit the implementation of these functions.
It is to improvements in distributed file system implementation that the present invention is directed. In particular, what is needed is a distributed file system architecture that supports a data storage network comprising heterogeneous equipment.