1. Field of the Invention
The present invention relates generally to data storage systems, and more particularly to network file servers.
2. Background Art
Mainframe data processing, and more recently distributed computing, have required increasingly large amounts of data storage. This data storage is most economically provided by an array of low-cost disk drives integrated with a large semiconductor cache memory. Such cached disk arrays were originally introduced for use with IBM host computers. A channel director in the cached disk array executed channel commands received over a channel from the host computer.
More recently, the cached disk array has been interfaced to a data network via at least one data mover computer. The data mover computer receives data access commands from clients in the data network in accordance with a network file access protocol such as the Network File System (NFS). (NFS is described, for example, in RFC 1094, Sun Microsystems, Inc., “NFS: Network File System Protocol Specification,” Mar. 1, 1989.) The data mover computer performs file locking management and mapping of the network files to logical block addresses of storage in the cached disk storage subsystem, and moves data between the client and the storage in the cached disk storage subsystem.
In relatively large networks, it is desirable to have multiple data mover computers that access one or more cached disk storage subsystems. Each data mover computer provides at least one network port for servicing client requests. Each data mover computer is relatively inexpensive compared to a cached disk storage subsystem. Therefore, multiple data movers can be added easily until the cached disk storage subsystem becomes a bottleneck to data access. If additional storage capacity or performance is needed, an additional cached disk storage subsystem can be added. Such a storage system is described in Vishlitzky et al. U.S. Pat. No. 5,737,747 issued Apr. 7, 1998, entitled “Prefetching to Service Multiple Video Streams from an Integrated Cached Disk Array,” incorporated herein by reference.
Unfortunately, data consistency problems may arise if concurrent client access to a read/write file is permitted through more than one data mover. These data consistency problems can be solved in a number of ways. For example, as described in Vahalia et al., U.S. Pat. No. 5,893,140 issued Apr. 6, 1999 [Ser. No. 08/747,631 filed Nov. 13, 1996], entitled “File Server Having a File System Cache and Protocol for Truly Safe Asynchronous Writes,” incorporated herein by reference, locking information can be stored in the cached disk array, or cached in the data mover computers if a cache coherency scheme is used to maintain consistent locking data in the caches of the data mover computers. However, as shown in FIG. 1, labeled “Prior Art,” a more elegant solution to the data consistency problem has been implemented at EMC Corporation in a network file server system having multiple stream server computers and one or more cached disk arrays.
FIG. 1 shows a network file server system having at least two data mover computers 21 and 22. The first data mover 21 has exclusive access to read/write files in a first file system 23, and the second data mover 22 has exclusive access to read/write files in a second file system 24. As shown, the file systems 23, 24 are respective volumes of data contained in the same cached disk array 25, although alternatively each file system 23, 24 could be contained in a respective one of two separate cached disk arrays. For example, each of the data mover computers 21, 22 has a respective high-speed data link to a respective port of the cached disk array 25. The cached disk array 25 is configured so that the file system 23 is accessible only through the data port connected to the first data mover 21 and so that the file system 24 is accessible only through the data port connected to the second data mover 22. Each of the data movers 21, 22 maintains a directory of the data mover ownership of all of the files in the first and second file systems 23, 24. In other words, each of the data movers maintains a copy of the file system configuration information in order to recognize which data mover in the system has exclusive access to a specified read/write file.
Each of the data movers 21, 22 may receive file access requests from at least one network client. For example, the first data mover 21 has a network port 28 for receiving file access requests from a first client 26, and the second data mover 22 has a network port 29 for receiving file access requests from a second client 27. The clients 26, 27 communicate with the data movers using the connectionless NFS protocol. Whenever the data mover 21 receives a file access request from the client 26, it checks the configuration directory to determine whether or not the file specified by the request is in a file system owned by the data mover 21. If so, then the data mover 21 places a lock on the specified file, accesses the file in the file system 23, and streams any read/write data between the client 26 and the file system 23. If the file specified by the request is not in a file system owned by the data mover 21, then the data mover 21 forwards the request to the data mover that owns the file system to be accessed. For example, if the client 26 requests access to a file in the file system 24, then the first data mover 21 forwards the file access request to the second data mover 22. The second data mover 22 places a lock on the file to be accessed, accesses the file, and streams any read/write data between the first data mover 21 and the file in the file system 24. The first data mover then streams the read/write data between the second data mover 22 and the client 26. The second data mover 22 responds to file access requests from its client 27 in a similar fashion, by directly servicing file access requests to files in the file system 24 that it owns, or forwarding to other data movers the requests for access to the files in file systems that it does not own.
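The ownership check and forwarding described above can be illustrated with a minimal sketch. It is not the implementation used in the system of FIG. 1; the class name DataMover, the method handle, the identifiers dm21, dm22, fs23 and fs24, and the in-memory lock set are all hypothetical names chosen for illustration, and the file access itself is replaced by a placeholder string.

```python
from dataclasses import dataclass, field


@dataclass
class DataMover:
    name: str
    directory: dict    # file system -> name of the owning data mover (same copy on every mover)
    peers: dict = field(default_factory=dict)   # name -> DataMover, models the inter-mover link
    locks: set = field(default_factory=set)     # (file system, path) pairs currently locked

    def handle(self, fs, path, trace=None):
        """Service the request locally if this mover owns fs, else forward it to the owner."""
        trace = [] if trace is None else trace
        trace.append(self.name)                 # record which movers touched the request
        owner = self.directory[fs]
        if owner == self.name:
            self.locks.add((fs, path))          # owner places the lock before access
            try:
                # Placeholder for reading or writing the file in the cached disk array.
                return f"contents of {fs}:{path}", trace
            finally:
                self.locks.discard((fs, path))  # lock released once the access completes
        # Not the owner: forward to the owning mover and relay its reply to the client.
        return self.peers[owner].handle(fs, path, trace)


# Hypothetical configuration mirroring FIG. 1: mover 21 owns fs23, mover 22 owns fs24.
directory = {"fs23": "dm21", "fs24": "dm22"}
dm21 = DataMover("dm21", directory)
dm22 = DataMover("dm22", directory)
dm21.peers["dm22"] = dm22
dm22.peers["dm21"] = dm21

local = dm21.handle("fs23", "/a.txt")       # serviced directly by the owner
forwarded = dm21.handle("fs24", "/b.txt")   # forwarded from mover 21 to mover 22
```

In this sketch the trace list shows that a request for a file in a file system the receiving mover does not own passes through both movers, which corresponds to the extra hop over the link between the data movers.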
The solution as shown in FIG. 1 is rather efficient because the data movers 21, 22 can be linked by a dedicated high-speed data link for the exchange of read/write data between them. Therefore, there is no additional loading of the data network between the data movers and the clients and no additional loading of the data links between the cached disk array 25 and the data movers 21, 22. The data movers can cache the file access information (e.g., file locks) and file data and attributes for the files that they own, so that the loading on the data links between the cached disk array and the data movers 21, 22 can be somewhat reduced. In the network file system implemented at EMC Corporation, when a data mover did not own the file system to be accessed, the data mover forwarded NFS data packets to, or exchanged them with, the data mover that owned the file system to be accessed. Such a system was relatively easy to implement, since it involved creating a proxy router routine that would recognize whether or not an NFS data packet from a client was for access to a file system owned by another data mover, and if so, route the data packet to the data mover that owned the file system. The data mover owning the file system could treat the forwarded data packet in a fashion similar to a data packet received directly from a client.
Although the system of FIG. 1 is satisfactory for handling NFS file access requests, it has a number of limitations that will become increasingly significant. The current trend is toward higher-speed network links and interconnection technology, such as technology for the Fibre-Channel standards being developed by the American National Standards Institute (ANSI). In a network employing high-speed links and interconnection technology, the delays inherent in a connectionless communications protocol such as NFS become more pronounced.
The Internet uses a connection-oriented protocol known as the Transmission Control Protocol (TCP/IP). In order to provide read/write file sharing over the Internet, the Internet Network Working Group has drafted a specification for a Common Internet File System (CIFS) Protocol. The CIFS protocol is described, for example, in Paul L. Leach and Dilip C. Naik, “A Common Internet File System,” Microsoft Corporation, Dec. 19, 1997, incorporated herein by reference. The status of development of CIFS is posted on the Internet at http://www.microsoft.com/workshop/networking/cifs/default.asp. CIFS is touted as incorporating the same high-performance, multi-user read and write operations, locking, and file-sharing semantics that are the backbone of today's sophisticated enterprise computer networks.
According to the CIFS protocol specification of Leach and Naik, pp. 14–15, protocol dialects of NT LM 0.12 and later support distributed file system operations. The distributed file system is said to give a way for this protocol to use a single consistent file naming scheme which may span a collection of different servers and shares. The distributed file system model employed is a referral-based model. This protocol specifies the manner in which clients receive referrals. The client can set a flag in the request server message block (SMB) header indicating that the client wants the server to resolve this SMB's paths within the distributed file system known to the server. The server attempts to resolve the requested name to a file contained within the local directory tree indicated by the tree identifier (TID) of the request and proceeds normally. If the request pathname resolves to a file on a different system, the server returns the following error: “STATUS_DFS_PATH_NOT_COVERED - the server does not support the part of the DFS namespace needed to resolve the pathname in the request.” The client should request a referral from this server for further information. A client asks for a referral with the TRANS2_DFS_GET_REFERRAL request containing the DFS pathname of interest. The response from the server indicates how the client should proceed. The method by which the topological knowledge of the DFS is stored and maintained by the servers is not specified by this protocol.
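The referral-based model described above can be contrasted with request forwarding by means of a small sketch. It is a toy model only and does not reproduce the SMB wire format: the Server class, its open and get_referral methods, the client_open function, the STATUS_OK constant, the prefix-based referral table and the max_hops limit are all hypothetical. Only the names STATUS_DFS_PATH_NOT_COVERED and TRANS2_DFS_GET_REFERRAL are taken from the specification. As a simplification, every pathname that is not local is reported as not covered.

```python
STATUS_OK = "STATUS_OK"  # hypothetical success code for this sketch
STATUS_DFS_PATH_NOT_COVERED = "STATUS_DFS_PATH_NOT_COVERED"


class Server:
    def __init__(self, name, local_files, referrals):
        self.name = name
        self.local_files = local_files  # pathname -> contents in the local directory tree
        self.referrals = referrals      # path prefix -> name of the server covering it

    def open(self, path):
        """Resolve path locally, or report that this server does not cover it."""
        if path in self.local_files:
            return STATUS_OK, self.local_files[path]
        return STATUS_DFS_PATH_NOT_COVERED, None

    def get_referral(self, path):
        """Stand-in for a TRANS2_DFS_GET_REFERRAL exchange (assumed prefix match)."""
        for prefix, server_name in self.referrals.items():
            if path.startswith(prefix):
                return server_name
        return None


def client_open(servers, start, path, max_hops=4):
    """Client-side loop: try a server, and on 'not covered' ask it where to go next."""
    current, visited = start, []
    for _ in range(max_hops):
        visited.append(current)
        status, data = servers[current].open(path)
        if status == STATUS_OK:
            return data, visited
        current = servers[current].get_referral(path)
        if current is None:
            break
    raise FileNotFoundError(path)


servers = {
    "A": Server("A", {"/dfs/docs/readme": "hello"}, {"/dfs/media/": "B"}),
    "B": Server("B", {"/dfs/media/clip": "video"}, {}),
}
result = client_open(servers, "A", "/dfs/media/clip")
```

In this model the client itself contacts the second server after receiving the referral, whereas in the system of FIG. 1 the first data mover relays the request and the data on the client's behalf.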