1. Field of the Invention
The present invention relates generally to the field of computer systems and more particularly to file servers for storage networks.
2. Description of the Prior Art
Communication networks continue to expand with a greater number of users accessing larger data files at faster speeds. Consequently, file servers on these communication networks have also evolved to manage a greater number of files and handle a greater number of file requests from more nodes on the communication network. To meet this expanding demand, computer servers have been designed to act as file servers that “serve” files to users and/or devices connected to the communication network.
FIG. 1 depicts a symbolic diagram of a workstation used as a file server in the prior art. A system 100 represents a typical architecture of first generation file servers, which are basically high-end general-purpose workstations. One example of this system 100 is a workstation from Sun Microsystems. The file server of system 100 runs standard software but is dedicated to serving files from locally attached storage. The system 100 includes five main modules: a host CPU 110, a LAN controller 120, a SCSI adapter 130, a tape controller 140, and a disk controller 160. These five main modules are interconnected by a system bus 180.
The advantages of using standard workstations for file serving are relatively low development and production costs. The system 100 can expand local storage (usually externally) via the SCSI bus 132 and allows multiple and more efficient LAN controllers. The disadvantages of using a standard workstation as a file server are that performance and reliability are low because of the general-purpose operating system and software being utilized.
FIG. 2 shows a symbolic diagram of a dedicated file server in the prior art. The system 200 has an architecture in which the hardware and software are dedicated or customized to the file serving application. One example of the system 200 is a file server from Auspex Systems of Santa Clara, Calif. The system 200 includes five main modules: a host CPU 210, a network processor 220, a system memory 230, a file processor 240, and a storage processor 250. The five modules of system 200 are also interconnected by an embedded system bus 280. Specifically, the system memory 230 is accessible by all the modules via the embedded system bus 280.
The system 200 is characterized by the host CPU 210, the network processor 220, the file processor 240 and the storage processor 250, which are dedicated to running only very specific functions. For example, the network processor 220 executes the networking protocols specifically related to file access; the storage processor 250 executes the storage protocols; the file processor 240 executes the file system procedures; and the host CPU 210 executes the remaining software functions, including non-file networking protocols. The system memory 230 buffers data between the Ethernet LAN network 222 and the disk 270, and the system memory 230 also serves as a cache for the system 200.
Because of the way the software of the system 200 is partitioned, the system 200 can be viewed as two distinct sub-systems: a host sub-system (running a general-purpose operating system (OS)) and an embedded sub-system. The advantage of using the system 200 as dedicated for file serving is principally greater performance than that which could be obtained with standard workstations of the period. Although the performance of the system 200 is greater than previous architectures such as system 100, the cost of the system 200 is much greater, and the expanding application of network file servers creates a demand for a system with an improved performance/cost ratio.
FIG. 3 depicts a symbolic diagram for a system 300 for a file server appliance in the prior art. This system 300 is built from standard computer server motherboard designs but with fully customized software. One example of the system 300 is a file server from Network Appliance of Sunnyvale, Calif. The system 300 includes four main modules: a host CPU 310, a LAN controller 320, a SCSI controller 340, and a system memory 330 that is accessible by all the modules via a system bus 370.
The host CPU 310 controls the system 300 and executes software functions using networking protocols, storage protocols, and file system procedures. The host CPU 310 has its own buses for accessing instruction and data memories, and a separate system bus is used for interconnecting the I/O devices. The SCSI controller 340 interfaces with the tape 350 and the disk 360 on the SCSI storage buses 352 and 362, respectively. The advantage of using a dedicated software system on a general-purpose hardware platform is an improved performance/cost ratio and improved reliability since the software is tailored only to this specific application's requirements. The major disadvantages of the system 300 are limited performance, scalability, and connectivity.
The expansion of communication networks has driven the development of storage environments. One such storage environment is called a Storage Area Network (SAN). A SAN is a network that interconnects servers and storage allowing data to be stored, retrieved, backed up, restored, and archived. Most SANs are based upon Fibre Channel and SCSI standards.
FIG. 4 depicts a symbolic diagram of a system 400 with network-attached storage (NAS) filers 410, 420, 430, 440, 450, and 460 for a SAN in the prior art. A NAS filer is a computer server dedicated to nothing more than file sharing. The NAS filers 410, 420, 430, 440, 450, and 460 are simple to deploy, scalable to multiple terabytes, and optimized for file sharing among heterogeneous clients. However, data-intensive applications can quickly saturate the performance and capacity limits of conventional NAS devices. When this happens, the only solution has been to add servers, effectively adding islands of data. Numerous islands of data force users to divide and allocate their data among a large number of file servers, thus increasing costs.
Another disadvantage of the NAS filers 410, 420, 430, 440, 450, and 460 is the high management overhead because each device and its associated set of users must be individually managed. As the number of devices grows, the required management bandwidth grows accordingly. Another disadvantage of the NAS filers 410, 420, 430, 440, 450, and 460 is the inflexibility of resource deployment. In environments with multiple NAS filers such as system 400, migrating users and data among servers is a cumbersome process requiring movement of data and disruption to users. Consequently, IT managers tend to reserve some performance and capacity headroom on each device to accommodate changes in demand. This reserved headroom results in a collective over-provisioning that further exacerbates capital and overhead management issues.
What is needed is a file server with an architecture that provides improved scalability in performance, capacity, and connectivity required to interface clients to a storage network.
File servers provide file services to the client such as reading and writing data to and from the storage network. Other file services may include opening files on the storage networks and displaying a tree directory of files on the storage network. Clients, file servers, and devices on the storage network share files by using file sharing protocols such as Common Internet File System (CIFS) protocols and Network File System (NFS) protocols. One goal in providing these file services is to achieve high availability such as 99.999% availability. Unfortunately, the file servers occasionally encounter expected or unexpected problems or delays. For example, a file server occasionally needs to be taken out of service for maintenance. In another example, the file server suffers an unexpected technical problem. Other times, the file servers get overloaded with a number of users, file requests, or connections.
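As a rough illustration of what the 99.999% availability goal mentioned above implies, the following sketch converts an availability target into a yearly downtime budget. The function name and the use of a 365-day year are assumptions made for this example only.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year


def downtime_budget_minutes(availability: float) -> float:
    """Return the permitted downtime per year, in minutes.

    `availability` is a fraction between 0 and 1, e.g. 0.99999 for 99.999%.
    """
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    budget = downtime_budget_minutes(0.99999)
    print(f"{budget:.2f} minutes of downtime per year")
```

Under these assumptions, 99.999% availability leaves a little over five minutes of total downtime per year, which is why even brief interruptions for maintenance or overload matter.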
Consequently, a need arises to transfer the file service to another file server to provide a continuous, uninterrupted file service and achieve high availability of the storage network. When transfers of file services are not handled properly, the user experiences delays or undesired results. In one Windows example, when a CIFS service is transferred from one file server to another, the user experiences a pop-up window displaying an error message, and the location of where the user is in the tree directory of files is lost.
One problem with transferring the file services from one file server to another is that the state information is located on both the file server and the client computer. When transferring the file service, the state information in the file server also needs to be transferred to another file server. Another drawback is that the state information on one machine is not comprehensive for the entire file service. Therefore, the state information in the file server cannot be recreated from the state information in the client computer, which makes transferring the state information on the file server a necessity when transferring file services to another file server. The two sets of state information are complementary and as a whole represent the state of the file service. In one example, a Windows client does not cooperate like a Network File System client and does not keep enough state information to replay where the Windows client is.
Another problem is caused by the immense amount of data stored on the storage area network. The file server generates a tremendous amount of file management data in order to track and maintain the file access. Some examples of the file management data are file control blocks, file name service, and open file handles. The file management data keeps track of opened files, byte range locks, and in some cases, tree directories. In one example, the file management data numbers in the millions of objects. When a file service transfer is needed, copying millions of objects individually is impractical, especially in the short period of time that is acceptable for a file service transfer. A user accessing files through the file service may only tolerate a few seconds of delay. In some cases, the allowable time for a file service transfer is less than 10 to 30 seconds.
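To make the time constraint above concrete, the following back-of-the-envelope sketch computes how much time would be available per object if every object were copied one at a time within the transfer window. The figure of two million objects is an assumed stand-in for "millions of objects", and the function name is hypothetical.

```python
def per_object_budget_us(num_objects: int, window_seconds: float) -> float:
    """Time available per object, in microseconds, when objects are
    copied individually and sequentially within the transfer window."""
    if num_objects <= 0:
        raise ValueError("num_objects must be positive")
    return window_seconds / num_objects * 1_000_000


if __name__ == "__main__":
    assumed_objects = 2_000_000  # assumption: stand-in for "millions of objects"
    for window in (10, 30):  # the 10 and 30 second figures from the text
        budget = per_object_budget_us(assumed_objects, window)
        print(f"{window} s window: {budget:.1f} microseconds per object")
```

With two million objects, a 10 second window leaves about 5 microseconds per object and a 30 second window about 15 microseconds, a budget that would have to cover every step of locating, copying, and re-registering each object.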
In one prior art system for checkpointing, one file server copies state information for a file service to another file server during the file service. When the first file server malfunctions, the second file server can continue with the file service because the second file server has a copy of the state information. Maintaining the coherency between the two file servers can be very expensive. Also, copying the state information during the file services reduces the performance of both file servers. The reduced performance and the increased cost make this prior art system impractical.
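The checkpointing approach described above can be sketched as follows. This is a simplified illustration and not the implementation of any particular prior art system: the class names `StandbyServer` and `PrimaryServer`, the dictionary-based state, and the in-process copy standing in for a network transfer are all assumptions.

```python
import copy


class StandbyServer:
    """Hypothetical second file server that mirrors the primary's state."""

    def __init__(self):
        self.mirrored_state = {}
        self.updates_received = 0

    def receive(self, handle, entry):
        # A real system would receive this over a network link; a deep copy
        # stands in for that transfer here.
        self.mirrored_state[handle] = copy.deepcopy(entry)
        self.updates_received += 1


class PrimaryServer:
    """Hypothetical first file server that checkpoints every state change."""

    def __init__(self, standby):
        self.state = {}
        self.standby = standby

    def open_file(self, handle, path, client):
        entry = {"path": path, "client": client, "locks": []}
        self.state[handle] = entry
        # Each state change is also copied to the standby during normal
        # service, which is the extra work the text identifies as costly.
        self.standby.receive(handle, entry)

    def lock_range(self, handle, offset, length):
        entry = self.state[handle]
        entry["locks"].append((offset, length))
        self.standby.receive(handle, entry)


if __name__ == "__main__":
    standby = StandbyServer()
    primary = PrimaryServer(standby)
    primary.open_file(1, "/share/report.doc", "client-a")
    primary.lock_range(1, 0, 4096)
    print(standby.mirrored_state == primary.state)
    print(standby.updates_received)
```

In this sketch every client operation on the primary triggers a matching update on the standby, so the copying cost scales with the request rate rather than being paid only when a transfer is actually needed.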
Another problem with this prior art system is that the determination of which file server will be the recipient of the file service is made prior to the conditioning event that necessitates the transfer of the file service. In this prior art system, when the first file server boots up, a connection to the second file server is established. The problem is that the decision to use the second file server is made well before the conditioning event. At the time of the conditioning event, the second file server may be unable to accept the file service due to an overloaded state.
What is needed is a quick, efficient solution for transferring file services between file servers that is transparent to the user.