The present invention generally relates to file storage systems. More specifically, the present invention relates to a system and method for optimizing data access pertaining to a file sharing system with data mirroring by file storage systems.
Generally, a file storage system contains many files which are concurrently accessible to many hosts or clients. Many kinds of applications may be running on the hosts, and some applications may share files and process some tasks in cooperation. One example of such applications is a clustered video server system. A typical task of the video server system is to send streaming video data to a browsing client through a computer network. The size of the video data is usually very large, in the range of several gigabytes or more. Consequently, sending video data is often a time-consuming process. It is, therefore, inefficient to have only one video server handle the sending of video data to multiple browsing clients.
A typical solution used to overcome this limitation is to utilize multiple video servers and have them send video data in a parallel or concurrent manner. As mentioned above, the size of video data is usually large. Consequently, if a copy of the video data is to be stored on the local disk drive of each video server, a lot of disk capacity is required for each video server and the associated costs for maintaining the needed disk capacity on each video server are also quite high. As a result, it is a general approach to have multiple video servers share video data. FIG. 1 depicts a typical system configuration where a single storage system containing video data is shared by multiple video servers.
The single storage system, as shown in FIG. 1, suffers from at least four problems, namely, (1) performance bottleneck caused by storing video data in a single storage system, (2) performance bottleneck caused by storing video data on a single disk drive, (3) a single point of failure caused by storing video data in a single storage system, and (4) a single point of failure caused by storing video data on a single disk drive.
From a performance point of view, it is easy to see that the single storage system is constrained by a number of performance bottlenecks. For example, one performance bottleneck is the finite I/O throughput of the storage system. As the number of browsing clients continues to increase, the aggregate demand will at some point reach the maximum I/O throughput of the storage system, thereby leaving the demands of some clients unfulfilled.
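The saturation effect described above can be illustrated with simple arithmetic. The following is a minimal sketch only; the function name and the throughput and bit-rate figures are hypothetical values chosen for illustration and do not appear in the description.

```python
def unserved_clients(client_count, stream_rate_mbps, storage_throughput_mbps):
    """Return how many clients cannot be served once throughput is exhausted.

    All figures are illustrative assumptions: every client is assumed to
    request a stream of the same constant bit rate.
    """
    # Largest number of concurrent streams the storage system can sustain.
    max_clients = storage_throughput_mbps // stream_rate_mbps
    return max(0, client_count - max_clients)


# Hypothetical storage system sustaining 1000 Mbit/s, streams of 4 Mbit/s each.
below_limit = unserved_clients(200, 4, 1000)   # 0 clients left unserved
above_limit = unserved_clients(300, 4, 1000)   # 50 clients left unserved
```

Under these assumed figures the storage system sustains at most 250 concurrent streams, so every client beyond the 250th is left unfulfilled regardless of how many video servers are added in front of it.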
Another performance bottleneck resides inside of the storage system. The storage system contains many physical disk drives used to store data. The I/O throughput of a single disk drive is small. Hence, a single disk drive may be a performance bottleneck of the video server system if the same video data stored on that single disk drive are requested by various browsing clients at the same time.
From the system availability point of view, it is clear that the single storage system presents a single point of failure. If a problem disables the single storage system, then no video data from that system will be available. Similarly, storing video data on a single disk drive also presents a single point of failure. If a single disk drive is disabled, then the video data stored thereon are rendered unavailable.
Efforts have been made to overcome the problems associated with the single storage system mentioned above. For example, redundant arrays of inexpensive disks (more commonly known as RAID) is a method used to store data on a group of disk drives so as to provide data redundancy. The basic premise of RAID is to store redundant data on different disk drives within a disk drive group. By storing redundant data on different disk drives, the data remain available even if one of the disk drives in the disk drive group is disabled. RAID, therefore, helps resolve the problem of a single point of failure caused by storing data on a single disk drive.
RAID also provides improved performance with respect to data access. RAID 1 is a method used to make multiple copies of data onto different disk drives. In other words, each disk drive in the disk drive group has the same data. This is called data mirroring. When host computers read data stored on disk drives configured as RAID 1, one or more of the disk drives in the group are used to service the read requests in a parallel manner. By servicing the read requests from the host computers in parallel, the total data throughput is increased. Hence, the performance bottleneck caused by storing data on a single disk drive is alleviated.
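The mirroring behavior described above can be sketched as follows. This is an illustrative model only: the class name `MirroredGroup`, the use of in-memory dictionaries as disk drives, and the round-robin read policy are assumptions made for the example, not details taken from any particular RAID implementation.

```python
from itertools import cycle


class MirroredGroup:
    """Toy model of a RAID 1 group: writes go to every drive, reads rotate."""

    def __init__(self, drive_count):
        # Each "drive" is modelled as a dict mapping block number to data.
        self.drives = [dict() for _ in range(drive_count)]
        self.reads_per_drive = [0] * drive_count
        # Assumed read policy: simple round-robin over the drives.
        self._next_drive = cycle(range(drive_count))

    def write(self, block, data):
        # Data mirroring: the same data is stored on every drive in the group.
        for drive in self.drives:
            drive[block] = data

    def read(self, block):
        # Any drive can service the read because all drives hold the same data.
        index = next(self._next_drive)
        self.reads_per_drive[index] += 1
        return self.drives[index][block]


group = MirroredGroup(drive_count=2)
group.write(0, b"frame-0")
results = [group.read(0) for _ in range(4)]
```

In this sketch four reads of the same block are split evenly between the two drives, which is the sense in which mirroring relieves the single-drive bottleneck for read requests.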
There are generally two ways to implement RAID. First, RAID can be implemented in a storage system. The storage system generally includes a disk controller which manages groups of disk drives and determines how they are configured. The disk controller receives read requests from a host computer and determines how the read requests should be satisfied by the groups of disk drives in a balanced manner. That is, the disk controller performs a load balancing function so as to ensure that the groups of disk drives are utilized in an efficient manner.
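One way a disk controller might balance read requests is to send each request to the drive with the least outstanding work. The sketch below assumes that policy purely for illustration; the function names and the notion of a per-request "cost" are hypothetical, and actual disk controllers may use different criteria.

```python
def pick_least_loaded(outstanding):
    """Return the index of the drive with the least outstanding work."""
    return min(range(len(outstanding)), key=lambda i: outstanding[i])


def dispatch_reads(request_costs, drive_count):
    """Assign each read request to the currently least loaded mirrored drive.

    request_costs is a list of assumed relative costs, one per read request.
    Returns the chosen drive per request and the final load per drive.
    """
    outstanding = [0] * drive_count
    assignment = []
    for cost in request_costs:
        drive = pick_least_loaded(outstanding)
        outstanding[drive] += cost
        assignment.append(drive)
    return assignment, outstanding


# One expensive read followed by five cheap ones, over two mirrored drives.
assignment, load = dispatch_reads([5, 1, 1, 1, 1, 1], drive_count=2)
```

With these assumed costs the expensive request occupies the first drive while the five cheap requests are all routed to the second, so both drives finish with the same total load.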
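Second, RAID can also be implemented in host computers. Each host computer generally includes a logical volume management system (LVMS) in its operating system. The LVMS performs the same functions as the disk controller in the storage system, that is, it manages groups of disk drives and determines how read requests are distributed among them.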
The approach of implementing RAID in the storage system and the approach of implementing RAID in the LVMS both solve the problems of a performance bottleneck and a single point of failure caused by storing data on a single disk drive. However, RAID implemented in the storage system cannot be used to solve the problems of a performance bottleneck and a single point of failure caused by storing data in a single storage system. This is because such RAID can only be configured among the disk drives within a single storage system. On the other hand, LVMS is not limited to a single storage system but can be applied to multiple storage systems. This means that LVMS can use disk drives in different storage systems as a group and can configure RAID 1 for a particular group.
FIG. 2 illustrates an example of a disk mirroring configuration on multiple storage systems managed by LVMS. A group of two or more disk drives is defined as a pair. In this example, there are three pairs, Pair1, Pair2 and Pair3. Pair1 has three disk drives that are on three different storage systems. LVMS makes copies of the data, file A in this example, onto each disk drive in Pair1. To make the copies, LVMS issues the same data-write requests to each storage system at the same time. Similar to Pair1, Pair2 has two disk drives on two different storage systems. Pair3, on the other hand, is a different case. Pair3 has two disk drives which reside on the same storage system. This case is used solely for solving problems (2) and (4) identified above, namely, the performance bottleneck and the single point of failure caused by storing data on a single disk drive. Whether a system designer uses the case of Pair3 depends on a number of factors, such as the availability of the storage system and how much money can be used for building the storage system.
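The three pairs described above can be represented as a simple mapping from each pair to the storage systems holding its disk drives. The sketch below is illustrative only: the storage system names are hypothetical placeholders, and the function merely encodes the reasoning of this description regarding which of problems (1) to (4) a given pair addresses.

```python
# Hypothetical storage system names; one list entry per disk drive in the pair.
PAIRS = {
    "Pair1": ["storage_a", "storage_b", "storage_c"],  # three different systems
    "Pair2": ["storage_a", "storage_b"],               # two different systems
    "Pair3": ["storage_c", "storage_c"],               # same storage system
}


def problems_addressed(drive_locations):
    """Return which of problems (1) to (4) a mirrored pair addresses."""
    if len(drive_locations) < 2:
        return []  # a single drive is not mirrored at all
    # Mirroring across two or more drives addresses the single-drive
    # bottleneck (2) and the single-drive point of failure (4).
    addressed = {2, 4}
    # Spanning more than one storage system also addresses the
    # single-system bottleneck (1) and point of failure (3).
    if len(set(drive_locations)) > 1:
        addressed.update({1, 3})
    return sorted(addressed)


summary = {name: problems_addressed(locations) for name, locations in PAIRS.items()}
```

Under this model Pair1 and Pair2 address all four problems, whereas Pair3 addresses only problems (2) and (4) because both of its disk drives reside on the same storage system.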
As described above, it seems that LVMS is able to solve problems (1) to (4) identified above. However, LVMS has two major problems. One of the problems is that issuing multiple data-write requests to storage systems causes a large CPU load on the host computers. For example, if there are ten disk drives to be mirrored, the host computers must then issue ten data-write requests at the same time. In contrast, in a system environment that has a single storage system and uses RAID in the storage system approach, data mirroring is processed by a disk controller in the storage system. The disk controller manages the data-write requests thereby reducing the CPU load of the host computers.
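The difference in host workload between the two approaches can be expressed as a count of the data-write requests that a host computer must issue. The following sketch uses a hypothetical function name and treats each request as a unit of host work, which is a simplifying assumption.

```python
def host_write_requests(logical_writes, mirror_count, mirrored_by):
    """Count the data-write requests issued by the host computer.

    mirrored_by is "host" when the LVMS performs the mirroring, or
    "controller" when a disk controller in the storage system does so.
    """
    if mirrored_by == "host":
        # The LVMS issues one request per mirrored disk drive.
        return logical_writes * mirror_count
    if mirrored_by == "controller":
        # The host issues one request; the disk controller makes the copies.
        return logical_writes
    raise ValueError(f"unknown mirroring location: {mirrored_by!r}")


lvms_requests = host_write_requests(1, mirror_count=10, mirrored_by="host")
controller_requests = host_write_requests(1, mirror_count=10, mirrored_by="controller")
```

For the ten-drive example given above, a single logical write costs the host ten requests under LVMS mirroring but only one request when the disk controller performs the mirroring.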
Another problem associated with LVMS is that there is no mechanism to distribute I/O requests among the multiple storage systems. Data mirroring configuration is independently managed by each host computer and each host computer does not know which disk drives may be currently used by other host computers. In a worst case scenario, all the host computers may end up using the same storage system at the same time although data are mirrored among multiple storage systems. Therefore, it would be desirable to develop a file sharing system which is capable of optimizing data access and providing other benefits.
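The worst case scenario described above can be illustrated with a small model in which each host computer selects a storage system using only its own configuration. The host names, storage system names, and the "first entry in the list" selection rule are assumptions made for this sketch.

```python
from collections import Counter


def choose_independently(mirror_order):
    """Pick a storage system using only the host's own local configuration.

    Assumed rule: with no knowledge of the other hosts, a host simply uses
    the first storage system listed in its mirroring configuration.
    """
    return mirror_order[0]


# Four hypothetical hosts, each configured with the same three mirrors.
host_configs = {
    f"host{i}": ["storage_a", "storage_b", "storage_c"] for i in range(1, 5)
}

load = Counter(choose_independently(order) for order in host_configs.values())
```

In this model all four hosts direct their requests to the same storage system while the other two mirrors remain idle, even though the data are available on all three.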