The present invention relates generally to resource allocation, and more particularly to allocating volumes of a mass storage device in a cluster environment.
In broad terms, a server cluster is a group of independent computer systems working together as a single server computer system. Moreover, a client computer system interacts with a server cluster as though the server cluster is a single server.
Server clusters address, among other things, two primary concerns: availability and scalability. Availability essentially is concerned with the up time of services provided by a server cluster. For example, if the server cluster provides a database service, then availability is concerned with the number of hours a day, week, or year that client computer systems may access and utilize the database service. One manner by which server clusters have improved availability is essentially by monitoring the status of the members (i.e., computer systems) of the server cluster. If a server cluster determines that a member of the server cluster has failed, then the server cluster may cause the services provided by the failed member to be moved to another member of the server cluster. In this manner, as long as the server cluster has a member to which services may be moved, the services remain available to client computer systems even though some members of the server cluster have failed.
Scalability, on the other hand, is concerned with the ability to increase the capabilities of the server cluster as client computer system demand increases. For example, if a server cluster currently provides several services to 10 clients, then scalability is concerned with the ease by which additional services may be added to the server cluster, and/or additional clients may be served by the server cluster. One manner by which server clusters have improved scalability of servers is by providing an easy mechanism by which additional processing power may be added to the server cluster. In particular, most server clusters today execute clustering software which enables additional computer systems to be added to a server cluster. This clustering software provides an easy mechanism for transferring a portion of the services provided by an overly burdened member to the added computer systems. In this manner, apart from an increase in performance, the client computer systems are unaware of the changes to the server cluster and need not be reconfigured in order to take advantage of the additional computer systems of the server cluster.
Server clusters often include shared resources such as disks. As a result, server clusters need a mechanism to allocate the resources to the various members of the server cluster. One allocation approach referred to as the shared nothing approach allocates each computer system of the server cluster a subset of the shared resources. More particularly, only one computer system of the server cluster may own and access a particular shared resource at a time, although, on a failure, another dynamically determined system may take ownership of the resource. In addition, requests from clients are automatically routed to the computer system of the server cluster that owns the resource.
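The shared nothing approach described above may be illustrated with a minimal sketch. The class name, method names, and the round-robin choice of a surviving owner are assumptions made purely for illustration; the text states only that another system is dynamically determined on a failure.

```python
class SharedNothingCluster:
    """Toy model: each shared resource has exactly one owner at a time."""

    def __init__(self, members):
        self.alive = set(members)
        self.owner = {}  # resource name -> owning member

    def allocate(self, resource, member):
        """Give a single active member exclusive ownership of a resource."""
        if member not in self.alive:
            raise ValueError(f"{member} is not an active member")
        self.owner[resource] = member

    def route(self, resource):
        """Return the member that should service a client request."""
        return self.owner[resource]

    def fail(self, member):
        """Mark a member failed and hand its resources to survivors."""
        self.alive.discard(member)
        if not self.alive:
            raise RuntimeError("no surviving member to take ownership")
        survivors = sorted(self.alive)
        orphaned = sorted(r for r, m in self.owner.items() if m == member)
        # Assumption: spread orphaned resources round-robin over survivors.
        for i, resource in enumerate(orphaned):
            self.owner[resource] = survivors[i % len(survivors)]


cluster = SharedNothingCluster(["server1", "server2"])
cluster.allocate("disk_a", "server1")
cluster.allocate("disk_b", "server2")
before = cluster.route("disk_a")   # owner while server1 is healthy
cluster.fail("server1")
after = cluster.route("disk_a")    # owner after server1 has failed
```

In this sketch, client requests for a resource are always routed to its single current owner, and ownership moves only when the owner is marked as failed.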
Data integrity problems have been encountered by a daemon implementation of the shared nothing approach. In the daemon implementation, a separate background service or daemon process is executed on each computer system of the server cluster. The daemon processes coordinate allocation of resources amongst the computer systems and ensure that a resource allocated to a computer system of the server cluster is not accessible to other computer systems of the server cluster. The problem with the daemon implementation is that the daemon process on any one of the computer systems of the server cluster may die (i.e., stop executing) without the corresponding computer system failing. As a result, the computer system corresponding to the dead daemon process may gain access to and corrupt data of resources allocated to another computer system of the cluster.
Another problem with the daemon process implementation results from the fact that daemon process functionality is not immediately available. More particularly, a finite amount of time exists between when execution of a daemon process starts and when the functionality of the daemon process becomes available. Accordingly, even if execution of the daemon process is started at boot time (i.e., power up), a small window of time exists between power up of the computer system and the availability of daemon process functionality. During this window of time, a computer system may access and corrupt data of a shared resource allocated to another computer system.
Therefore, a need exists for a method and apparatus for allocating resources of a server cluster that alleviates the problems incurred by the above daemon process implementation.
In accordance with one embodiment of the present invention, there is provided a method of protecting volumes of a mass storage device shared by a server cluster. One step of the method includes transferring from (i) a first file system of a first server of the server cluster to (ii) a first filter driver of the first server, a write request packet directed to a first volume of the mass storage device. Another step of the method includes determining at the first filter driver whether the first server has ownership of the first volume. Yet another step of the method includes transferring the write request packet from the first filter driver to a lower level driver for the mass storage device only if the determining step determines that the first server has ownership of the first volume.
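The three steps of the method may be sketched in user-space code as follows. This is an illustrative model only, not driver code: the names WriteRequest, FilterDriver, LowerLevelDriver, and file_system_write are hypothetical, and representing ownership as a set of volume names held by the filter driver is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class WriteRequest:
    """Stand-in for a write request packet directed to a volume."""
    volume: str
    data: bytes


@dataclass
class LowerLevelDriver:
    """Stand-in for the lower level driver of the mass storage device."""
    written: list = field(default_factory=list)

    def write(self, request):
        self.written.append((request.volume, request.data))
        return True


@dataclass
class FilterDriver:
    """Sits between the file system and the lower level driver."""
    lower: LowerLevelDriver
    owned_volumes: set = field(default_factory=set)

    def handle_write(self, request):
        # Step 2: determine whether this server owns the target volume.
        if request.volume not in self.owned_volumes:
            return False  # the packet is not passed down
        # Step 3: transfer the packet to the lower level driver.
        return self.lower.write(request)


def file_system_write(filter_driver, volume, data):
    # Step 1: the file system transfers the packet to the filter driver.
    return filter_driver.handle_write(WriteRequest(volume, data))


disk = LowerLevelDriver()
fd = FilterDriver(lower=disk, owned_volumes={"vol1"})
ok = file_system_write(fd, "vol1", b"a")        # owned volume
rejected = file_system_write(fd, "vol2", b"b")  # volume not owned
```

Because the ownership check sits in the write path itself, a write to a volume that is not owned never reaches the lower level driver in this model.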
Pursuant to another embodiment of the present invention, there is provided a filter driver for protecting volumes of a mass storage device shared by a server cluster. The filter driver includes instructions which when executed by a first server of the server cluster cause the first server to process a write request packet (i) directed to a first volume of the mass storage device, and (ii) received from a first file system of the first server. The instructions of the filter driver when executed by the first server further cause the first server to determine in response to processing the write request packet whether the first server has ownership of the first volume. Moreover, the instructions of the filter driver when executed by the first server further cause the first server to transfer the write request packet to a lower level driver for the mass storage device only if the first server determines that the first server has ownership of the first volume.
Pursuant to yet another embodiment of the present invention, there is provided a server cluster that includes a mass storage device having a plurality of volumes, a first server coupled to the mass storage device, and a second server coupled to the mass storage device. The first server includes a first file system, a first filter driver, and at least one first lower level driver for the mass storage device. The first filter driver is operable to process a first write request packet received from the first file system that is directed to a first volume of the plurality of volumes. The first filter driver is also operable to determine in response to processing the first write request packet whether the first server has ownership of the first volume. Moreover, the first filter driver is operable to transfer the first write request packet to the at least one first lower level driver only if the first server has ownership of the first volume.
The second server includes a second file system, a second filter driver, and at least one second lower level driver for the mass storage device. The second filter driver is operable to process a second write request packet received from the second file system that is directed to the first volume of the plurality of volumes. The second filter driver is also operable to determine in response to processing the second write request packet whether the second server has ownership of the first volume. Furthermore, the second filter driver is operable to transfer the second write request packet to the at least one second lower level driver only if the second server has ownership of the first volume.
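The two-server arrangement may be modeled with a short sketch in which both servers are coupled to the same storage object and each applies its own ownership check. The class names MassStorage and Server, the volume names, and the initial ownership assignments are hypothetical and chosen only for illustration.

```python
class MassStorage:
    """Shared mass storage device with a plurality of volumes."""

    def __init__(self, volumes):
        self.blocks = {v: [] for v in volumes}


class Server:
    """Server with a file system, a filter driver, and a lower level driver."""

    def __init__(self, name, storage, owned):
        self.name = name
        self.storage = storage
        self.owned = set(owned)  # volumes this server currently owns

    def _lower_level_write(self, volume, data):
        # Lower level driver: performs the write on the shared device.
        self.storage.blocks[volume].append((self.name, data))

    def _filter_driver(self, volume, data):
        # Filter driver: pass the request down only for owned volumes.
        if volume not in self.owned:
            return False
        self._lower_level_write(volume, data)
        return True

    def write(self, volume, data):
        """File system entry point: every write passes through the filter."""
        return self._filter_driver(volume, data)


storage = MassStorage(["vol1", "vol2"])
server1 = Server("server1", storage, owned=["vol1"])
server2 = Server("server2", storage, owned=["vol2"])

r1 = server1.write("vol1", b"x")  # first server owns vol1
r2 = server2.write("vol1", b"y")  # second server does not own vol1
```

In this model only the first server's data reaches the first volume; the second server's request directed to the same volume is stopped at its own filter driver.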
It is an object of the present invention to provide a new method and apparatus for protecting volumes of a disk in a server cluster environment.
It is another object of the present invention to provide an improved method and apparatus for protecting volumes of a disk in a server cluster environment.
It is yet another object of the present invention to provide a method and apparatus which maintain integrity of data stored on a storage device shared by servers of a server cluster.
It is still another object of the present invention to provide a method and apparatus which maintain data integrity of a shared storage device during the boot up process of the servers of a server cluster.
The above and other objects, features, and advantages of the present invention will become apparent from the following description and the attached drawings.