Field of the Invention
The present invention relates generally to a loosely-coupled multi-processor system that shares a commonly-used file, and it more particularly relates to processes that execute on each of the processors of the multi-processor system; the invention reduces the message traffic among the processors needed to achieve a single, consistent image of the commonly-used file.
FIG. 1 shows a system setting in which the present invention operates. In this figure there are a plurality of processing systems 10, 12, 14, 16, 18, preferably having a similar architecture, connected via a number of point-to-point connections 20, 22, 24, 26, 28, 30. One or more of the processing systems (storage systems) 16, 18 provide storage-related functions for the other processing systems (client systems) 10, 12, 14 and these storage systems 16, 18 are connected to one or more permanent storage devices 32, 34, 36, 38, such as hard disk drives. Each client system 10, 12, 14 is connected to each of the storage systems 16, 18, preferably using the point-to-point connections 20, 22, 24, 26, 28, 30 and the storage systems themselves are interconnected via a point-to-point connection 40 so that they can serve as a unified, redundant storage system for the client systems. (The storage systems are illustrated as distinct from the set of client processing systems, but the present invention does not depend upon this distinction.)
FIG. 2 shows a diagram of a representative computer system shown in FIG. 1 in which a central processing unit 60, a memory subsystem 62 and an I/O subsystem 64 are preferably interconnected by point-to-point links 66, 68, 70. The representative computer system is connected to the storage systems through the I/O subsystem via a link 72. (While these diagrams illustrate point-to-point connections, the current invention is not limited to that topology.) The software on each client system in FIG. 1 includes a number of processes (client processes) 42, 44, 46 that execute on that system, and each of these processes typically requires access to the file objects of the storage systems 16, 18. The client processes 42-46 make requests to obtain file objects from the storage systems by sending messages over the point-to-point links to a process called a disk process 48, 50 that executes on each of the storage systems. The disk process 48, 50, upon receipt of a message from a client process 42-46, sends a reply message to the message sender.
File objects, such as executables and library object files, that are requested by the client processes generally contain references that may need to be adjusted when the file object is downloaded on a particular client system so that it properly references other library files, possibly of a different version, on that client system. These references must be written into the contents of the file object, and the adjustment must be synchronized with the other client systems so that the file object contents remain consistent. This means that each client process 42-46 that uses the file object must determine whether the contents of the file object are properly adjusted for the process environment that the file object will encounter on the particular client system 10-14. If a file object is currently loaded and in use by any client process, it cannot be changed, but it is sharable as long as the other sharing client processes can use the file object with its current adjustments. It is necessary to have a protocol to determine when the current adjustments are appropriate and to preserve that state, and to deal with the case in which a client process must adjust the contents of a file object for proper use within its processing environment.
A protocol for achieving such a modification that is consistent with the processing requirements of processes on the other client systems is shown in FIG. 3 and operates as follows. The client process opens the file object in step 80 and then locks the file object in step 82. This requires that a lock message be sent to the disk process of a storage system that maintains the consistency of the file object. (Once the file object is locked, other processes that attempt to lock the file are delayed until the lock is released.) Next, in step 84, the client process reads the attributes and relevant contents from the file object. If the content of the file is suitable for use, as determined in step 86, the file is unlocked in step 88 and a success indication is returned. If the file object is not properly adjusted (i.e., the content is not suitable), as determined in step 86, for the client system processing environment based on the contents read from the file object and if the file object is not in use as determined in step 90, an adjustment is made in step 92 and the changes are written back to the contents of the file object. The file object is then unlocked in step 88 and a success indication is returned. If the file object is in use, as determined in step 90, the file object is unlocked in step 94 and a failure indication is returned.
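The steps of FIG. 3 can be sketched in Python as follows. This is a minimal, single-machine illustration and not the disclosed implementation: the `FileObject` class, its `adjusted_for` and `in_use` fields, and the function `open_and_check` are hypothetical names, a `threading.Lock` stands in for the lock messages exchanged with the disk process, and suitability is modeled as a simple comparison of environment labels.

```python
from dataclasses import dataclass, field
import threading


@dataclass
class FileObject:
    """Hypothetical stand-in for a file object managed by the disk process."""
    adjusted_for: str            # environment the references currently target
    in_use: bool = False         # True while some client process has it loaded
    lock: threading.Lock = field(default_factory=threading.Lock)


def open_and_check(file_obj: FileObject, client_env: str) -> bool:
    """Walk steps 80-94 of FIG. 3; return True on success, False on failure."""
    # Step 80: open (nothing to do in this sketch).
    # Step 82: lock; other lockers are delayed until the lock is released.
    file_obj.lock.acquire()
    try:
        # Step 84: read the attributes and relevant contents.
        suitable = file_obj.adjusted_for == client_env
        # Step 86: content already suitable, so report success.
        if suitable:
            return True
        # Step 90: a file object that is in use cannot be changed.
        if file_obj.in_use:
            return False
        # Step 92: adjust and write the changes back.
        file_obj.adjusted_for = client_env
        return True
    finally:
        # Step 88 (success) or step 94 (failure): unlock.
        file_obj.lock.release()
```

Note that the lock is taken on every call, even when step 86 finds the content suitable and nothing is written.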
FIG. 4 shows a scheduling diagram of the prior art method for synchronization to more clearly illustrate the approximate timing of events at the client system and the storage system, and similar figures are used throughout this specification to illustrate different aspects of the present invention. In FIG. 4, the upper line 100 represents an event line for the client system and the lower line or bar 102 represents an event line for the storage system. A line segment 104, 108, 112, 116 directed towards the storage system line indicates a message sent from the client system to the storage system (disk process) and a line segment 106, 110, 114 directed towards the client system represents a message sent from the storage system to the client system. The slope of the directed line segment simply indicates that the message travels at some finite speed between the two systems and the label on the directed line segment indicates the type of message being sent.
The first event 104 depicted in FIG. 4 is the client system transmitting an open request to the disk process of the storage system. This message is received and, in response, the disk process sends an open acknowledge message 106 back to the client system, which then proceeds to make a lock request 108. This message arrives at the disk process, which then grants the request 110 to lock the file object. Following the receipt of the lock-granted message 110, a read request 112 is made of the file object by the client system to the storage system, and when the message arrives the storage system returns the file contents 114 that were requested back to the client system. The client system then determines whether the file object is properly adjusted for running in the environment of the client system and, in this example, finds that the file object is properly adjusted and no changes need to be written. Finally, an unlock message 116 is sent to the disk process, releasing the file object. As is apparent from the scheduling diagram, the file object stays locked from the time of the lock grant 110 to the time that the unlock request 116 is received and executed at the disk process.
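The message sequence of FIG. 4 can also be written out as data, which makes the span during which the file object stays locked easy to see. The sketch below is illustrative only: the `Message` tuple, the `FIG4_TRACE` list and the helper `messages_while_locked` are hypothetical names, while the reference numerals and message labels are taken from the description above.

```python
from typing import List, NamedTuple


class Message(NamedTuple):
    numeral: int   # reference numeral of the line segment in FIG. 4
    sender: str    # "client" or "storage"
    kind: str      # label on the directed line segment


FIG4_TRACE: List[Message] = [
    Message(104, "client", "open request"),
    Message(106, "storage", "open acknowledge"),
    Message(108, "client", "lock request"),
    Message(110, "storage", "lock granted"),
    Message(112, "client", "read request"),
    Message(114, "storage", "file contents"),
    Message(116, "client", "unlock"),
]


def messages_while_locked(trace: List[Message]) -> List[Message]:
    """Return the messages from the lock grant through the unlock, inclusive."""
    kinds = [m.kind for m in trace]
    start = kinds.index("lock granted")
    end = kinds.index("unlock")
    return trace[start:end + 1]
```

In this trace the read request and its reply both fall inside the locked span, even though nothing is written to the file object.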
Though the above protocol is effective at maintaining the consistency of the shared file among the competing processes of the client systems, it is expensive in terms of the messages that are required to be sent to and from the disk process. Two messages, a lock and an unlock, are required by each competing process to determine whether the file is in proper condition for use by that process, regardless of whether or not the file contents must be adjusted. The protocol is also expensive in terms of the lack of concurrency that such a process causes to the competing processes because each process must lock the file in order to determine whether an adjustment is required. This does not permit any other process access to the file to determine if the condition of the file is proper for the other processes. If the process cannot obtain the lock because another process has the lock, it must wait for the lock to be released before it can even examine the file.
Therefore, there is a need for an improved protocol that reduces the message traffic to and from the disk process and improves the concurrency among the several client processes.
The present invention is directed towards the above need. It provides a method for sharing among a plurality of competing processes a file object that includes file contents and a state that describes whether the file contents are inconsistent and whether the file object is in the use of a competing process. The state has a value that is either ‘uncommitted’, ‘inconsistent’ or ‘committed’. The method includes determining the state value of the file object and whether or not the file content is suitable for use by a specific one of the competing processes. If the state value of the file object is not ‘committed’ and either the state value is ‘inconsistent’ or the file content is not suitable for use by the specific one of the competing processes, the method then obtains exclusive access to the file object, adjusts the contents of the file object, sets the state of the file object to ‘committed’, and relinquishes exclusive access to the file object. If the state value of the file object is not ‘committed’ and the state value is not ‘inconsistent’ and the file content is suitable for use by the specific one of the competing processes, the method sets the state of the file object to ‘committed’. If the state value of the file object is ‘committed’ and the file content is suitable for use by the specific process, the method shares the committed file; otherwise, the method returns a failure status.
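The decision logic of the method summarized above can be sketched as follows. This is a simplified, single-process illustration under stated assumptions, not the claimed implementation: the `State` enum, the `SharedFile` class, its `adjusted_for` field and the function `share` are hypothetical names, a `threading.Lock` stands in for exclusive access, suitability is modeled as equality of environment labels, and the sketch does not address how the state change itself is made atomic across client systems.

```python
from dataclasses import dataclass, field
from enum import Enum
import threading


class State(Enum):
    UNCOMMITTED = "uncommitted"
    INCONSISTENT = "inconsistent"
    COMMITTED = "committed"


@dataclass
class SharedFile:
    """Hypothetical file object: contents summary plus the three-valued state."""
    adjusted_for: str
    state: State = State.UNCOMMITTED
    lock: threading.Lock = field(default_factory=threading.Lock)


def share(file_obj: SharedFile, client_env: str) -> bool:
    """Return True if the caller may use the file object, False on failure."""
    suitable = file_obj.adjusted_for == client_env
    if file_obj.state is not State.COMMITTED:
        if file_obj.state is State.INCONSISTENT or not suitable:
            # Obtain exclusive access, adjust, commit, relinquish access.
            with file_obj.lock:
                file_obj.adjusted_for = client_env
                file_obj.state = State.COMMITTED
            return True
        # Uncommitted and already suitable: commit without adjusting.
        file_obj.state = State.COMMITTED
        return True
    # Committed: share only if the content suits this process.
    return suitable
```

For example, a first caller that finds an uncommitted, suitable file commits it without taking the lock, and later callers with the same environment share it without any state change.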
One advantage of the present invention is that the message traffic is greatly reduced from two messages for each check of the shared file to either none or one message in the most common cases. One message is needed if the state value of the file object is ‘uncommitted’ and its contents are suitable for use. No message is needed if the state value of the file object is ‘committed’ and the file content is suitable for shared use by the specific process. Only when the file must be adjusted are more messages required. However, that case occurs rarely.
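The message counts stated above can be tabulated in a small helper. The function name `messages_for_check` is hypothetical, and the sketch deliberately returns `None` for the cases in which the text gives no exact figure (the adjustment case, described only as requiring more messages, and the failure case).

```python
from typing import Optional


def messages_for_check(state: str, suitable: bool) -> Optional[int]:
    """Messages needed to check the shared file, per the cases stated in the text."""
    if state == "committed" and suitable:
        return 0   # the committed file is shared with no message
    if state == "uncommitted" and suitable:
        return 1   # one message sets the state to 'committed'
    return None    # no exact count is given for the remaining cases
```

By comparison, the prior art protocol needs two messages, a lock and an unlock, for every check.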
Another advantage is that the client processes can each operate with a greater degree of concurrency because each of the client processes has access to the shared file without a lock being required in order to determine whether the file contents are suitable for use. In most cases the file is in the proper condition for that client process and needs no adjustment, which means that no locks are required and a process can continue its execution of the shared file without delay.