In various computing environments, a request by one requester (e.g., a client) for a resource cannot be satisfied until control of that resource is relinquished by a previous requester. One such computing environment is the Distributed File System (DFS) offered by International Business Machines Corporation (IBM).
The DFS product, which supports Server Message Block (SMB) clients, is used by, for example, the OS/390 operating system of IBM to provide a file serving protocol. DFS, and specifically SMB (also known as Common Internet File System (CIFS)), allows clients to cache data via a form of locking called an opportunistic lock (oplock). An oplock is requested on the file open SMB request, and the server grants the oplock depending on whether other clients have the file open at the same time. If a client has an oplock, then that client can cache file data and/or byte range lock requests for that file, and can perform read-ahead and write-behind optimizations.
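By way of illustration only, the grant decision described above may be sketched as follows. The names `FileState` and `open_file` are hypothetical and are not part of DFS or SMB; the sketch covers only the grant decision and assumes a single, exclusive oplock per file.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FileState:
    """Hypothetical per-file server state; not an actual DFS structure."""
    openers: set = field(default_factory=set)   # clients that have the file open
    oplock_holder: Optional[str] = None         # client holding the oplock, if any


def open_file(state: FileState, client: str, wants_oplock: bool) -> bool:
    """Record the open and return True if an oplock was granted."""
    others = state.openers - {client}           # other clients with the file open
    state.openers.add(client)
    if wants_oplock and not others and state.oplock_holder in (None, client):
        state.oplock_holder = client            # sole opener: caching is safe
        return True
    return False                                # file is shared: no oplock
```

In this sketch, a first client that opens the file alone is granted the oplock, while a second client opening the same file is not. Breaking the first client's oplock is a separate step that the sketch does not model.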
Oplocks are broken by the server when another client attempts to open the file or when another client requests an operation that might change the file, such as a rename or delete. In these cases, the server sends a callback to the client (called an oplock break) that tells the client it lost its oplock. The client responds either with a file close SMB or an oplock break notification via a lockingX SMB. However, if the client has dirty data or cached byte range locks, it is allowed to flush the data and obtain byte range locks before it closes the file or sends the break notification via the lockingX SMB. The client request (or requests) that forced the server to break the other client's oplock is made to wait for the oplock notification or file close from the original client that held the oplock.
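The break sequence described above may be sketched, under simplifying assumptions, as follows. The class `Oplock` and the functions `break_oplock` and `client_on_break` are hypothetical; a `threading.Event` stands in for the file close or lockingX notification, and a plain list stands in for the write requests that flush dirty data.

```python
import threading


class Oplock:
    """Hypothetical oplock record for one file."""
    def __init__(self, holder):
        self.holder = holder
        self.released = threading.Event()   # set once the holder closes or acknowledges


def break_oplock(oplock, send_callback, timeout=None):
    """Server side: send the break callback, then wait for the holder's response."""
    send_callback(oplock.holder)            # stand-in for the oplock break message
    return oplock.released.wait(timeout)    # True once the holder has responded


def client_on_break(oplock, dirty_blocks, written):
    """Client side: flush cached data first, then acknowledge the break."""
    written.extend(dirty_blocks)            # stand-in for write requests to the server
    dirty_blocks.clear()
    oplock.released.set()                   # stand-in for the close or lockingX notification
```

The thread that calls `break_oplock` is the one processing the request that forced the break, and it remains blocked until the holder responds.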
Since any client request could possibly wait for a callback and response from one or more clients, there must be processing threads available to handle the callback response(s) from the client(s) holding the oplock, or a deadlock could occur. A single thread pool, whatever its size, cannot solve the problem, since the number of requests waiting on callback responses can grow to equal the number of threads in the pool. For example, assume that Client A currently has a hold (e.g., an oplock or a token) on Resource X (e.g., a file) and is currently updating Resource X. Then, Client B requests that resource. The server breaks the oplock for Resource X, even though Client A is not done updating Resource X. Eventually, Client A sends a response to the callback; however, there may be no threads in the thread pool to handle the response, since all of the threads are already processing client requests that are waiting for the callback response from Client A.
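The starvation described above can be reproduced in miniature. The following is a sketch only, using Python's standard `ThreadPoolExecutor` as a stand-in for the server's thread pool. The function name `response_is_starved` is hypothetical, and the final `released.set()` exists solely so the demonstration can terminate; in the scenario described, nothing would release the waiting threads.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def response_is_starved(pool_size, waiting_requests, observe_seconds=0.2):
    """Return True if the break response cannot run while requests are blocked."""
    released = threading.Event()            # Client A's response to the callback

    def blocked_request():
        # A client request waiting for Client A's callback response.
        return released.wait(timeout=5)

    def break_response():
        # Processing Client A's response is what unblocks the waiters.
        released.set()

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        for _ in range(waiting_requests):
            pool.submit(blocked_request)
        response = pool.submit(break_response)   # queued behind the waiters
        time.sleep(observe_seconds)
        starved = not response.done()            # no free thread picked it up
        released.set()                           # demo only: let the pool drain
    return starved
```

Whenever the number of waiting requests equals the pool size, the response is starved, whether the pool holds two threads or four. A pool with one spare thread handles the response, but only until one more request arrives and blocks.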
At least two approaches have been taken in an attempt to avoid the above deadlock situation. One approach is referred to as the single pool approach. With the single pool approach, when a thread is to wait for an oplock response, rather than blocking the thread, the state of the in-progress operation is saved and the thread is made available to process another request, including an oplock break response. Thus, with this approach, state must be maintained for each operation. Further, making a thread available again requires that each routine on the call path, up to the point at which the need for an oplock break is detected, be prepared to receive a special return code from the routines it calls, indicating whether the operation was completed or was simply placed on hold pending an oplock break response. Each routine must then return to its caller, collapsing the program stack on the thread so that the thread becomes available.
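A minimal sketch of the single pool approach follows, assuming a three-level call path. All names (`Status`, `check_oplock`, `open_handler`, `dispatch`, `on_break_response`) and the dictionary used as server state are hypothetical, chosen only to show how the special return code must be tested at every level.

```python
from enum import Enum, auto


class Status(Enum):
    DONE = auto()
    PENDING_OPLOCK_BREAK = auto()   # hypothetical "placed on hold" return code


def check_oplock(request, state):
    if request["file"] in state["oplocked"]:
        state["parked"].append(request)   # save operation state for later resumption
        return Status.PENDING_OPLOCK_BREAK
    return Status.DONE


def open_handler(request, state):
    status = check_oplock(request, state)
    if status is not Status.DONE:
        return status                     # every level must test the code and unwind
    state["opened"].append(request["file"])
    return Status.DONE


def dispatch(request, state):
    # Top of the call path; returning frees the thread for other requests.
    return open_handler(request, state)


def on_break_response(file, state):
    """Resume the operations parked on this file once its oplock is released."""
    state["oplocked"].discard(file)
    resumable = [r for r in state["parked"] if r["file"] == file]
    state["parked"] = [r for r in state["parked"] if r["file"] != file]
    return [dispatch(r, state) for r in resumable]
```

Even in this small sketch, the parked-request bookkeeping and the return-code check appear in every routine between the dispatcher and the point of detection.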
This approach increases the complexity of the code that processes the individual SMBs and also increases path length, since many routines in the path would have to update a state block and also check, upon the return of a called routine, whether the request was processed or whether it was queued waiting on an oplock break response. Hence, this approach is considered expensive and degrades system performance.
Another approach is a dual pool approach. With this approach, a primary thread pool handles any client requests, while a secondary thread pool handles only requests that cannot block waiting to obtain a hold on a resource. Hence, the secondary pool handles requests that are guaranteed not to wait on resources, such as requests to store back data or release resources. For this approach, the client indicates on the request whether the request is eligible for the secondary pool. Thus, it is the responsibility of the client software to determine which thread pool is to be used by the server and to provide this indication in the request itself.
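The dual pool approach may be sketched as follows, again using Python's standard `ThreadPoolExecutor` as a stand-in. The pool sizes, the `secondary_eligible` flag and the `route` function are hypothetical; the point illustrated is only that the routing decision is read from a field supplied by the client.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pool sizes, chosen only for illustration.
primary = ThreadPoolExecutor(max_workers=4, thread_name_prefix="primary")
secondary = ThreadPoolExecutor(max_workers=2, thread_name_prefix="secondary")


def route(request):
    """Pick the pool from the flag the client placed in the request."""
    pool = secondary if request.get("secondary_eligible") else primary
    return pool.submit(request["handler"])


def worker_name():
    """Example handler that reports which pool's thread ran it."""
    return threading.current_thread().name
```

A request carrying `secondary_eligible` set to true runs on a thread of the secondary pool, and any other request runs on the primary pool. The server in this sketch has no means of its own to decide eligibility, which is why the approach depends on client software.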
The dual thread pool scheme avoids deadlock without the implementation expense or performance degradation of the single pool scheme. However, the problem with the dual pool approach is that client software is required to indicate, on an individual request basis, which thread pool the server may use. Thus, special client software is needed, which results in extra administrative expense to customers.
Based on the foregoing, a need still exists for an approach to avoid deadlocks, which is more efficient, simpler and less expensive than previous approaches, and does not require additional or special client software. A further need exists for an approach that enables the dynamic assignment of thread pools.