In a multiprocessor system with a plurality of central processors (CP's) connected to a common storage controller (SC), multiple requesters may be attempting to access the resources controlled by the SC in a given cycle. The SC controls access to common storage, in the form of a store-in level 2 (L2) cache. The SC also controls access to main memory. Both the L2 cache and the main memory are divided into independent sections known as interleaves to allow concurrent access by multiple requests.
In an IBM multiprocessor system under development, requests may originate from sources external to the SC, namely the CP's, the I/O adapters, and, in the case of a system consisting of multiple CP/SC units, another SC. Requests may also originate from logic stations internal to the SC, specifically, the logic which processes fetch requests from the SC to the main memory and the logic which processes stores from the SC to the main memory.
There are two processing pipelines within the SC. Requests must be granted priority before they can be gated into one of the pipelines to begin executing. In most cases, a request will complete its execution after a single pass through the pipeline. In other cases, a request will have its processing interrupted for some reason, and will have to make additional pipeline passes. Priority must be granted for each pipeline pass.
Each SC is attached to six CP's, which originate fetch and store requests to the SC. Each CP contains a store-through level 1 (L1) cache, which results in a large amount of store traffic being sent from the CP to the SC's store-in L2 cache. CP store requests target a specific pipeline within the SC, while fetch requests may target either pipeline. The SC contains one dedicated fetch request register for each CP, and two stacks per CP for store requests, one per pipeline, each of which can hold up to eight store requests. The oldest store for a given CP must always be processed first, so only one store per CP per pipeline can be valid for priority. Thus, there can be up to twelve CP requests (six fetches, six stores) competing for access to one pipeline within the SC in a given cycle.
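The store-stack behavior described above can be illustrated with a minimal sketch. The class and method names below are hypothetical, and the behavior on a full stack (the push is simply refused) is an assumption, since the text states only the stack depth and the oldest-first rule.

```python
from collections import deque

STACK_DEPTH = 8  # each store stack holds up to eight store requests (from the text)


class StoreStack:
    """Hypothetical model of one CP's store stack for one pipeline."""

    def __init__(self):
        self._entries = deque()

    def push(self, store):
        # Assumption: a full stack refuses the new store; the text does not
        # say how the CP is held off in that case.
        if len(self._entries) >= STACK_DEPTH:
            return False
        self._entries.append(store)
        return True

    def valid_for_priority(self):
        # Only the oldest store of this CP may compete for the pipeline.
        return self._entries[0] if self._entries else None

    def complete_oldest(self):
        # Remove the oldest store once it has finished executing.
        return self._entries.popleft()
```

With six CP's and one such stack per CP per pipeline, at most six stores are presented to a given pipeline's priority logic in any cycle, which matches the count given above.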
There are two I/O adapters attached to each SC, which can also send both fetch and store requests. There are four dedicated I/O request registers in the SC for each pipeline, two per I/O adapter, which can be used for either type of request. This can result in four I/O requests competing for priority for a given pipeline in the same cycle.
Requests from the remote SC may be fetch or store commands. For each pipeline, there are four remote request registers in the SC, two dedicated to fetches and two dedicated to stores. Thus, there can be four remote requests competing for access to one pipeline in the same cycle.
Fetch requests from the SC to the main memory are processed by the hardware facilities known as the line fetch address registers (LFAR's). There are four of these per pipeline, all of which can be making requests simultaneously. Similarly, there are hardware facilities to handle the stores from the SC to the main memory necessitated by the store-in design of the L2 cache. These are the line store address registers (LSAR's). There are four per pipeline, and they can all be competing for priority in a given cycle.
It can be seen that there may be as many as 28 valid requests in the SC competing for priority for one pipeline in a given cycle: six CP fetches, six CP stores, four I/O adapter requests, four remote SC requests, four LFAR requests, and four LSAR requests. There may be as many as 50 valid requests overall, since the six CP fetch requests may target either pipeline and are therefore counted only once across the two pipelines. Requests of the same type compete with each other for priority first, in the "pre-priority" stations. One request of a given type, for example, one CP fetch request, is chosen by the pre-priority logic and sent to the overall priority arbitration logic, where it competes for priority with other types of requests. The pre-priority stations employ standard priority algorithms such as round-robin or pseudo-LRU (in which the request that completed an operation most recently has the lowest priority). Different pre-priority stations use different algorithms, to optimize performance for the specific request type being processed.
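The request counts above, and one of the pre-priority algorithms named, can be sketched as follows. The register counts come from the description; the function name round_robin_select and its arguments are hypothetical, and the round-robin shown is a generic form rather than the specific logic of any pre-priority station.

```python
# Request registers per pipeline, as listed in the description.
PER_PIPELINE = {
    "cp_fetch": 6, "cp_store": 6, "io": 4,
    "remote_sc": 4, "lfar": 4, "lsar": 4,
}

per_pipeline_max = sum(PER_PIPELINE.values())

# All registers except the six CP fetch registers exist once per pipeline;
# each CP fetch register may target either pipeline, so it is counted once.
overall_max = (2 * (per_pipeline_max - PER_PIPELINE["cp_fetch"])
               + PER_PIPELINE["cp_fetch"])


def round_robin_select(valid, last_granted):
    """Generic round-robin: pick the next valid requester after last_granted.

    valid        -- list of booleans, one per requester of a single type
    last_granted -- index of the requester granted most recently
    Returns the chosen index, or None if no requester is valid.
    """
    n = len(valid)
    for offset in range(1, n + 1):
        idx = (last_granted + offset) % n
        if valid[idx]:
            return idx
    return None
```

Because the search starts just past the last requester granted, that requester is considered last on the next selection, which is what keeps requesters of the same type from starving one another at this stage.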
The overall priority arbitration logic uses a ranked priority order scheme; that is, each category of request has a fixed priority relative to other request types. The priority order is assigned inversely to the relative frequency of operations, so that less-frequent operations are not locked out by more-frequent ones. Requests from the remote SC, the least frequent operations, have the highest priority. Next in priority are requests from LFAR and LSAR, followed by I/O adapter requests, CP fetches, and CP stores, which have the lowest priority because they are the most frequent operations.
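A ranked priority scheme of this kind can be sketched in a few lines. The category names and the function overall_priority are hypothetical. The text does not state the relative order of LFAR and LSAR, so placing LFAR ahead of LSAR here is purely an assumption made for illustration.

```python
# Fixed rank, highest priority first. The LFAR-before-LSAR ordering is an
# assumption; the description only says both rank below the remote SC and
# above the I/O adapters.
RANK = ["remote_sc", "lfar", "lsar", "io", "cp_fetch", "cp_store"]


def overall_priority(candidates):
    """Select one request for a pipeline in a given cycle.

    candidates -- dict mapping a category name to the single request that
                  the category's pre-priority station put forward
    Returns (category, request) for the highest-ranked category present,
    or None when no category has a request this cycle.
    """
    for category in RANK:
        if category in candidates:
            return category, candidates[category]
    return None
```

For example, overall_priority({"cp_store": "s1", "io": "io0"}) selects the I/O request, since I/O adapter requests outrank CP stores.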
The overall priority arbitration logic selects up to two requests, one per pipeline, per cycle. Once a request is selected, its associated address and control information are gated into one of two internal processing pipelines within the SC, and it begins to execute. As mentioned previously, a request will usually complete its processing during a single pipeline pass. If a request's execution is interrupted for some reason, it must go through pre-priority and overall priority arbitration again for each additional pipeline pass.
As a request executes, it will utilize resources within the SC, such as the cache interleaves, or the hardware facilities (LFAR's) used to fetch data from main store to the SC. Those resources will be unavailable for the use of other requesters for one or more cycles. Because some checking for resource availability is done during pre-priority arbitration, and some is done during operation execution, a busy resource may prevent other requesters either from getting priority or from completing their processing.
Deadlocks can occur among SC requesters in two ways: 1) higher-priority requesters may use up priority grant cycles, preventing lower-priority requesters from receiving a grant, and 2) a sequence of requests may busy resources in the SC in such a way that other operations are unable to request priority, or are unable to complete their execution even if they have gotten priority. Case (2) can result in the expected case of higher-priority requests blocking lower-priority ones, or the less-expected case of requests of equal priority (for example, CP fetches) locking each other out, or even the case of lower-priority requests locking out higher-priority ones. An example of case (1) which has been observed is that of I/O requests occurring in a large burst, and as a result locking out CP fetches.
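Case (1) can be illustrated with a toy simulation of a single pipeline under fixed ranked priority. This is a simplified sketch, not a model of the actual hardware: it assumes that a new I/O request is valid in every cycle of the burst, that resources are never busy, and that LFAR ranks ahead of LSAR. The function name and parameters are hypothetical.

```python
# Fixed rank, highest priority first (LFAR/LSAR relative order assumed).
RANK = ["remote_sc", "lfar", "lsar", "io", "cp_fetch", "cp_store"]


def cycle_cp_fetch_granted(io_burst_cycles, total_cycles):
    """Return the cycle in which a CP fetch pending from cycle 0 is granted.

    io_burst_cycles -- number of consecutive cycles, starting at cycle 0,
                       in which an I/O request is also valid (assumption)
    total_cycles    -- how many cycles to simulate
    Returns None if the CP fetch is never granted within total_cycles.
    """
    for cycle in range(total_cycles):
        pending = {"cp_fetch"}
        if cycle < io_burst_cycles:
            pending.add("io")
        # The category earliest in RANK wins the single grant this cycle.
        winner = min(pending, key=RANK.index)
        if winner == "cp_fetch":
            return cycle
    return None
```

With no I/O traffic the fetch is granted in cycle 0; with a burst of five cycles it waits until cycle 5; and if the burst lasts as long as the simulation, it is never granted, which is the lockout described above.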
In extreme cases, a lockout situation can result in a severe recovery action and possibly a system outage. Experience has shown that regardless of how much care is taken to create a design which has no potential for deadlocks, specific sequences of requests can occur which cause one or more requestors to be locked out.
A prior art approach to detecting and preventing deadlocks between requests is described in U.S. Pat. No. 5,016,167 (issued to Nguyen et al. on May 14, 1991). Their method of detecting deadlocks is to have each CP requester count the number of times its request is rejected due to unavailability of resources (main memory interleaves). If a requestor exceeds a certain number of rejects, it generates an "inhibit" signal, which is used to block other CP requests from getting priority.
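The reject-counting approach summarized above can be sketched as follows. This is only an illustration of the general idea as described in this paragraph, not the implementation of the cited patent. The class name, the threshold value, and the resetting of the count when the request completes are all assumptions.

```python
class RejectCounter:
    """Hypothetical per-CP reject counter, as paraphrased from the text."""

    def __init__(self, threshold):
        self.threshold = threshold  # assumed programmable limit
        self.rejects = 0

    def on_reject(self):
        # Called each time the request is rejected for a busy resource.
        self.rejects += 1

    def on_complete(self):
        # Assumption: the count is cleared once the request completes.
        self.rejects = 0

    @property
    def inhibit(self):
        # The inhibit signal is raised once the limit has been exceeded.
        return self.rejects > self.threshold
```

Note that a counter of this kind advances only when a request has been started and then rejected, so a request that never receives a grant never raises the inhibit signal.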
In our invention, the deadlock detection is located in the storage controller, a centralized piece of logic. This allows the detection of potential deadlock situations among all types of requestors. It will resolve lockout scenarios among requesters of the same relative priority or those of different priorities. Additionally, the blocking of other requestors is done within the SC, at the request register level. This avoids the complexity and delay involved in retrying an operation from a requester external to the SC. A key difference is that our invention will detect a potential lockout of a requestor which hasn't been granted priority and hasn't made a pipeline pass, since it is based on the amount of time the request has been valid in the SC without completing, rather than on the number of times the request has started to execute.
In our invention, the blocking of a requester is conditional on whether that requester has already started to execute. If it has, it will not be affected by the deadlock detection logic. This avoids the possibility of blocking an operation which may have to complete before the locked-out operation can complete.
Instead of a counter, our invention makes use of an internal pulse, which has been designed to occur at the correct interval to resolve potential deadlocks before a storage controller hang is detected. The pulse is received by specialized logic stations, which allow the interval used to detect deadlocks to be programmed, and which provide the ability to disable the deadlock resolution function for specific categories of requests.
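The time-based detection and conditional blocking described in the preceding paragraphs can be sketched as below. The detailed mechanism is not given in this section, so everything here is an assumption made for illustration: the names Request and DeadlockMonitor, the idea of counting pulses seen while a request remains valid, the pulses_to_trigger parameter standing in for the programmable interval, and the per-category disable set.

```python
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    category: str
    started: bool = False   # True once the request has begun to execute
    pulses_seen: int = 0    # pulses observed while valid and incomplete


class DeadlockMonitor:
    """Hypothetical centralized detector driven by a periodic pulse."""

    def __init__(self, pulses_to_trigger=2, disabled_categories=()):
        self.pulses_to_trigger = pulses_to_trigger
        self.disabled = set(disabled_categories)

    def on_pulse(self, requests):
        """Call once per pulse with every request still valid in the SC.

        Returns the requests judged to be potentially locked out.
        """
        locked = []
        for r in requests:
            if r.category in self.disabled:
                continue  # detection switched off for this category
            r.pulses_seen += 1
            if r.pulses_seen >= self.pulses_to_trigger:
                locked.append(r)
        return locked

    @staticmethod
    def eligible(requests, locked):
        """Requests still allowed to compete for priority.

        While any request is locked out, other requests are blocked
        unless they have already started to execute.
        """
        if not locked:
            return list(requests)
        locked_ids = {id(r) for r in locked}
        return [r for r in requests if id(r) in locked_ids or r.started]
```

Requiring two pulses rather than one is a design choice in this sketch: a request that became valid just before a pulse is not flagged until it has been valid for at least one full pulse interval. Because the count depends only on how long the request has been valid, a request that has never been granted priority is detected just as readily as one that has been rejected repeatedly.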