The present invention relates generally to the management of shared devices in a data processing system.
In data processing systems, device adapters are used to control attached devices for use by a host computer. Such device adapters may be incorporated into the host computer (internal) or embodied as external device adapters. An example of a device adapter incorporated into a host computer is a device adapter board that may have associated device driver software. An example of a device adapter embodied as an external device adapter is an external device controller unit that is connected by a communication link to the host computer.
Device adapters are typically used to control storage and other peripheral devices, to interpret requests from the computer into a form suitable for use by the devices, and to interpret responses from the devices into a form suitable for use by the computer. Typically, devices such as disks are connected to a single adapter and are used by a single host computer. More recently, multi-adapter environments have been developed in which devices are connected to multiple adapters that, in the case of internal adapters, may exist in one or more host computers. In one known arrangement, one of the device adapters is nominated as a master adapter and all I/O operations to the devices are coordinated by the master adapter. This arrangement has the disadvantage of high workload at the master adapter. An alternative arrangement may therefore be used in which the multiple adapters are able to independently initiate I/O operations to the shared devices. For some types of I/O operations, for example RAID 5 writes, this second arrangement requires the use of a locking technique to prevent corruption of data on the shared device.
In a multi-adapter environment where each adapter can independently initiate I/O operations to the shared devices, the adapters may place differing workloads upon the devices and knowledge regarding the sizes of the workloads may not be shared between the adapters. In this case, when one adapter sends work to a device, it has no knowledge of the amount of work other adapters may also have sent to the device. Hence it can be difficult to predict how long it will be until the work is actually dequeued by the device and executed. Consequently, in this environment, it is difficult to handle timeouts effectively. If an adapter does not receive timely notification from a device that the work has been completed, it may assume that the device is dead and initiate a system reset. This assumption may, however, be erroneous, as the delay may be due, for example, to the device processing work from another adapter. Thus command time-out values are difficult to set accurately. One solution to this problem would be to set long time-outs. However, this would delay recovery to an unacceptable degree when there is a genuine malfunction.
Another solution to this problem might involve each adapter being informed by the other adapters of all the commands they have sent to the shared device. Each adapter could use this information to build up an overall picture of activity at the shared device and set the timeouts accordingly. However such a solution will generally not be practicable due to the unacceptable overhead caused by the inter-adapter communication.
It is therefore an object of the invention to provide a time-out management technique which overcomes the disadvantages of the prior art.
According to one aspect of the present invention therefore, there is provided a method for managing devices in a data processing system that includes a plurality of device adapters connected for independent communication with at least one shared device. The method comprises the steps of: issuing a command from a first of the plurality of adapters to the at least one shared device; setting, in the first adapter, first and second timeouts associated with the command; on expiration of the first timeout, issuing a message from said first adapter to other(s) of the plurality of adapters to request the other adapter(s) to notify the first adapter of any work requested of the shared device by the other adapter(s); and on expiration of the second timeout, initiating a recovery operation in the data processing system.
In this way, the problem of erroneous resets described above is avoided by the use of first and second timeouts. On expiry of a first timeout, the system enters a Device Suspicious mode and a message is broadcast to the other adapters connected to the shared device to cause each of the other adapter(s) to provide a notification of any work that it (they) initiated at the shared device. If the second timeout expires at the first adapter before receiving any such notification then the first adapter assumes that the device is dead and initiates a recovery action. As will be described below in relation to the preferred embodiment, if notification is received before expiry of the second timeout then the second timeout value is extended. The second timeout may be extended multiple times in response to multiple completion notifications. However an upper limit to the second timeout value will generally be employed in order not to significantly delay a device reset in the event of a device malfunction. As an alternative to extending the second timeout, a third timeout could be set on receipt of each notification.
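By way of illustration only, the two-timeout behaviour described above might be sketched as the following small state machine. This is a simulation under stated assumptions rather than an implementation of the invention: the names `CommandTimer`, `State`, `tick` and `on_peer_notification`, the fixed per-notification extension, and all numeric values are hypothetical, and time is passed in explicitly instead of being read from a clock.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    NORMAL = "normal"
    DEVICE_SUSPICIOUS = "device_suspicious"
    RECOVERY = "recovery"
    COMPLETED = "completed"


@dataclass
class CommandTimer:
    """Tracks both timeouts for one outstanding command (times in seconds)."""
    issued_at: float
    first_timeout: float        # hypothetical value chosen by the adapter
    second_timeout: float       # hypothetical value chosen by the adapter
    extension: float            # assumed amount added per peer notification
    max_second_timeout: float   # upper limit so a reset is not delayed forever
    state: State = State.NORMAL
    query_sent: bool = False    # True once the request to peers has gone out

    def tick(self, now: float) -> State:
        """Advance the state machine to time `now`."""
        if self.state in (State.COMPLETED, State.RECOVERY):
            return self.state
        elapsed = now - self.issued_at
        if elapsed >= self.second_timeout:
            self.state = State.RECOVERY           # device assumed dead
        elif elapsed >= self.first_timeout:
            self.state = State.DEVICE_SUSPICIOUS  # ask peers about their work
            self.query_sent = True
        return self.state

    def on_peer_notification(self, now: float) -> None:
        """A peer reported work at the device: extend the second timeout."""
        still_waiting = (now - self.issued_at) < self.second_timeout
        if self.state is State.DEVICE_SUSPICIOUS and still_waiting:
            self.second_timeout = min(
                self.second_timeout + self.extension, self.max_second_timeout
            )

    def on_completion(self) -> None:
        """The device completed the command issued by this adapter."""
        self.state = State.COMPLETED
```

In this sketch, repeated notifications push the second timeout outward only until `max_second_timeout` is reached, after which a further delay leads to recovery.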
In the preferred embodiment, the first and second timeouts are set on issuance of the command by the first adapter to the shared device. In an alternative arrangement, the second timeout could be set on expiry of the first timeout.
In the preferred embodiment, the message from the first adapter requests the other adapter(s) to notify the first adapter of the completion of every command that the other adapter(s) have outstanding at the shared device. Thus when the first adapter receives notification from another adapter, it knows that the shared device is still completing work for the other adapter and can therefore extend the second timeout value in the expectation that the device will subsequently execute the command issued by the first adapter.
In accordance with the preferred embodiment, when the system enters the Device Suspicious mode, the message sent from the first adapter to the other adapter(s) also requests the other adapter(s) to halt sending work to the shared device. The first adapter itself also halts sending work to the shared device. On receipt of the message at each of the other adapter(s), the adapter(s) each issue an Ordered Test Unit Ready command to the shared device. This command joins the command queue at the shared device awaiting execution. When the Ordered Test Unit Ready command reaches the head of the queue, it is executed by the shared device and a completion message is sent to the issuing adapter. Receipt of the completion message therefore provides an indication as to when all the outstanding work initiated by the adapter has been completed. At this point, the adapter can restart sending work to the device. Use of the Ordered Test Unit Ready command provides an additional benefit: if the adapter believes that it has work outstanding when it receives a completion message, then it knows that there is a problem with the device itself, with the adapter or with the connecting link. In this case, an appropriate system reset is performed.
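The role of the Ordered Test Unit Ready command as a marker at the tail of an adapter's outstanding work might be illustrated, purely as a sketch, by the following simulation. The device is modelled as a simple first-in, first-out queue, which is an assumption made for illustration; the class names `SharedDevice` and `Adapter`, the string `"ORDERED_TUR"`, and the return values `"ok"`, `"resume"` and `"reset"` are all hypothetical.

```python
from collections import deque

ORDERED_TUR = "ORDERED_TUR"  # hypothetical stand-in for Ordered Test Unit Ready


class SharedDevice:
    """Shared device modelled as a strict FIFO command queue (assumption)."""

    def __init__(self):
        self.queue = deque()

    def submit(self, adapter_id, command):
        self.queue.append((adapter_id, command))

    def execute_next(self):
        """Execute the command at the head of the queue, if any."""
        return self.queue.popleft() if self.queue else None


class Adapter:
    def __init__(self, adapter_id):
        self.adapter_id = adapter_id
        self.outstanding = 0   # commands this adapter believes are pending
        self.halted = False

    def send(self, device, command):
        if self.halted:
            raise RuntimeError("adapter has halted sending work")
        device.submit(self.adapter_id, command)
        self.outstanding += 1

    def on_suspicious_message(self, device):
        """Halt new work and queue the marker behind existing commands."""
        self.halted = True
        device.submit(self.adapter_id, ORDERED_TUR)

    def on_completion(self, command):
        """Handle a completion message from the device."""
        if command == ORDERED_TUR:
            if self.outstanding > 0:
                # Marker completed but work is unaccounted for: fault.
                return "reset"
            self.halted = False
            return "resume"
        self.outstanding -= 1
        return "ok"
```

When every earlier command completes before the marker, `on_completion` returns `"resume"` and the adapter may send work again; if the marker completes while the adapter still counts work as outstanding, the sketch returns `"reset"`, corresponding to the fault case described above.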
In accordance with the preferred embodiment, the values of the first and second timeouts set by the first adapter will vary according to the type of command issued by the first adapter. Thus for a command which normally takes only a few seconds to complete, for example a Disk read command, the first and second timeouts will be relatively short. For a command that normally takes many seconds to complete, for example a device microcode download command, the first and second timeouts will be relatively large.
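One simple way to vary the timeouts by command type, offered only as a sketch, is a lookup table keyed on the command type. The command names and every numeric value below are hypothetical and are not taken from the specification.

```python
# (first_timeout, second_timeout) in seconds; all values are illustrative only.
TIMEOUTS_BY_COMMAND = {
    "disk_read": (2.0, 6.0),
    "disk_write": (2.0, 6.0),
    "microcode_download": (60.0, 180.0),
}

# Assumed fallback for command types not listed in the table.
DEFAULT_TIMEOUTS = (5.0, 15.0)


def timeouts_for(command_type):
    """Return the (first, second) timeout pair for a command type."""
    return TIMEOUTS_BY_COMMAND.get(command_type, DEFAULT_TIMEOUTS)
```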
A preferred feature of the present invention provides additional benefits in the case where the shared device is executing a command from a second adapter that takes a prolonged period to complete, for example a device microcode download. In this scenario, if the first adapter issues a command to the shared device during this prolonged period, then it is possible that the first and second timeouts at the first adapter will expire before the second adapter can provide a notification that its command is complete. To overcome this, when a command is issued from one of the plurality of adapters to the shared device that will cause an operation that takes longer than a predetermined value, the issuing adapter periodically sends false completion notifications to each of the other adapter(s) to prevent expiry of any second timeout value(s) at the other adapter(s).
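The periodic false completion notifications might be scheduled along the following lines. This is a minimal sketch under assumptions: the function name `keepalive_schedule`, its parameters, and the choice of a fixed interval are hypothetical, and the function only computes the offsets (in seconds from issue of the long-running command) at which notifications would be sent.

```python
def keepalive_schedule(expected_duration, threshold, interval):
    """Return offsets at which false completion notifications would be sent.

    No notifications are scheduled unless the command is expected to take
    longer than `threshold`. Offsets are multiples of `interval` that fall
    strictly before the expected end of the command.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    if expected_duration <= threshold:
        return []
    offsets = []
    n = 1
    while n * interval < expected_duration:
        offsets.append(n * interval)
        n += 1
    return offsets
```

For the notifications to be effective, the interval would need to be shorter than the extension each notification grants to the second timeout at the other adapter(s); that relationship is an assumption of this sketch.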
Viewed from another aspect, the present invention provides a method and apparatus for managing devices in a data processing system that includes a plurality of controllers connected for independent communication with at least one shared device, the at least one shared device maintaining a set, for example a command queue, of commands issued to it by the controllers. In response to a trigger event at a first of the controllers, the first controller sends a request to the other controller(s) to request the other controller(s) to provide information regarding work requested of the shared device by the other controller(s). The requesting controller employs the received information to build up a picture of the state of the command set at the shared device. This trigger event may be a notification from the shared device that the command queue is full, in which case it is advantageous for the controller to acquire information regarding the commands on the queue. The trigger event may instead be the expiration of a timeout (corresponding to the first timeout of the first aspect of the invention) that was set on issuance of an I/O command from the first controller to the shared device.
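As a rough illustration of how the requesting controller might combine the received information, the following sketch merges its own outstanding commands with the replies from the other controller(s). The function name `build_queue_picture`, the reply format (a mapping from controller identifier to a list of command names), and the summary returned are all assumptions made for this example.

```python
def build_queue_picture(own_id, own_commands, replies):
    """Merge this controller's commands with peer replies.

    `replies` maps each peer controller id to the list of commands that
    peer reports as outstanding at the shared device (assumed format).
    Returns the per-controller picture and the total command count.
    """
    picture = {own_id: list(own_commands)}
    for controller_id, commands in replies.items():
        if controller_id == own_id:
            raise ValueError("reply received from the requesting controller")
        picture[controller_id] = list(commands)
    total = sum(len(commands) for commands in picture.values())
    return picture, total
```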
A preferred embodiment of the invention will now be described, by way of example, with reference to the accompanying drawings.