The present invention relates to a storage subsystem, an I/O interface control method, and an information processing system.
A large-scale information system (mainframe) used in bank online systems and the like comprises a central processing unit and a peripheral storage unit. The peripheral storage unit, which comprises a storage control unit and storage units, is called a storage subsystem. Hereinafter, a brief description will be made of an interface between the storage subsystem and the mainframe.
For each I/O request, the following information is transmitted between the central processing unit and the storage control unit of the mainframe-oriented storage subsystem: (1) command, (2) command response, (3) command response acceptance, (4) data, (5) status, and the like. These items are transmitted in the form of frames to perform I/O request processing.
To execute an I/O request to a storage unit, the central processing unit creates a command group consisting of plural commands and data called a CCW chain. The central processing unit issues the first command of the command group to the storage control unit. Upon receiving the command, the storage control unit sends a command response frame to the central processing unit to indicate that a command frame has been received. In response to the command response frame, the central processing unit sends a command response acceptance frame to the storage control unit. At this moment, the central processing unit and the storage control unit both recognize that data sending and receiving has become possible, and subsequently, data sending and receiving is started between the central processing unit and the storage control unit. When data on the issued command has been sent or received, a status frame is sent from the storage control unit to the central processing unit to indicate an end status of the data transfer processing.
After receiving the status frame from the storage control unit, the central processing unit checks the contents of the status, and issues the next command if processing can continue. In this way, one CCW chain is processed command by command, with the central processing unit and the storage control unit interlocking on the command, command response, data transfer, and status sending for each command.
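The interlocked exchange described above can be illustrated by the following minimal sketch, which lists the frames exchanged for one CCW chain. The names used here (the function name, the "CPU->SCU" direction labels, and the "RD" command mnemonic) are assumptions introduced only for illustration and are not taken from any actual channel implementation.

```python
from typing import List, Tuple

# (direction, frame type, command name); all labels are illustrative assumptions.
Frame = Tuple[str, str, str]


def interlocked_exchange(ccw_chain: List[str]) -> List[Frame]:
    """Return the frame sequence for a CCW chain processed with per-command interlock."""
    frames: List[Frame] = []
    for command in ccw_chain:
        # 1. The central processing unit issues the command.
        frames.append(("CPU->SCU", "command", command))
        # 2. The storage control unit acknowledges receipt of the command frame.
        frames.append(("SCU->CPU", "command response", command))
        # 3. The central processing unit accepts the response; data transfer may start.
        frames.append(("CPU->SCU", "command response acceptance", command))
        # 4. Data: read data flows back to the CPU; write data and
        #    parameter data (e.g. for DX and LOC) flow to the storage control unit.
        direction = "SCU->CPU" if command.startswith("RD") else "CPU->SCU"
        frames.append((direction, "data", command))
        # 5. The storage control unit reports the end status; only then
        #    does the central processing unit issue the next command.
        frames.append(("SCU->CPU", "status", command))
    return frames
```

In this sketch each command costs five frames, and the next command is not issued until the status frame of the previous command has been appended.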
A CCW chain will be described in some detail. Commands constituting the CCW chain include: a Define Extent command (hereinafter referred to as a DX command) that specifies the legality of access to records, access mode, and the like; a Locate Record command (hereinafter referred to as a LOC command) that provides information for locating pertinent input-output data in a cylinder, track, and record; and read/write commands for specifying actual reading and writing.
One CCW chain consists of a chain of these plural commands. Upon receiving a LOC command, the storage control unit recognizes a cylinder, track, and record to be located from parameter data of the LOC command, and performs location processing.
The LOC command is followed by and chained to read/write commands. Processing of the read/write commands chained to the LOC command is performed for contiguous records beginning with the record located by the LOC command. A group of read/write commands thus following and chained to the LOC command is referred to as a LOC domain. A LOC domain number, that is, the number of read/write commands chained to a LOC command, is specified by a parameter of the LOC command.
In one CCW chain to execute an I/O request, if the next record to be processed is not contiguous to the record processed immediately before, processing cannot be performed within the same LOC domain and the next record to be processed must be located. In this case, the next record to be processed is located again by a LOC command. In this way, in processing for one CCW chain, when there is a read/write request for several discontinuous records, plural LOC domains will exist in the CCW chain.
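The way discontinuous records give rise to plural LOC domains can be sketched as follows. This is a simplified illustration only: the function name is hypothetical, and contiguity is assumed here to mean "same cylinder and track, next record number", which is a narrower rule than an actual storage control unit may apply.

```python
from typing import List, Optional, Tuple

Address = Tuple[int, int, int]  # (cylinder, track, record)


def build_loc_domains(records: List[Address]) -> List[Tuple[Address, int]]:
    """Group record addresses into (located address, domain count) pairs.

    Each pair stands for one LOC command followed by `count` chained
    read/write commands, i.e. one LOC domain.
    """
    domains: List[Tuple[Address, int]] = []
    previous: Optional[Address] = None
    for address in records:
        # Assumed contiguity rule: same cylinder and track, next record number.
        contiguous = (
            previous is not None
            and address[:2] == previous[:2]
            and address[2] == previous[2] + 1
        )
        if contiguous:
            start, count = domains[-1]
            domains[-1] = (start, count + 1)  # extend the current LOC domain
        else:
            domains.append((address, 1))  # a new LOC command is required
        previous = address
    return domains
```

For example, three contiguous records followed by one distant record yield two LOC domains, with counts of 3 and 1 respectively.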
Next, a description will be made of the operation of disconnecting a logical connection between the central processing unit and the storage control unit during the above described CCW chain execution.
When a read/write command is issued from the central processing unit to a storage unit under control of the storage control unit, if processing target data does not exist in a cache memory within the storage control unit, the data must be staged to the cache memory from the storage unit. In this case, the storage control unit cannot immediately execute the command. Therefore, the storage control unit sends the central processing unit a status requesting temporary disconnection of the logical connection between them, and disconnects the logical connection. Thereafter, as soon as the staging to the cache memory within the storage control unit is completed and preparations for I/O processing are complete, the storage control unit sends a connection interrupt request to the central processing unit to make a logical connection, and then makes a status report to indicate the resumption of I/O processing.
As described above, the storage control unit may, in some cases, disconnect a logical connection with the central processing unit because preparations for I/O processing are incomplete. Such disconnection factors include the following cases: (1) data does not exist in the cache memory within the storage control unit, so that the data must be staged to the cache memory from a storage unit; (2) space cannot be allocated in the cache memory within the storage control unit, so that processing must wait until free space becomes available in the cache memory; and (3) a resource for I/O processing cannot be acquired because it is busy, so that processing must wait until the resource is released from the busy condition.
If such a disconnection operation frequently occurs during execution of a CCW chain, the total response time of I/O request processing will increase.
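The disconnection and reconnection sequence for a cache miss may be sketched as below. The event strings, the function name, and the modeling of the cache memory as a set of track numbers are all assumptions made for illustration; only disconnection factor (1), a cache miss, is modeled.

```python
from typing import List, Set


def process_read(track: int, cache: Set[int]) -> List[str]:
    """Return the events seen on the channel for one read command."""
    events: List[str] = []
    if track not in cache:
        # Cache miss: the command cannot be executed immediately.
        events.append("status: disconnect")
        cache.add(track)  # staging from the storage unit to the cache memory
        events.append("staging complete")
        events.append("connection interrupt request")
        events.append("status: resume I/O")
    # The data is now in the cache memory, so the transfer can proceed.
    events.append("data transfer")
    events.append("status: end")
    return events
```

A first access to a track produces six events including the disconnection, whereas a repeated access to the same track produces only the data transfer and end status, which is why frequent misses lengthen the total response time.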
Next, one example of a technology for reducing an increase in response time due to cache misses will be described.
One I/O request pattern is sequential access to records within a storage unit, as typified by high-volume batch processing. In this case, a CCW chain consists of a DX command and a LOC command as described previously, and plural read or write commands chained to the LOC command, and is characterized by processing contiguous records and tracks. Since processing target records are contiguous, read/write commands can be processed continuously without having to switch between LOC domains.
As described previously, if read/write target records are cache misses, a logical disconnection between the central processing unit and the storage control unit occurs and response time increases. In the case of sequential access, however, the next cylinder, track, and record to be accessed can be predicted even before the next command of the CCW chain is received. By staging the data on the cylinder, track, and record that will be processed to the cache memory in advance, the number of disconnections between the central processing unit and the storage control unit can be reduced, and response time is expected to improve. Whether a CCW chain is sequential access can be determined by referring to the information indicating sequential access in the DX command.
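The effect of staging in advance during sequential access can be illustrated with the following idealized sketch. It assumes, for simplicity, that tracks are identified by a single integer, that advance staging always completes before the next command arrives, and that the cache memory never fills; the function name and the `prefetch_depth` parameter are hypothetical.

```python
from typing import Iterable, Set


def count_disconnects(
    tracks: Iterable[int], sequential_hint: bool, prefetch_depth: int = 2
) -> int:
    """Count cache-miss disconnections for a series of track accesses."""
    cache: Set[int] = set()
    disconnects = 0
    for track in tracks:
        if track not in cache:
            disconnects += 1  # miss: disconnect, stage, then reconnect
            cache.add(track)
        if sequential_hint:
            # The DX command indicated sequential access, so the tracks
            # that follow are staged before they are requested.
            for ahead in range(1, prefetch_depth + 1):
                cache.add(track + ahead)
    return disconnects
```

Under these assumptions, five consecutive tracks cause a single disconnection when the sequential indication is present and five when it is absent.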
On the other hand, another I/O request pattern is access to random records, as typified by access to a database. In the case of random access, since records to be accessed are distributed, before processing each record, location processing must be performed by a LOC command. Consequently, plural LOC domains exist in a CCW chain of random access. Unlike sequential access, with random access, since processing target records are not contiguous, the next record to be accessed cannot be predicted, so that data to be accessed cannot be staged in advance as it can be during sequential access. Consequently, it can be said that random access may have more chances of logical disconnection from the central processing unit due to cache misses than sequential access.
A technology for reducing the increase in response time due to cache misses has been described above. Next, a technology for increasing throughput will be described.
In recent years, the fiber channel protocol has attracted attention as a protocol for achieving high-volume transfer, remote data transfer, and the like. Although the fiber channel protocol has so far been used mainly in open systems, FC-SB2 (FIBRE CHANNEL Single-Byte Command Code Sets-2 Mapping Protocol), a protocol adhering to the physical layer (FC-PH) of the fiber channel protocol, has recently been proposed as a mainframe fiber channel protocol. FC-SB2 is the result of mapping a conventional communication protocol between a mainframe and a storage subsystem to FC-PH, and is currently being standardized by ANSI (American National Standards Institute). FC-SB2 has two major characteristics.
First, unlike conventional mainframe protocols, FC-SB2 does not occupy a logical connection path (hereinafter referred to as a logical path) between a central processing unit and a storage control unit during processing for one I/O request (one CCW chain), so that I/O requests for plural logical volumes can be executed at the same time on the same logical path. Second, the central processing unit can issue commands and data in a pipeline fashion without interlocking with the storage control unit. With FC-SB2, for example, when a WR command is issued, the central processing unit can send the data of the WR command to the storage control unit even if a command response to the WR command has not been sent from the storage control unit. Furthermore, the central processing unit can issue the next command and data even if a status frame for the command has not been received. Thus, the FC-SB2 protocol allows the central processing unit and the storage control unit to perform command processing asynchronously with each other.
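The benefit of pipelining over a long distance can be estimated with the following rough model. It is an illustrative assumption rather than anything defined by FC-SB2: only propagation delay is counted, processing and transmission times are ignored, and both function names are hypothetical.

```python
def interlocked_delay(n_commands: int, one_way_ms: int) -> int:
    """Propagation delay when every command is interlocked.

    Per command: command, command response, acceptance plus data, and
    status each wait for the preceding frame, giving four one-way trips.
    """
    return n_commands * 4 * one_way_ms


def pipelined_delay(n_commands: int, one_way_ms: int) -> int:
    """Propagation delay when commands and data are sent without waiting.

    All frames are streamed back to back, so only the trip out and the
    trip back for the final status are counted.
    """
    return 2 * one_way_ms if n_commands > 0 else 0
```

With a one-way delay of 5 ms and a chain of three commands, this model gives 60 ms with interlock and 10 ms with pipelining; the gap widens as the distance or the chain length grows.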
The FC-SB2 protocol having the above described characteristics is very effective in that it limits the reduction in system throughput under a high load over a long-distance connection.
In a central processing unit and a storage subsystem connected by a protocol such as the FC-SB2 protocol, which reduces interlock between the central processing unit and a storage control unit, reduction in throughput can be suppressed under a high load over a long-distance connection. However, as with other protocols, the storage control unit processes received commands in the order in which they were received. Therefore, when processing target data results in a cache miss in the middle of command processing, data transfer processing between the central processing unit and the storage control unit cannot be performed, as described previously, for the duration of the staging of the required data to a cache memory. This results in an increase in response time. Particularly in random access processing, as typified by database access, cache misses may increase because the preread staging used for sequential access cannot be performed. When a cache miss occurs in a command, a retry operation is performed for that command and the following commands, so that the commands and data received from the central processing unit after the command in which the cache miss occurred are temporarily discarded and must be received again from the central processing unit. Receiving the commands and data again incurs significant overhead during connection over a long distance.
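The overhead of the retry operation can be illustrated by counting how many commands the central processing unit must transmit in total. The sketch below rests on stated assumptions: the whole chain has already been sent in a pipeline fashion when a miss is detected, each listed command misses exactly once, and the command that missed together with all following commands is sent again. The function name is hypothetical.

```python
from typing import Iterable


def commands_sent_with_retry(n_commands: int, miss_indices: Iterable[int]) -> int:
    """Total number of command transmissions for one pipelined CCW chain.

    `miss_indices` holds the zero-based positions of commands whose
    target data is not in the cache memory.
    """
    sent = n_commands  # initial pipelined transmission of the whole chain
    for index in sorted(miss_indices):
        # The missed command and everything after it was discarded,
        # so that portion of the chain is transmitted again.
        sent += n_commands - index
    return sent
```

For a chain of six commands with a miss at the third command (index 2), ten transmissions are needed instead of six, and each retransmission again pays the long-distance transfer cost.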