Generally speaking, modern computers, particularly mainframe systems, are formed of a main storage, one or more central processing units (CPUs), operator facilities, a channel sub-system which includes an input/output processor (IOP) and various input/output (I/O) devices. The I/O devices are typified by direct access storage devices, tape drives, keyboards, printers, displays, and communications controllers and adapters.
Of particular relevance here, the channel sub-system controls and directs the flow of information between the main storage and typically each I/O device. Such a sub-system relieves each CPU in the computer system of a need to communicate directly with each I/O device, thereby permitting data processing to proceed concurrently with I/O processing. This increases the throughput of the entire computer.
The channel sub-system uses one or more so-called channel paths as a communications link to transfer and manage the flow of information to and from the I/O devices. As part of I/O processing, the channel sub-system tests for available channel paths, selecting an available path to employ in connection with a particular I/O device then to be used, and initiates the execution of an I/O operation over that path and through the device. The channel sub-system contains sub-channels, each of which is associated with one or more channel paths. One sub-channel is typically provided for and dedicated to each I/O device that is accessible through the channel sub-system. Each sub-channel stores information concerning the associated I/O device and its particular attachment to the channel sub-system. Each sub-channel also stores information concerning I/O operations and other functions involving its associated I/O device. Any of the information can be accessed either by the CPU(s) in the computer system through the use of I/O instructions or by the channel sub-system itself and serves to provide communication, with respect to the associated I/O device, between any such CPU and the channel sub-system. The actual number of channels that is provided in any computer system can vary widely and is based on the configuration of that system, i.e., the specific architecture of the system without regard to the I/O devices.
Each I/O device is attached, through an associated control unit, to the channel sub-system via a channel path. Each such control unit may be attached to more than one channel path, and an I/O device may be attached to more than one control unit. As such, a particular I/O device may be accessible to the channel sub-system over a number of different channel paths, with this number based on the configuration of the overall computer. For additional information on the channel sub-system and its functions, the reader is illustratively referred to Chapter 2, "Organization," of Enterprise Systems Architecture/390: Principles of Operation, Publication Number SA22-7201-04, Fifth Edition, June 1997 (copyright 1997 International Business Machines Corporation), which, for simplicity, will hereinafter be referred to as the "ESA/390 Manual."
To use a channel, a CPU issues a so-called channel program, which consists of channel command words (CCWs), for subsequent execution by the channel sub-system. For any sub-channel, each CCW specifies a command to be executed by the IOP over that sub-channel. For commands that initiate certain I/O operations, each of the associated CCWs designates an area in main storage that is to be utilized with each of these operations and an action that will be taken whenever a transfer to or from that area is completed, as well as other options. A channel program consists of one or more CCWs that are logically linked such that all of these CCWs are fetched by the channel sub-system and executed in the specific sequence specified by a CPU program.
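As a rough illustration of what a single CCW holds, the following Python sketch packs one into an 8-byte storage image. The layout (a one-byte command code, one byte of flags, a two-byte count and a four-byte address, stored big-endian), the numeric value used for the Write command and the bit position of the chain-command flag are assumptions made for this example; the helper names pack_ccw and unpack_ccw are hypothetical.

```python
import struct

# Assumed layout: command code (1 byte), flags (1 byte),
# byte count (2 bytes), data address (4 bytes), all big-endian.
CCW_LAYOUT = struct.Struct(">BBHI")

CMD_WRITE = 0x01   # assumed command code value for a Write
FLAG_CC = 0x40     # assumed bit position of the chain-command flag


def pack_ccw(command, flags, count, address):
    """Return the 8-byte storage image of one CCW."""
    return CCW_LAYOUT.pack(command, flags, count, address)


def unpack_ccw(image):
    """Split an 8-byte CCW image back into (command, flags, count, address)."""
    return CCW_LAYOUT.unpack(image)


# A Write of 80 bytes from a hypothetical buffer at 0x00020000,
# command chained to whatever CCW follows it in storage.
ccw = pack_ccw(CMD_WRITE, FLAG_CC, 80, 0x00020000)
```

Here the address field names the main storage area to be used, and the flags field carries the chaining action to be taken once the transfer completes.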
Contiguous CCWs are linked by the use of chain-data or chain-command flags, and non-contiguous CCWs may be linked by a CCW specifying a "transfer-in-channel" (TIC) command. A CCW becomes current when: (a) it is the first CCW of a channel program and has been fetched, (b) during command chaining, that CCW is logically fetched, or (c) during data chaining, that CCW takes over control of an I/O operation. Many I/O devices expect channel programs to end with a certain CCW or sequence of CCWs. The most common ending for a channel program is a no-operation (NOP) CCW whose command chaining flag is off, indicating that it is the last CCW. If a NOP CCW's command chaining flag is on, channel program execution continues with the next contiguous CCW in memory.
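The fetch sequence just described can be sketched as a small interpreter. This is a simplified model rather than the channel sub-system's actual logic: storage is a Python dictionary keyed by address, data chaining is omitted, and the names run_channel_program, CCW_SIZE and the flag value CC are illustrative assumptions.

```python
CCW_SIZE = 8                      # assumed size of one CCW in bytes
WRITE, NOP, TIC = "WRITE", "NOP", "TIC"
CC = 0x40                         # assumed chain-command flag bit


def run_channel_program(storage, start):
    """Walk a channel program and return the (address, command) pairs executed.

    storage maps an address to a (command, flags, target) tuple.
    """
    trace = []
    addr = start
    while True:
        command, flags, target = storage[addr]
        if command == TIC:
            # Transfer-in-channel: continue fetching at a non-contiguous CCW.
            addr = target
            continue
        trace.append((addr, command))
        if flags & CC:
            # Command chaining on: the next contiguous CCW becomes current.
            addr += CCW_SIZE
        else:
            # Chaining off: this was the last CCW of the channel program.
            return trace


# A Write chained to a TIC that points at a terminating NOP.
storage = {
    0x100: (WRITE, CC, 0),
    0x108: (TIC, 0, 0x200),
    0x200: (NOP, 0, 0),
}
trace = run_channel_program(storage, 0x100)
```

In this run the TIC itself performs no I/O; it only redirects the fetch, so the trace holds the Write followed by the terminating NOP.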
During CPU program execution on certain computer systems, such as those that employ ESA/370 or series 9000 architecture, channel programs may be dynamically extended by a CPU in order to undertake additional I/O operations, as required by the CPU program. This advantageously permits further information to be transferred between main storage and an I/O device then in use without a need to restart the sub-channel each time. In that regard, the CPU program will illustratively build an "initial" channel program, i.e., containing an initial CCW, to transfer an initial record from main storage onto a network adapter known as Common Link Access to Workstation (CLAW). Once this particular operation is underway and the CPU program has executed further, the CPU program may require more records to be transferred to or from the CLAW adapter. To send subsequent records, the CPU will likely append, through a well-known technique called "command chaining," one or more additional CCWs to the channel program in order to transfer the second record, and so forth for each successive record. This enables the sending of the records without the necessity of the CPU, specifically an I/O supervisor within the operating system, having to repetitively issue a separate start sub-channel (SSCH) command for each of these records and re-establish the associated channel path, thereby saving channel execution time and providing increased channel throughput.
Hence, through command chaining, the last CCW in a channel program executing at the time will be modified to point to the next successive CCW, and so forth, in order to chain all the successive CCWs together into a single channel program for the associated I/O device. Furthermore, as part of the process that modifies the channel program and to conserve storage, the CPU may also release locations in the main storage associated with previously used and now obsolete segments of this channel program.
Command chaining is particularly effective when the CPU can outrun the I/O Processor (IOP). When the IOP completes a channel program, it issues an interrupt to the CPU program, causing overhead to the CPU program and requiring that the IOP be restarted to perform a subsequent I/O operation. If the CPU can continuously extend an executing channel program such that the IOP seldom or never reaches the end of a channel program, interrupts and restarts are greatly reduced.
The CPU and the IOP operate independently of each other with no synchronization between themselves, except for interrupts that the IOP may issue to the CPU. This can cause problems when a CPU attempts to extend a running channel program. FIGS. 1A and 1B depict a simple illustration of a channel program to be extended. Initially, the channel program resides in Buffer A 100, and consists of a Write CCW 101 chained to a TIC CCW 103 which points to a NOP CCW 104. A NOP CCW which is not chained to another command ends a channel program. As this channel program is executing, the CPU builds additional CCWs in Buffer B 115. This new channel program is built similarly to the one in Buffer A, and when it is built, the CPU will modify the TIC 103 in Buffer A 100 to point to the beginning of the additional CCWs 109. Three results can occur from this operation. If the CPU does not complete this process before the IOP fetches the TIC 103 of the initial channel program, the IOP will fetch the terminating NOP CCW 104 and end its operation, causing it to issue an interrupt to the CPU, and the CPU will have to restart the IOP to cause the CCWs in Buffer B to be executed. However, if the CPU makes the change in the TIC 103 before the IOP reaches that point, the IOP will fetch the first CCW of the additional CCWs 109 and continue executing the new CCWs without interruption, which is the desired result.
The third, undesirable result is that the IOP is reading the TIC CCW at the same time that the CPU is modifying its address field 261 to point to the additional CCWs. This may result in the IOP reading an invalid address out of the TIC, as the word containing the address is in an indeterminate state while it is being updated by the CPU. It may contain the old address, the new address, or, most undesirably, some random combination of bytes from each address. This random combination result would cause the IOP to attempt to execute random storage, which may not contain CCWs, causing the IOP to program-check, resulting in an undesirable interruption of the CPU and termination of the I/O processing. While this result may seem rare, in today's high-performance computer systems in which thousands of I/O operations may be performed per second, it can happen unacceptably often.
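To make the third result concrete, the sketch below enumerates every value the IOP could observe in a four-byte address field while it is being rewritten, assuming, purely for illustration, that each byte may independently hold either its old or its new content at the moment of the read. The two addresses and the function names are hypothetical; how real hardware tears a store depends on the machine.

```python
from itertools import product


def observable_addresses(old, new):
    """All 4-byte values formed by taking each byte from old or from new."""
    old_bytes = old.to_bytes(4, "big")
    new_bytes = new.to_bytes(4, "big")
    seen = set()
    for choice in product((0, 1), repeat=4):
        mixed = bytes(
            new_bytes[i] if pick else old_bytes[i]
            for i, pick in enumerate(choice)
        )
        seen.add(int.from_bytes(mixed, "big"))
    return seen


def torn_addresses(old, new):
    """Values that are neither the old nor the new address."""
    return observable_addresses(old, new) - {old, new}


OLD_TARGET = 0x00001018   # hypothetical address of the terminating NOP
NEW_TARGET = 0x00002000   # hypothetical address of the additional CCWs
torn = sorted(torn_addresses(OLD_TARGET, NEW_TARGET))
```

With these example addresses, two of the four distinct observable values point at neither the NOP nor the additional CCWs; a TIC to either of them would send the IOP to storage that need not contain CCWs at all.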
A second prior art method of extending channel programs which attempts to reduce the window of error is illustrated in FIGS. 2A and 2B. In this example, the CPU builds an initial channel program in Buffer A 200 ending with two NOP CCWs, with the first NOP CCW 206 command chained to the second one 207. When the channel program needs to be extended, the CPU writes the additional CCWs into Buffer B 202. The additional CCWs end with a structure (215,216) similar to the initial channel program (206,207), so that they may be extended in a similar manner if necessary. After the additional CCWs are completed, the CPU modifies the second NOP CCW in the original channel program in Buffer A to change it into a TIC CCW 208, which transfers control to the first of the subsequent CCWs 213.
This second method improves on the first method in two ways. First, this second method reduces the probability that the IOP will complete processing before the CPU can update the CCW to cause command chaining to occur, as the final CCW is being updated instead of the penultimate one as is done in the first method.
Second, the window in which the IOP may read an invalid CCW is reduced. FIG. 2C shows illustrative CCW formats. A generic CCW format 250 is shown with the format of a NOP 251 and a TIC 252 CCW. For these formats, if the process of converting the NOP CCW into a TIC CCW is performed in separate, ordered steps, the invalid address window is eliminated and replaced with a smaller invalid CCW window. First, the CPU replaces the empty address field 258 in the NOP CCW with the address 261 of the beginning of the additional CCWs. Since the command code 253 has not yet been changed from NOP to TIC, this address will be ignored by the IOP if the IOP reaches this CCW while the CPU is modifying the address. After the address modification is complete, the CPU modifies the flags field 254 to the values 260 appropriate for the TIC CCW being built. Finally, the CPU modifies the command code 253 to indicate that the CCW is a TIC 259. There still exists a window in which the IOP may read the CCW with invalid contents if the IOP reads the CCW as the CPU is updating these fields. If the command code has not yet been changed from NOP to TIC, then invalid contents in the flags field will not cause a problem, as this field is ignored. If the command code is in the process of being modified, the IOP may read an invalid command code and program-check. However, as the command code field contains fewer bytes than the address field, this window is smaller than the window in which the address may be updated using the first method, so the probability of error, while still present, is smaller. On machines that employ IBM's ES/9000 or 390 architecture, the address field would contain four bytes and the command code field would contain one byte, so the invalid CCW window is reduced to one-fourth its former size. However, this probability of error is still unacceptably high, as is the probability that the IOP will complete processing before the channel program is extended, causing undesirable interrupts.
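The ordering of the stores in this second method can be sketched as follows. The sketch assumes an 8-byte CCW with a one-byte command code, one byte of flags, a two-byte count and a four-byte address, assumes the command code values shown, and treats each store as indivisible, so it shows only the states between stores and does not model the residual window while the command code byte itself is being written. The names convert_nop_to_tic and iop_view, and the flag values used, are hypothetical.

```python
import struct

NOP, TIC = 0x03, 0x08          # assumed command code values
CCW = struct.Struct(">BBHI")   # command, flags, count, address (assumed layout)


def convert_nop_to_tic(ccw, target, tic_flags=0x00):
    """Rewrite a NOP CCW in place, yielding a snapshot after each CPU store."""
    ccw[4:8] = target.to_bytes(4, "big")   # address first: ignored while still a NOP
    yield bytes(ccw)
    ccw[1] = tic_flags                     # then the flags
    yield bytes(ccw)
    ccw[0] = TIC                           # command code last
    yield bytes(ccw)


def iop_view(image):
    """How the IOP would interpret a snapshot of the CCW in this model."""
    command, _flags, _count, address = CCW.unpack(image)
    if command == NOP:
        return ("NOP", None)               # address and flags disregarded
    if command == TIC:
        return ("TIC", address)
    return ("PROGRAM-CHECK", None)


# Final NOP of the initial channel program (flag and count values are illustrative).
final_nop = bytearray(CCW.pack(NOP, 0x20, 1, 0))
views = [iop_view(snapshot) for snapshot in convert_nop_to_tic(final_nop, 0x00002000)]

# Relative size of the remaining window: command code bytes per address byte.
window_ratio = struct.calcsize(">B") / struct.calcsize(">I")
```

Between stores, the IOP in this model sees either a NOP or a complete TIC to the new address, never a TIC carrying a partly written address, which is why the only exposure left is the update of the command code itself.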