1. Field of the Invention
The invention relates generally to managing I/O in data processing systems. More particularly, the invention relates to dynamically managing I/O connectivity from the host processor level in a mainframe computer system.
2. Description of the Related Art
Prior art systems for communicating between one or more host operating systems, running on all or part of a central processing complex (CPC), and a set of peripheral devices via a channel subsystem (CSS) are well known. The term "device", as used herein, is meant to include such components, whether directly addressable or not, as control units, peripheral cache memories, communications apparatus, data storage units such as direct access storage devices (DASD), tape recorders and the like.
The aforementioned systems typically use one or more control units (CUs) to control data transfer on the paths between a given CPC and a particular peripheral device. Various I/O management schemes have been developed for use in these systems to deal with I/O subsystem maintenance and reconfiguration. Users can enter maintenance and reconfiguration requests at a device CU console. This type of management scheme is referred to hereinafter as a control unit-based I/O management scheme.
An example of a data processing system that utilizes control unit-based I/O management techniques is described in copending patent application Ser. No. 251,969, filed Sep. 26, 1988, assigned to the same assignee as the present invention. Patent application Ser. No. 251,969 is hereby incorporated by reference.
Methods and apparatus are described in the referenced application for automatically reconfiguring a data processing system during operation such that devices can be removed from the system during maintenance with the system automatically reconfiguring itself to continue operations.
According to the teachings in the referenced application, a manually presented definition of the various paths between each CPC and each device is entered into the host system and channel subsystem when the data processing system is initialized. Configuration tables, maintained at both the host and channel subsystem level, contain data defining and identifying the channels, switches (if used), CUs and various devices. The relationships among these elements, as recorded in these tables, effectively define all the I/O paths between each CPC and each peripheral device.
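As a rough illustration of how such tables can define the I/O paths, the following Python sketch enumerates paths from three small tables. The table layout, the identifiers (CPC_A, CH01, SW1, CU10, DASD1 and so on) and the function names are assumptions made for illustration only; they are not taken from the referenced application.

```python
# Hypothetical configuration tables; the layout and all names are illustrative.
CHANNELS = {"CPC_A": ["CH01", "CH02"], "CPC_B": ["CH03"]}

# channel -> list of (switch or None, control unit) attachments
LINKS = {
    "CH01": [("SW1", "CU10")],
    "CH02": [(None, "CU20")],          # None means no switch in the path
    "CH03": [("SW1", "CU10"), ("SW1", "CU20")],
}

# control unit -> devices behind it
DEVICES = {"CU10": ["DASD1", "DASD2"], "CU20": ["TAPE1"]}


def enumerate_paths(channels, links, devices):
    """Return every (cpc, channel, switch, cu, device) tuple the tables imply."""
    paths = []
    for cpc, chans in channels.items():
        for ch in chans:
            for switch, cu in links.get(ch, []):
                for dev in devices.get(cu, []):
                    paths.append((cpc, ch, switch, cu, dev))
    return paths


def paths_to(device, paths):
    """Select the paths that end at the given device."""
    return [p for p in paths if p[4] == device]


if __name__ == "__main__":
    all_paths = enumerate_paths(CHANNELS, LINKS, DEVICES)
    for path in paths_to("TAPE1", all_paths):
        print(path)
```

In this sketch the path set is derived entirely from the relationships stored in the tables, so an error in any manually entered table entry propagates directly into the set of paths the system believes it has.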
Sometime after initialization a registration process takes place wherein each host sends information to each device CU informing the CU that it (the CPC) is a device user. It should be noted, for later reference and comparison with the present invention, that the device CU in the referenced application is not cognizant of any other devices coupled to a particular CPC.
After the registration process is complete and before any device is taken off line, the CPCs to be affected need to be notified, by the particular device CU(s) involved, of pending configuration changes. In effect, the referenced application uses a "bottom up" (from a system hierarchy point of view) notification scheme for CUs to notify CPCs of quiesce requests via the paths between a device and any affected channels.
It should also be noted that the quiescing scheme taught in the referenced application is host driven, i.e., a device CU waits for registration information from the CPCs to which it is ultimately attached. If the registration information is never supplied, e.g., if the device starts offline and is later to be brought online (or fails), the CU does not know of any attachment to the host. Such a failure would prevent the scheme taught in the referenced application from working properly since the computer I/O configuration must be identified to the hardware and software that will use it.
Often, the identification of a system's I/O configuration is complex. That is, defining it is difficult and requires considerable advance planning. The difficulty in definition is largely due to the complexity of I/O in large installations where there are multiple processors and applications that need to concurrently share or potentially access I/O resources. In defining the I/O configuration many factors must be taken into account. Logical constraints, such as which I/O units must be accessed, the speed of the I/O units, the overhead of shared I/O protocols and interlocks, and the number of logical sessions an I/O unit can support are examples of such factors. Also, physical system constraints must be considered, such as distance from the processor, accessibility (by cables or maintenance personnel), and weight distribution of the I/O units on the floor.
Since correctly defining the I/O configuration takes such effort, other products that require an I/O definition compound that complexity. There may be only one, primary, I/O definition. That definition must be altered when either the physical configuration, or the logical requirements on the I/O are changed. If there are other, secondary, I/O definitions for use by other products which must be updated in coordination with the primary, then the task of altering the I/O definition requires more effort, is far more prone to error, and requires more planning for a coordinated execution of the change.
Accordingly, it would be desirable to be able to query the channel subsystem from the host level of a data processing system, to dynamically determine the entire I/O connectivity attached to a processor. Such information would enable the host to create a host view (top-down view) of all paths from the host to an attached set of devices. In addition to providing a centralized view of the particular CUs attached to a given processor, a map of neighboring devices under the influence of the given processor would automatically be at hand. Such information would be useful for other I/O management tasks, such as dynamically managing connectivity, analyzing data traffic, scheduling jobs for I/O recovery, etc.
Furthermore, the ability to dynamically create an I/O connectivity database (i.e., to create a current correct system I/O configuration without additional user intervention) at the host level would eliminate the problems that would occur in a host driven registration system if a particular CU should fail to be informed of its attachment to the host. The host could automatically create the latest I/O map in real time.
As data processing needs of system users grow, the number of peripheral devices connected to and supported by data processing systems also grows. Multiple data processing applications requiring a plurality of various peripheral devices increase systemwide connectivity requirements. As a result the number of connections (and ultimately paths) to be identified, remembered and managed increases. The ability of each CU to store and process all the required data to notify multiple host processors possibly affected by configuration changes, etc., is more limited in terms of resources than the ability of each host processor to deal with I/O management.
Accordingly, it would also be desirable if instead of a device control unit-based I/O management scheme, a centralized host-based I/O management scheme could be devised to dynamically manage connectivity from any host. This would eliminate having to enter I/O management requests at the CU level particularly when the devices become numerous and widely distributed. Furthermore, a centralized host-based management scheme would eliminate the need to coordinate operator action at a system console with actions of maintenance personnel at the control units or devices.
Adding still further to the complexity of managing I/O in present day computer systems is the use of switches in the data paths between the channel subsystem and peripheral devices. The use of switches further increases device connectivity capability and flexibility by increasing the number of logically available connections while at the same time reducing the number of physical connections required. However, with this increased capability and flexibility the task of I/O management increases as the number of devices that can be connected to a CPC goes up and the number of CPCs that can be connected to a device increases.
Furthermore, space and other physical plant limitations often dictate that the peripheral devices be located farther and farther away from the host computing resources, making a centralized I/O management scheme even more important. Whereas prior art data processing systems needed to keep peripherals within a range of approximately 400 feet from the CPC on account of constraints related to the length of connecting electrical cables, the use of state of the art fiber optic data links has extended the range of where peripherals can be located to over a mile from the CPC.
All of the factors stated hereinabove make it even more desirable, if not essential, to be able to initiate and control I/O connectivity from a central point, preferably at the host level in the data processing system.
Given the need and desirability of performing centralized dynamic I/O connectivity management, new problems need to be addressed before the centralized management function can be performed with integrity.
In a computer I/O configuration where connectivity options are increased using switches (or even nested layers of switches), switchable I/O resources may be inadvertently removed from physical connectivity with a system or program which requires them. Such a loss of I/O resources may cause the program or system to lose its data and/or functional integrity, causing it to fail.
There are no known switching products that integrate physical switching operations with the systems' logical view of the I/O connectivity. Due to larger I/O configurations with more shared devices, more complex systems, and more automated operations environments, the manual effort required to coordinate existing switching systems with systems operations is more intensive and less effective. Current switching systems do not provide the ability to protect systems from accidental outages. There is a need for switching systems to provide for greater integration of switching functions within the systems where they operate in order to reduce error-prone, manual and/or redundant efforts.
System Integrated Switching, as the term is used hereinafter, is a means by which logical availability changes can be made in order to reflect physical connectivity changes. A path is logically available as long as the operating system, the subsystem that controls I/O, or other program indicates that when performing I/O requests to a device, the specific path to the device may be used for that I/O. A path is physically connected as long as there are means to perform the I/O operation.
It would be desirable if logical availability changes were made in such a way that a system could preclude the physical change when that system would be adversely affected by the physical change. Roughly, a component of the system (e.g., any host processor in a multiprocessor environment) can state, "No, don't make this change, it will remove something I need." With System Integrated Switching, a computer system complex would have the ability to maintain its data and/or functional integrity by prohibiting physical changes to its required I/O paths.
In addition, such a system could make use of resources as soon as they are physically connected without additional operator effort since physical I/O connectivity is automatically reflected in the system's view of logical availability.
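The behavior described above can be sketched in Python as follows. This is a minimal single-process model under stated assumptions: the class names (`Host`, `IntegratedSwitch`), the method names, and the representation of a path as a plain string are all hypothetical and are not part of any existing switching product or of the system described herein.

```python
class Host:
    """A system that tracks which paths it requires and may logically use."""

    def __init__(self, name, required_paths=()):
        self.name = name
        self.required = set(required_paths)   # paths the host cannot lose
        self.logically_available = set()      # paths the host may use for I/O

    def approves_removal(self, path):
        # The host objects when the change would remove something it needs.
        return path not in self.required


class IntegratedSwitch:
    """Couples physical connectivity changes to each host's logical view."""

    def __init__(self, hosts):
        self.hosts = list(hosts)
        self.connected = set()                # physically connected paths

    def connect(self, path):
        self.connected.add(path)
        # Physical connectivity is reflected in logical availability
        # without additional operator effort.
        for host in self.hosts:
            host.logically_available.add(path)

    def disconnect(self, path):
        objectors = [h.name for h in self.hosts if not h.approves_removal(path)]
        if objectors:
            return False, objectors           # physical change is precluded
        # Logical availability is withdrawn first, then the physical change.
        for host in self.hosts:
            host.logically_available.discard(path)
        self.connected.discard(path)
        return True, []


if __name__ == "__main__":
    a = Host("A", required_paths={"P1"})
    b = Host("B")
    sw = IntegratedSwitch([a, b])
    sw.connect("P1")
    sw.connect("P2")
    print(sw.disconnect("P1"))
    print(sw.disconnect("P2"))
```

In this model a single objecting host is sufficient to block the physical change, which mirrors the "No, don't make this change" behavior; a real multi-host system would additionally need a way to collect the votes across processors.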
A centrally operated I/O connectivity management system that includes both a System Integrated Switching capability and the ability to dynamically create an I/O connectivity database is presently unknown.
In order to implement such a system, it is necessary to be able to provide direct host access to switches (and their associated switch controllers, which are the hardware components that control the state of the switches) that, according to the prior art, are "transparent" (as defined hereinafter) to normal system operation.
Dynamic switches are defined herein as switches which operate by making connections when they are needed, and which break connections when they are no longer needed. Connect and disconnect delimiters are utilized to operate such switches in a manner that is transparent to programming. Techniques for operating switches in this manner are described in a copending Patent Application, entitled "SWITCH AND ITS PROTOCOL FOR MAKING DYNAMIC CONNECTIONS", (IBM docket number P09-88-011), filed Oct. 30, 1989, in the name of P. J. Brown, et al, hereby incorporated by reference.
In commercially available computer systems, host processors (and if more than one operating system on a processor, each operating system) can "see" switches as they "see" other devices (e.g., disks, etc.), however the hosts are not cognizant of switches as switches, nor are the hosts cognizant of which switches lie in any given path to another device. Hence the switches are "transparent" to a host.
In known systems, a host can communicate with a switch controller on a separate link not going through the switch. It is also known that a host can communicate with a switch controller indirectly via the switch and a control unit located outside the switch. This lack of direct access to the switch controller (via the switch itself) limits the host's ability to control and manage the switch as an I/O path component. This is particularly true in a multiprocessor environment where coherency needs to be maintained across the system when, for example, a switch (or several of its ports) is taken out of service.
In order to implement a centralized dynamic I/O connectivity management system, it is necessary for the host processors to be cognizant of the switches as switches, to know the paths in which the switches lie, and to have the aforementioned direct access to switch controllers.
Means for identifying the existence of switches coupled to a channel, means for identifying the address of where channels are attached to switches, and means for querying the channel subsystem to collect switch existence and address information are used in this application and are described in detail hereinbelow.
It would be desirable to be able to provide for the aforesaid direct host access capability, in combination with the various means to make hosts cognizant of switches as switches, etc., to be able to effect the type of centralized control, management and coherency required to implement a dynamic I/O connectivity manager. Such a combination would facilitate remote control of switch functions, would enable the recording of switch error status at the host level, facilitate switch recovery, identify resources in an I/O configuration, and provide a means for interprocessor communications via each switch control unit/device that is coupled to more than one processor. Accordingly, it would be desirable to be able to provide the aforementioned switch access feature in a centrally operated I/O connectivity management system.
Finally, a user of a distributed application (i.e., an application that has peers running on separate computers, e.g., the dynamic I/O connectivity management system contemplated herein) needs to be assured that a command or set of commands is performed to completion before another user can issue a command or set of commands.
The application needs to let only one user access the application at any single point in time, and reject all other users until the first user has completed its task (of one or more commands). Rather than a manual, procedural method to communicate between multiple users to keep their efforts synchronized, it would be desirable if the application could assume the overhead and responsibility of assuring that the state of the application environment is controlled by only one user at a time. This feature (referred to hereinafter as a Floating Master Interlock) would enable centralized dynamic I/O connectivity management systems to run concurrently on a plurality of host processors and ensure the integrity of critical data, for example, data stored at each of the aforementioned dynamic switch control units, etc.
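The one-user-at-a-time requirement can be illustrated with the following Python sketch. It models only the reject-rather-than-wait behavior within a single process. The class name `Interlock`, its methods, and the user names are hypothetical, and the sketch does not address how ownership would be coordinated among peers running on separate host processors.

```python
import threading


class Interlock:
    """Grants control to one user at a time and rejects all others."""

    def __init__(self):
        self._guard = threading.Lock()  # makes check-and-set atomic
        self._holder = None             # user currently in control, if any

    def acquire(self, user):
        """Return True if the user now holds control, False if rejected."""
        with self._guard:
            if self._holder is None or self._holder == user:
                self._holder = user
                return True
            return False

    def release(self, user):
        """Give up control; only the current holder may release."""
        with self._guard:
            if self._holder == user:
                self._holder = None
                return True
            return False

    def run(self, user, command):
        """Execute a command only on behalf of the current holder."""
        with self._guard:
            if self._holder != user:
                raise PermissionError(f"{user} does not hold the interlock")
        return command()


if __name__ == "__main__":
    lock = Interlock()
    print(lock.acquire("operator1"))   # first user obtains control
    print(lock.acquire("operator2"))   # second user is rejected, not queued
    lock.run("operator1", lambda: None)
    lock.release("operator1")
    print(lock.acquire("operator2"))   # control is free again
```

The holder keeps control across any number of commands until it releases, so a multi-command task is completed before another user can begin.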