1. Field of the Invention
The invention relates to networked computer systems. Specifically, the invention relates to apparatus, systems, and methods for facilitating port testing of a multi-port host adapter in a computer system.
2. Description of the Related Art
Computer and information technology continues to progress and grow in capability and complexity. In particular, networking hardware and software have evolved from dedicated single-port communication to multi-port communication between two networked devices. The additional ports provide higher throughput, greater reliability, and failover protection in the event one of the ports fails or causes communication errors. Similarly, two networked computer systems may include multiple network interface cards, referred to herein as network host adapters, to provide added reliability, throughput, and service in network communications.
FIG. 1 illustrates a system 100 suitable for implementing the present invention to facilitate port testing of host adapters and particular ports of multi-port adapters. The system 100 includes a host 102 connected to a storage subsystem 104 by a network 106 such as a Storage Area Network (SAN) 106. The host 102 communicates control commands and data, in the form of Input/Output (I/O) communications, to the storage subsystem 104. Hosts 102 are well known in the art and comprise any computer system configured to communicate control commands or data to the storage subsystem 104.
Similarly, the storage subsystem 104 is well known and comprises any computer system capable of responding to control commands and I/O communications from hosts 102. One example of a storage subsystem 104 suitable for use with the present invention is an IBM Enterprise Storage Server® available from International Business Machines Corporation (IBM) of Armonk, N.Y. The SAN 106 represents a network dedicated to communications relating to transfer and control of data between hosts 102 and storage subsystems 104. However, the present invention may be implemented with any network 106, a SAN being but one example.
Communications between the host 102 and storage subsystem 104 may be conducted using various common protocols and the hardware and software that support them. For example, the storage subsystem 104 may include host adapters 108 configured to support the Fibre Channel optical communication protocol. Of course, various other host adapters 108 may be used to support other protocols including, but not limited to, Internet Small Computer System Interface (iSCSI), Fibre Channel over IP (FCIP), Enterprise Systems Connection (ESCON), InfiniBand, and Ethernet.
Generally, to provide enhanced reliability and enhanced performance throughput, the storage subsystem 104 includes a plurality of host adapters 108, one or more processors 110, an electronic memory device 112, and one or more electronic storage devices 114. The processors 110 process control commands and I/O communications. The electronic memory device 112 provides command and control storage for the processors 110. The electronic storage devices 114 provide persistent storage of data and may comprise storage arrays for increased data storage capacity and reliability.
Generally, it is desirable that the storage subsystem 104 provide reliable operations on a 24/7 schedule. Consequently, the natural errors and failures associated with the hardware and software of the storage subsystem 104 should cause minimal disruption to normal operations. Unfortunately, repair, maintenance, and troubleshooting of errors in conventional storage subsystems 104 can introduce significant delays and severely impact performance of the subsystem 104.
In particular, significant time and productivity of the storage subsystem 104 can be lost in troubleshooting errors. Generally, troubleshooting includes running one or more test routines against different hardware and software modules to isolate the error. Typically, the storage subsystem 104 remains on-line and operational to service host requests during troubleshooting. Conventionally, each host adapter 108 is taken off-line, tested using the test routine and placed back on-line if the adapter 108 passes the test routine. In this manner, host adapters 108 not being tested can continue to service I/O communications.
Sequentially applying the test routine to each adapter 108 in turn can take significant time, especially for test routines that require a technician to attach and remove test equipment. One example of such a test routine is a wrap test (also referred to as a loopback test or loop test). A wrap test verifies that a transmitter and a receiver of the adapter are properly sending and receiving data through a particular port. Test data is transmitted out the port and then passed back into the same port by a piece of wrap test equipment, typically a cable suitable for the communication hardware.
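By way of illustration only, the pass/fail logic of a wrap test can be sketched in software as follows. The class `LoopbackPort`, its `faulty` flag, and the function `wrap_test` are hypothetical names introduced for this sketch; an actual wrap test drives adapter hardware through a physical wrap cable, which the sketch merely simulates.

```python
class LoopbackPort:
    """Hypothetical port whose transmitter is wrapped back to its own receiver."""

    def __init__(self, faulty: bool = False):
        self.faulty = faulty   # assumption: a faulty port corrupts outgoing data
        self._wire = b""       # stands in for the wrap cable

    def transmit(self, data: bytes) -> None:
        # A faulty transmitter inverts the first byte of the test data.
        if self.faulty and data:
            data = bytes([data[0] ^ 0xFF]) + data[1:]
        self._wire = data

    def receive(self) -> bytes:
        # Whatever was placed on the wrap cable comes back into the same port.
        data, self._wire = self._wire, b""
        return data


def wrap_test(port: LoopbackPort, pattern: bytes = bytes(range(256))) -> bool:
    """Send a known pattern out the port and check that it returns unchanged."""
    port.transmit(pattern)
    return port.receive() == pattern


if __name__ == "__main__":
    print(wrap_test(LoopbackPort()))             # True
    print(wrap_test(LoopbackPort(faulty=True)))  # False
```

The comparison of transmitted and received data is the only part of the sketch taken from the description above; the fault model is an assumption chosen to make the failing case visible.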
Conventionally, to perform a wrap test, the whole host adapter 108 is taken off-line. This becomes problematic when the host adapter 108 includes multiple ports. Error-free ports are needlessly taken off-line. Furthermore, using conventional test routines, the ports are tested one at a time in sequence. Typically, a technician attaches a wrap test cable to a port to be tested, initiates the port wrap test routine, waits for the test routine to complete, records the results, and then connects the wrap test cable to the next port. Any failed ports are identified. Testing of a single port may take as little as two minutes. However, all the ports are off-line until the last port is tested. Performing a wrap test on a single adapter may take as long as ten minutes. This delay can severely impact the performance of the storage subsystem 104.
Furthermore, the wrap test routine is designed for single port adapters. Consequently, the wrap test preserves the current state of the adapter 108, takes the adapter 108 off-line, and then restores the state to put the adapter 108 back on-line. This means that as multiple ports are tested on a single multi-port host adapter 108, the delay to maintain state information is multiplied by the number of ports tested. Serial testing of ports on a multi-port host adapter 108 significantly increases the testing time and down time of the adapter.
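The multiplication of the state-maintenance delay can be made concrete with a short calculation. The sketch below is illustrative only: the function names, the port count of four, and the 30 second save-and-restore delay are assumptions chosen for the example, while the two minute per-port test time follows the figure given above.

```python
def serial_adapter_downtime(ports: int, test_s: int, state_s: int) -> int:
    """Seconds the adapter is off-line when every port is tested in sequence
    and the adapter state is saved and restored once per port."""
    return ports * (state_s + test_s)


def state_overhead(ports: int, state_s: int) -> int:
    """Portion of the downtime spent only on saving and restoring state."""
    return ports * state_s


if __name__ == "__main__":
    # Assumed values: 4 ports, 120 s per port test, 30 s state save/restore.
    print(serial_adapter_downtime(4, 120, 30))  # 600 seconds, i.e. ten minutes
    print(state_overhead(4, 30))                # 120 seconds of pure overhead
```

Under these assumed values the state-maintenance delay alone accounts for two of the ten minutes, and it grows linearly with the number of ports tested.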
In addition, using conventional port test routines, such as a wrap test, the controller or processor that initiates the wrap test waits for the wrap test to complete before reporting the results of the wrap test. This is problematic in a multi-port adapter 108 because the controller is tied up testing the one port, so time and resources (other non-tested ports) of the adapter 108 are wasted. Similarly, because the test routine takes the whole adapter off-line, ports not involved in the test cannot be used to continue to process I/O communications. The adapter is off-line and the controller or processor of the adapter is busy waiting for the single port test routine to complete.
From the foregoing discussion, it should be apparent that a need exists for an apparatus, system, and method for facilitating port testing of a multi-port host adapter in a computer system. Beneficially, such an apparatus, system, and method would take just the port being tested off-line and allow ports not involved in a port test routine to continue processing I/O communications. In addition, the apparatus, system, and method would execute a port test routine simultaneously on two or more ports of a multi-port adapter. Furthermore, the apparatus, system, and method would configure a multi-port adapter once to perform one or more port test routines and then reconfigure the multi-port adapter once to restore normal I/O communication processing.
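The desired behavior may be sketched, purely as an illustration and not as the claimed implementation, with a simulated adapter. The class `MultiPortAdapter`, its methods, the `bad_ports` parameter, and the function `run_concurrent_port_tests` are hypothetical names introduced for this sketch; the use of a thread pool is likewise an assumption standing in for whatever mechanism an adapter would use to test ports simultaneously.

```python
from concurrent.futures import ThreadPoolExecutor


class MultiPortAdapter:
    """Hypothetical multi-port adapter that tracks per-port on-line state."""

    def __init__(self, port_count: int, bad_ports=()):
        self.online = {p: True for p in range(port_count)}
        self.bad_ports = set(bad_ports)  # assumption: ports that fail the test
        self.configure_calls = 0         # counts configure/restore operations

    def configure_for_test(self, ports) -> None:
        # Take only the ports under test off-line, in a single configuration step.
        self.configure_calls += 1
        for p in ports:
            self.online[p] = False

    def restore(self, ports) -> None:
        # Put the tested ports back on-line, in a single reconfiguration step.
        self.configure_calls += 1
        for p in ports:
            self.online[p] = True

    def run_port_test(self, port: int) -> bool:
        # Placeholder for a real port test routine such as a wrap test.
        return port not in self.bad_ports


def run_concurrent_port_tests(adapter: MultiPortAdapter, ports):
    """Test the given (non-empty) list of ports simultaneously.

    Returns the per-port results and the ports that stayed on-line meanwhile.
    """
    adapter.configure_for_test(ports)
    still_online = sorted(p for p, up in adapter.online.items() if up)
    try:
        with ThreadPoolExecutor(max_workers=len(ports)) as pool:
            results = dict(zip(ports, pool.map(adapter.run_port_test, ports)))
    finally:
        adapter.restore(ports)
    return results, still_online


if __name__ == "__main__":
    adapter = MultiPortAdapter(4, bad_ports={2})
    results, still_online = run_concurrent_port_tests(adapter, [1, 2])
    print(results)                  # {1: True, 2: False}
    print(still_online)             # [0, 3]
    print(adapter.configure_calls)  # 2
```

In this sketch ports 0 and 3 remain on-line while ports 1 and 2 are tested together, and the adapter is configured once before the tests and reconfigured once afterward, regardless of how many ports are tested.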