As is known in the art, large host computers and servers (collectively referred to herein as “host computer/servers”) require large capacity data storage systems. These host computer/servers generally include data processors, which perform many operations on data introduced to the host computer/server through peripherals, including the data storage system. The results of these operations are output to peripherals, including the storage system.
One type of data storage system is a magnetic disk storage system having a bank of disk drives. The bank of disk drives and the host computer/server are coupled together through a system interface, sometimes referred to as a storage array. The interface includes CPU modules and operates the storage processors in such a way that they are transparent to the host computer/server. That is, user data is stored in, and retrieved from, the bank of disk drives in such a way that the host computer/server merely thinks it is operating with its own local disk drive. One such system is described in U.S. Pat. No. 5,206,939, entitled “System and Method for Disk Mapping and Data Retrieval”, inventors Moshe Yanai, Natan Vishlitzky, Bruno Alterescu and Daniel Castel, issued Apr. 27, 1993, and assigned to the same assignee as the present invention.
One such system is shown in FIG. 1. Here the interface includes a pair of redundant CPU modules (CPU A and CPU B) interconnected through a midplane by a PCI Express bus. Each of the pair of CPU modules is coupled to both the host computer/server and the bank of disk drives (Disk Storage). This connection may be Fibre Channel, SAS, Ethernet, or any existing or future IO protocol. Only one of these connections from the Host/Server is active during any user data transfer operation, i.e., a so-called I/O transfer. The other connection is provided for failover should the active connection fail.
More particularly, in this configuration, each one of the CPU modules includes an IO controller, a PCI-Express Switch, a Processor Complex, a microcontroller, and an OR gate arranged as shown. The OR gate is used to reset the IO controller and the PCI-Express switch in response to a system reset signal produced by either the processor complex or the microcontroller. The microcontroller also produces a reset signal for the processor complex.
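By way of illustration only, the reset logic described above may be sketched as follows. The function and parameter names are hypothetical and do not appear in FIG. 1; the sketch models only the OR gate that drives the reset of the IO controller and the PCI-Express switch.

```python
def downstream_reset(processor_complex_reset: bool, microcontroller_reset: bool) -> bool:
    """Model of the OR gate: the IO controller and the PCI-Express switch
    are reset when either source asserts its reset signal.

    Both parameter names are assumptions made for this sketch.
    """
    return processor_complex_reset or microcontroller_reset


if __name__ == "__main__":
    # Print the truth table of the OR gate.
    for pc in (False, True):
        for mc in (False, True):
            print(f"processor={pc!s:5} microcontroller={mc!s:5} "
                  f"-> reset={downstream_reset(pc, mc)}")
```

In this sketch, a reset asserted by either source alone is sufficient to reset both downstream devices.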
Assume, for example, that the connection from the host computer/server to a port of one of a plurality of Input/Output (IO) controller units (not shown) within the IO controller in CPU module A is the active link (i.e., the link performing the IO transfer between the host computer/server and the Disk Storage). User data being written to the Disk Storage first arrives, in the IO protocol of the active connection, at the IO Controller in CPU A. The IO Controller in CPU A converts the IO protocol to a PCI-Express protocol and forwards the IO transfer data to the PCI-Express Switch within CPU A. The PCI-Express Switch within CPU A then routes the data to the Processor Complex within CPU A, where some data processing is performed. The processed user data, after the appropriate checksum is applied, is pushed from the Processor Complex within CPU A back to the PCI-Express Switch within CPU A. The data is then routed back through either the same IO Controller unit or potentially another IO Controller unit within CPU A, depending on where the user data is to be stored within the data storage. The user data then leaves through a separate port of the IO Controller of CPU A and is written to the Disk Storage.
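The write path just described may be summarized, purely as an illustrative sketch, by listing the hops that user data traverses within CPU A. The function name and the hop labels are assumptions chosen for readability; they are not taken from FIG. 1.

```python
def write_path(same_io_unit: bool = True) -> list:
    """Return the ordered hops for a host write through CPU A.

    `same_io_unit` is a hypothetical flag: True when the data leaves
    through the IO Controller unit it arrived on, False when it leaves
    through another IO Controller unit within CPU A.
    """
    egress_unit = (
        "IO Controller unit (same unit, separate port)"
        if same_io_unit
        else "IO Controller unit (another unit)"
    )
    return [
        "Host computer/server",
        "IO Controller (IO protocol converted to PCI-Express)",
        "PCI-Express Switch",
        "Processor Complex (data processed, checksum applied)",
        "PCI-Express Switch",
        egress_unit,
        "Disk Storage",
    ]


if __name__ == "__main__":
    for hop_number, hop in enumerate(write_path(), start=1):
        print(hop_number, hop)
```

The sketch shows that the data crosses the PCI-Express Switch twice, once on the way to the Processor Complex and once on the way back to the IO Controller.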
With such an arrangement, during a software upgrade of CPU Module A, a system reset signal is produced by the Processor Complex, thereby placing the entire CPU Module A in an offline condition; the module may reset several times during the upgrade. The PCI-Express Switch advises CPU Module B of this offline condition. It is noted that when the Processor Complex resets, all of the attached PCI-Express devices reset as well. This allows the Processor Complex to configure everything correctly during the boot process. In this configuration, the Microcontroller is able to reset the Processor Complex and the associated PCI-Express devices. The Microcontroller monitors voltages and other status and will reset the Processor Complex to prevent damage to the module PC board.
During this software upgrade of CPU Module A, all data that was being handled by CPU Module A will fail over to CPU Module B, and CPU Module B will handle user data transfers during the software upgrade of CPU Module A. Because of this failover, the host computer/server must use a different path (i.e., a different IO port) to access the Disk Storage, because the port it was previously using (i.e., the CPU Module A port) is offline during the software upgrade of CPU Module A. Having the host computer/server fail over when the system is working as expected is not desirable.
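The failover behavior described above may be illustrated by the following minimal sketch of host-side path selection. The function name, the dictionary of module states, and the module labels "A" and "B" are hypothetical and serve only to show why the host computer/server is forced onto a different port while CPU Module A is offline.

```python
def select_module(online: dict, preferred: str = "A", standby: str = "B") -> str:
    """Pick the CPU module whose port the host uses for IO transfers.

    `online` maps a module label to True when that module is online.
    The preferred module is used whenever it is online; otherwise the
    host fails over to the standby module.
    """
    if online.get(preferred, False):
        return preferred
    if online.get(standby, False):
        return standby
    raise RuntimeError("no CPU module is online")


if __name__ == "__main__":
    # Normal operation: the host uses the CPU Module A port.
    print(select_module({"A": True, "B": True}))
    # Software upgrade of CPU Module A: the whole module is offline,
    # so the host must fail over to the CPU Module B port.
    print(select_module({"A": False, "B": True}))
```

Under these assumptions, the host changes path during the upgrade even though no component has actually failed.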