The input/output (I/O) devices of a computer system often communicate with a central processing unit (CPU) and system memory (e.g., random-access memory (RAM)) by way of a chipset. The chipset can include a memory controller and an I/O controller. Peripheral devices are connected to the CPU by way of various buses, such as the peripheral component interconnect express bus (hereinafter PCI-E or PCI-E bus). The PCI-E bus uses high-speed serial signaling and enables point-to-point communication between devices. Communications along a PCI-E connection are made by way of packets and a message-signaled interrupt scheme.
The PCI-E specification calls for a reset mechanism that is in-band in nature. Unlike a traditional reset mechanism that involves a dedicated signal line, the in-band reset mechanism involves a command primitive “DL_DOWN” that is used in lieu of a reset signal on the dedicated signal line to initiate a reset sequence. The DL_DOWN status indicates that there is no connection with another component on the bus, or the connection with the other component has been lost and is not recoverable by the Physical or Data Link Layer.
At least two problems have been identified in prior art data storage systems that use the PCI-E protocol. The first identified problem is very specific in nature and results in a cache data loss. The data loss occurs due to a particular implementation of a battery-backup module with systems that include MegaRAID® I/O controllers. The MegaRAID® family of RAID controllers is deployed in a wide variety of RAID data-storage solutions. MegaRAID® is the registered trademark of LSI Corporation, a Delaware corporation having a place of business at 1110 American Parkway NE, Allentown, Pa., U.S.A. 18109-9138 and the assignee of the present application.
“RAID” is an umbrella term for computer data-storage schemes that can divide and replicate data among multiple hard-disk drives. On certain computing platforms, a warm reboot or a power-on reset (such as that commonly invoked on some computing systems by simultaneously pressing CTRL+ALT+DEL on an attached keyboard) generates a DL_DOWN interrupt instead of a PCI-E reset signal. MegaRAID® battery-backup modules connected to the MegaRAID® I/O controllers use a “reset” or “power good” signal to mask the transition of a battery-backup enable signal during a reset routine. However, during a DL_DOWN condition no such signal is generated, so the battery-backup enable signal is not masked, which causes cache data loss.
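The masking behavior described above can be illustrated with a minimal Python sketch. It is a simplified model rather than the controller's actual logic: the names `ResetEvent`, `enable_seen_by_battery_module`, and `cache_preserved` are hypothetical, and the sketch assumes that the raw battery-backup enable output drops during any reset routine and that only a dedicated “reset” or “power good” signal activates the mask.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResetEvent:
    """Hypothetical description of how a reset reaches the I/O controller."""
    name: str
    dedicated_signal_toggles: bool  # True if a "reset"/"power good" line changes state


def enable_seen_by_battery_module(event: ResetEvent) -> bool:
    """Return the battery-backup enable level the module observes during the reset.

    Assumption: the raw enable output drops while the controller runs its
    reset routine, and the mask holds the observed level high when active.
    """
    raw_enable = False                           # raw output drops during the reset
    mask_active = event.dedicated_signal_toggles  # mask is driven by the dedicated signal
    return raw_enable or mask_active


def cache_preserved(event: ResetEvent) -> bool:
    """Cache contents survive only if the module never sees enable drop."""
    return enable_seen_by_battery_module(event)


SIGNAL_RESET = ResetEvent("dedicated reset signal", dedicated_signal_toggles=True)
IN_BAND_RESET = ResetEvent("DL_DOWN (in-band)", dedicated_signal_toggles=False)

if __name__ == "__main__":
    for event in (SIGNAL_RESET, IN_BAND_RESET):
        print(f"{event.name}: cache preserved = {cache_preserved(event)}")
```

In this model the in-band case fails because nothing activates the mask, which mirrors the first problem identified above.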
A second problem results from the invocation of a “system” reset during a DL_DOWN condition. The “system” reset involves the reset of a PCI-E serializer/deserializer (SERDES) module. If the DL_DOWN condition is quickly followed by a DL_UP condition before the PCI-E SERDES module has completed its reset routine, then subsequent configuration information is not passed to the PCI-E core. In such cases, the MegaRAID® I/O controller is not recognized by the basic input/output system (BIOS) of the computer and, as a result, data in the data-storage system managed by the MegaRAID® I/O controller is rendered inaccessible.
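The timing dependency in this second problem can be sketched as a simple comparison. The function name `config_reaches_core`, its parameters, and the time values are illustrative assumptions and do not come from the PCI-E specification or any controller documentation. The sketch also assumes that the SERDES reset starts at the moment DL_DOWN occurs.

```python
def config_reaches_core(dl_down_at: float, dl_up_at: float,
                        serdes_reset_duration: float) -> bool:
    """Return True if configuration information sent after DL_UP reaches the PCI-E core.

    Assumption: the SERDES reset begins at DL_DOWN and lasts
    serdes_reset_duration; configuration is passed on only if DL_UP
    occurs once that reset has completed. Times are in arbitrary units.
    """
    serdes_ready_at = dl_down_at + serdes_reset_duration
    return dl_up_at >= serdes_ready_at


if __name__ == "__main__":
    # Hypothetical timings: DL_UP arrives before and after the reset completes.
    for dl_up_at in (50.0, 150.0):
        ok = config_reaches_core(dl_down_at=0.0, dl_up_at=dl_up_at,
                                 serdes_reset_duration=100.0)
        print(f"DL_UP at {dl_up_at}: configuration reaches core = {ok}")
```

When the function returns False, the model corresponds to the case in which the BIOS does not recognize the controller.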