A storage array is a data storage system that includes multiple disk drives or similar persistent storage units. A storage array allows large amounts of data to be stored in an efficient manner. A storage array also provides redundancy to promote reliability, as in the case of a Redundant Array of Inexpensive Disks (RAID) system. The term “RAID” is generally used to describe computer data storage schemes that divide and replicate data among multiple physical hard disk drives (PDs). In general, RAID systems simultaneously use two or more PDs to achieve greater levels of performance, reliability and/or larger data volume sizes. One or more PDs are set up as a RAID virtual disk drive (VD). In the VD, data is typically distributed across multiple PDs, but the VD is seen by the user and by the operating system of the host computer system (e.g., a server) as a single disk. Storage space in the VD maps to the physical storage space in the PDs, but the VD usually does not itself represent a single physical storage device. Typically, a meta-data mapping table is used to translate an incoming VD identifier and address location into a PD identifier and address location, respectively.
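By way of illustration only, the translation performed by such a meta-data mapping table can be sketched as follows. The extent layout, the table contents, and the names `Extent`, `MAPPING_TABLE` and `translate` are assumptions made for this sketch and are not taken from any particular RAID implementation.

```python
from bisect import bisect_right
from typing import NamedTuple


class Extent(NamedTuple):
    vd_start: int  # first VD block covered by this extent
    length: int    # number of blocks in the extent
    pd_id: int     # identifier of the PD that holds the extent
    pd_start: int  # first PD block of the extent


# Hypothetical mapping table: VD identifier -> extents sorted by vd_start.
MAPPING_TABLE = {
    0: [
        Extent(vd_start=0, length=100, pd_id=0, pd_start=0),
        Extent(vd_start=100, length=100, pd_id=1, pd_start=0),
        Extent(vd_start=200, length=50, pd_id=0, pd_start=100),
    ],
}


def translate(vd_id: int, vd_block: int) -> tuple[int, int]:
    """Translate (VD identifier, VD block) into (PD identifier, PD block)."""
    extents = MAPPING_TABLE[vd_id]
    starts = [e.vd_start for e in extents]
    # Find the last extent whose starting block is <= vd_block.
    i = bisect_right(starts, vd_block) - 1
    if i < 0:
        raise ValueError(f"block {vd_block} is outside VD {vd_id}")
    extent = extents[i]
    offset = vd_block - extent.vd_start
    if offset >= extent.length:
        raise ValueError(f"block {vd_block} is outside VD {vd_id}")
    return extent.pd_id, extent.pd_start + offset
```

In this sketch the host addresses VD 0 as a single 250-block disk, while `translate` resolves each block to one of two PDs.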
A variety of different RAID system designs and RAID levels exist, all having two key design goals, namely: (1) to increase data reliability and (2) to increase input/output (I/O) performance. In RAID systems, I/O functions of the host computer system are expedited because multiple PDs are capable of being accessed simultaneously. RAID systems improve data storage reliability and fault tolerance compared to single-drive computer systems because data lost as a result of a PD failure can be recovered by using the remaining data and parity stored in one or more other PDs to reconstruct the data that was stored on the failed PD.
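The reconstruction of lost data from the remaining data and parity can be illustrated with a minimal sketch. The sketch assumes byte-wise XOR parity of the kind used in parity-based RAID levels such as RAID level 5; mirrored levels such as RAID level 1 instead recover from a duplicate copy. The block contents and the name `xor_blocks` are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe of data spread over three PDs, plus a parity block on a fourth PD.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Suppose the PD holding data[1] fails. XOR-ing the surviving data blocks
# with the parity block yields the missing block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

Because XOR is its own inverse, the same routine serves both to compute the parity and to rebuild any single missing block of the stripe.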
RAID systems may be implemented in hardware or software. Software-based RAID systems utilize RAID software that provides an abstraction layer between the PDs and the VD. The RAID software is typically part of the operating system (OS) of the host computer system. The RAID software runs on the host CPU and the RAID data is carried on busses between the host PD controller and the host CPU. The host CPU has processing overhead associated with executing the RAID software, which can degrade performance of the host computer system. In addition, the RAID data carried on the busses between the host PD controller and the host CPU can create congestion, which can also degrade performance. However, in many cases, software-based RAID systems are suitable solutions, particularly in cases where lower RAID levels are being implemented.
Hardware-based RAID systems use a dedicated hardware RAID controller to perform many of the I/O tasks associated with the storage and retrieval of data in and from the RAID VD, respectively. The dedicated RAID hardware controller reduces the amount of RAID processing that needs to be performed by the host CPU and therefore improves performance by freeing up the host CPU to perform other tasks. In addition, because many of the I/O tasks are performed in the RAID hardware controller instead of in the host CPU, there is a reduction in the amount of RAID data that is carried on the busses between the host CPU and host PD controller, which also improves performance.
The major advantage of software-based RAID systems over hardware-based RAID systems is that software-based RAID systems only require a standard PD controller. Consequently, software-based RAID systems are typically much less expensive to implement than hardware-based RAID systems. Therefore, in many cases, software-based RAID systems that allow a suitable level of performance to be achieved are desirable alternatives to hardware-based RAID systems. However, configuration conflicts can occur in software-based RAID systems, resulting in system errors. In particular, a configuration conflict can occur in a software-based RAID system when a PD is reallocated to a different port number in the event of a loss of power to the PD during bootup.
FIGS. 1A and 1B illustrate block diagrams of two expanders, E0 and E1, for holding PDs in a software-based RAID system. For ease of illustration, the other components of the software-based RAID system are not shown in FIGS. 1A and 1B. The expanders E0 and E1 are enclosures having a plurality of bays formed therein for holding the PDs. In the example shown in FIGS. 1A and 1B, expander E0 has eight bays (Bay 0-Bay 7) and expander E1 has six bays (Bay 0-Bay 5). An I/O interface device (not shown) of the RAID system accesses the PDs held in the bays of the expanders E0 and E1 in order to read data from and write data to the PDs. The RAID system includes tables that provide a logical mapping of the port numbers of the I/O interface device to the bay numbers of the expanders E0 and E1. Thus, the PDs are electrically connected to respective bays, which, in turn, are electrically connected to respective ports of the I/O interface device. Consequently, the mapping of the port numbers of the I/O interface device to the bay numbers of the expanders provides a mapping of the respective port numbers of the I/O interface device to the respective identifiers of the respective PDs held in the bays of the expanders E0 and E1. The manner in which this configuration can lead to the occurrence of a configuration conflict will now be described with reference to FIGS. 1A and 1B.
For this example, it will be assumed that expanders E0 and E1 are assigned port numbers [8-15] and [18-23], respectively. It will also be assumed for this example that the user configures the RAID system to have two RAID level 1 (R1) VDs, namely, VD0 and VD1. It will also be assumed that the system is configured such that VD0 comprises PDs 0 and 1 located at port numbers 9 and 18, respectively, and that VD1 comprises PDs 2 and 3 located at port numbers 19 and 20, respectively. This configuration is shown in FIG. 1A. Each of the PDs 0, 1, 2, and 3 includes a disk data format (DDF) file that stores configuration information about the PD, including the bay number in which the PD is held. From the DDFs of the PDs 0 and 1, VD0 is capable of ascertaining that PDs 0 and 1 are contained in bays 1 and 0 of expanders E0 and E1, respectively, and that PDs 0 and 1 are mapped to port numbers 9 and 18, respectively. Likewise, from the DDFs of the PDs 2 and 3, VD1 is capable of ascertaining that PDs 2 and 3 are contained in bays 1 and 2, respectively, of expander E1 and that PDs 2 and 3 are mapped to port numbers 19 and 20, respectively.
The configuration of VD0 across multiple expanders can result in configuration conflicts. If, for example, the power cable providing power to expander E0 becomes disconnected or fails during boot up, then E1 will be the only expander operating. In this scenario, the I/O interface device normally remaps the port numbers assigned to expander E1 as follows: bays 0, 1 and 2 of expander E1, which were originally mapped to port numbers 18, 19 and 20, respectively, are reassigned to port numbers 8, 9 and 10, respectively. However, the DDF of the missing PD, namely PD 0, has not been updated and still indicates that PD 0 is mapped to port number 9. Therefore, VD0 understands from the DDF of PD 0 that port number 9 is assigned to PD 0. The missing PD is referred to herein as a logical PD because its VD believes the PD is present even though the PD is physically missing. The DDFs of the physically and logically present PDs that have been relocated have been updated to indicate the bay numbers and port numbers to which they have been remapped. Therefore, VD1 knows from the DDF of PD 2 that PD 2, in bay number 1, has been remapped to port number 9. Consequently, VD0 and VD1 are both claiming ownership of port number 9, which results in a configuration conflict. For the software-based RAID system to operate properly, the configuration conflict must be resolved.
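The scenario described above can be traced with a short sketch. The port assignment rule used here (the first operating expander receives base port number 8 and the second receives base port number 18) is an assumption chosen only to reproduce the port numbers of this example, and the names `assign_ports` and `write_ddf`, as well as the dictionary representation of a DDF, are hypothetical.

```python
# Assumed model: base port numbers handed to the first and second operating
# expander, and the number of bays in each expander of the example.
BASE_PORTS = [8, 18]
BAYS = {"E0": 8, "E1": 6}

# PD identifier -> (expander, bay, owning VD). The bays follow from the
# port numbers of the example: ports 9, 18, 19 and 20.
PDS = {
    0: ("E0", 1, "VD0"),
    1: ("E1", 0, "VD0"),
    2: ("E1", 1, "VD1"),
    3: ("E1", 2, "VD1"),
}


def assign_ports(operating):
    """Map (expander, bay) to a port number for the operating expanders."""
    ports = {}
    for base, name in zip(BASE_PORTS, operating):
        for bay in range(BAYS[name]):
            ports[(name, bay)] = base + bay
    return ports


def write_ddf(ports):
    """Build DDF records only for PDs whose expander is operating."""
    return {
        pd: {"vd": vd, "bay": bay, "port": ports[(exp, bay)]}
        for pd, (exp, bay, vd) in PDS.items()
        if (exp, bay) in ports
    }


# Initial configuration: both expanders operating.
ddf = write_ddf(assign_ports(["E0", "E1"]))

# Boot up with E0 unpowered: the DDFs of the present PDs are rewritten,
# while the DDF of the missing PD 0 keeps its stale port number.
ddf_after = dict(ddf)
ddf_after.update(write_ddf(assign_ports(["E1"])))
```

After the second boot up, `ddf_after[0]` (owned by VD0) and `ddf_after[2]` (owned by VD1) both record port number 9, which is the configuration conflict described above.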
Currently, no satisfactory solution exists for resolving such configuration conflicts in software-based RAID systems. When a software-based RAID system is booted up, the host server central processing unit (CPU) executes Basic Input/Output System (BIOS) Power-On Self Test (POST) code stored in an Option Read Only Memory (ROM) device of the I/O interface device. When the CPU executes this code, it performs checks to determine whether more than one logical PD has a DDF that indicates that it is connected to the same port. If the CPU determines that the DDFs of two logical PDs indicate that they are connected to the same port, the CPU causes a message to be displayed to the user on the display device of the RAID system that advises the user that: BIOS has detected configured disks with some drive(s) missing; the user needs to power down the system and disconnect one of the PDs located at the specific port number; if the user fails to do so, the configuration will be lost.
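The check described above, namely determining whether the DDFs of more than one logical PD indicate the same port, can be sketched as follows. This is an illustrative sketch only and is not the Option ROM code itself; the record format, the function names, and the paraphrased message text are assumptions.

```python
from collections import defaultdict


def find_port_conflicts(ddf_records):
    """Return {port: [PD identifiers]} for ports claimed by more than one logical PD."""
    by_port = defaultdict(list)
    for record in ddf_records:
        by_port[record["port"]].append(record["pd"])
    return {port: pds for port, pds in by_port.items() if len(pds) > 1}


def post_messages(ddf_records):
    """Build one warning per conflicting port (message wording is paraphrased)."""
    return [
        f"Configured disks detected with some drive(s) missing: port {port} "
        f"is claimed by PDs {pds}. Power down and disconnect one of them, "
        f"or the configuration will be lost."
        for port, pds in sorted(find_port_conflicts(ddf_records).items())
    ]


# Hypothetical DDF records as read at boot up for the example above.
records = [
    {"pd": 0, "vd": "VD0", "port": 9},   # stale record of the missing PD
    {"pd": 1, "vd": "VD0", "port": 8},
    {"pd": 2, "vd": "VD1", "port": 9},
    {"pd": 3, "vd": "VD1", "port": 10},
]
conflicts = find_port_conflicts(records)
```

For the records shown, `conflicts` identifies port number 9 as claimed by both PD 0 and PD 2, and `post_messages(records)` yields a single warning.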
If the user ignores the warning message and proceeds, the VD configuration will be lost. If the user heeds the warning message and disconnects the PD, the CPU will remove the corresponding logical PD from the VD, thereby resolving the configuration conflict. However, this results in a reduction in the storage capacity of the software-based RAID system. A need exists for a software-based RAID system in which configuration conflicts can be satisfactorily resolved and a method for satisfactorily resolving configuration conflicts in a software-based RAID system.