1. Field of the Invention
The present invention relates to the control of multiple disk drives within computer systems and more particularly to a method for maintaining data redundancy and recovering data stored on a disk in an intelligent mass storage disk drive array subsystem for a personal computer system.
2. Description of the Related Art
Microprocessors and the personal computers which utilize them have become more powerful over recent years. Currently available personal computers have capabilities easily exceeding those of the mainframe computers of 20 to 30 years ago and approach the capabilities of many computers currently manufactured. Microprocessors having word sizes of 32 bits are now widely available, whereas in the past 8 bits was conventional and 16 bits was common.
Personal computer systems have developed over the years and new uses are being discovered daily. The uses are varied and, as a result, have different requirements for various subsystems forming a complete computer system. Because of production volume requirements and the reduced costs as volumes increase, it is desirable that as many common features as possible are combined into high volume units. This has happened in the personal computer area by developing a basic system unit which generally contains a power supply, provisions for physically mounting the various mass storage devices and a system board, which in turn incorporates a microprocessor, microprocessor related circuitry, connectors for receiving circuit boards containing other subsystems, circuitry related to interfacing the circuit boards to the microprocessor, and memory. The use of connectors and interchangeable circuit boards allows subsystems of the desired capability for each computer system to be easily incorporated into the computer system. The use of interchangeable circuit boards necessitated the development of an interface or bus standard so that the subsystems could be easily designed and problems would not result from incompatible decisions by the system unit designers and the interchangeable circuit board designers.
The use of interchangeable circuit boards and an interface standard, commonly called a bus specification because the various signals are provided to all the connectors over a bus, was incorporated into the original International Business Machines Corporation (IBM) personal computer, the IBM PC. The IBM PC utilized an Intel Corporation 8088 as the microprocessor. The 8088 has an 8 bit, or 1 byte, external data interface but operates on a 16 bit word internally. The 8088 has 20 address lines, which means that it can directly address a maximum of 1 Mbyte of memory. In addition, the memory components available for incorporation in the original IBM PC were relatively slow and expensive as compared to current components. The various subsystems, such as video output units or mass storage units, were not complex and also had relatively low performance levels because of the relative simplicity of the devices available at a reasonable cost at that time.
With these various factors and the component choices made in mind, an interface standard was developed and used in the IBM PC. The standard utilized 20 address lines and 8 data lines, had individual lines to indicate input or output (I/O) space or memory space read/write operations, and had limited availability of interrupts and direct memory access (DMA) channels. The complexity of the available components did not require greater flexibility or capabilities of the interface standard to allow the necessary operations to occur. This interface standard was satisfactory for a number of years.
As is inevitable in the computer and electronics industry, capabilities of the various components available increased dramatically. Memory component prices dropped, and capacities and speeds increased. Performance rates and capacities of the mass storage subsystems increased, generally by the incorporation of hard disk units in place of the previous floppy disk units. The video processor technology improved so that high resolution color systems were reasonably affordable. These developments all pushed the capabilities of the existing IBM PC interface standard so that the numerous limitations in the interface standard became a problem. With the introduction by Intel Corporation of the 80286, IBM developed a new, more powerful personal computer called the AT. The 80286 has a 16 bit data path and 24 address lines so that it can directly address 16 Mbytes of memory. In addition, the 80286 has an increased speed of operation and can easily perform many operations which taxed 8088 performance limits.
It was desired that the existing subsystem circuit boards be capable of being used in the new AT, so the interface standard used in the PC was utilized and extended. A new interface standard was developed, which has become known as the industry standard architecture (ISA). A second connector for each location was added to contain additional lines for the signals used in the extension. These lines included additional address and data lines to allow the use of the 24 bit addressing capability and 16 bit data transfers, additional interrupt and direct memory access lines and lines to indicate whether the subsystem circuit board was capable of using the extended features. While the address values are presented by the 80286 microprocessor relatively early in the operation cycle, the PC interface standard could not utilize the initial portions of the address availability because of different timing standards for the 8088 around which the PC interface was designed. This limited the speed at which operations could occur because they were now limited to the interface standard memory timing specifications and could not operate at the rates available with the 80286. Therefore, the newly added address lines included address signals previously available, but the newly added signals were available at an earlier time in the cycle. This change in the address signal timing allowed operations which utilized the extended portions of the architecture to operate faster.
With the higher performance components available, it became possible to have a master unit other than the system microprocessor or direct memory access controller operating the bus. However, because of the need to cooperate with circuit boards which operated under the new 16 bit standard or the old 8 bit standard, each master unit was required to understand and operate with all the possible combinations of circuit boards. This increased the complexity of the master unit and resulted in a duplication of components, because the master unit had to incorporate many of the functions and features already performed by the logic and circuitry on the system board and other master units. Additionally, the master unit was required to utilize the direct memory access controller to gain control of the bus, limiting prioritization and the number of master units possible in a given computer system.
The capability of components continued to increase. Memory speeds and sizes increased, mass storage unit speeds and sizes increased, video unit resolutions increased and Intel Corporation introduced the 80386. The increased capabilities of the components created a desire for the use of master units, but the performance of a master unit was limited by the ISA specification and capabilities. The 80386 could not be fully utilized because it offered the capability to directly address 4 Gbytes of memory using 32 bits of address and could perform 32 bit wide data transfers, while the ISA standard allowed only 16 bits of data and 24 bits of address. The local area network (LAN) concept, where information and files are stored on one computer, called a server, and distributed to local work stations having limited or no mass storage capabilities, started becoming practical with the relatively low cost of the high capability components needed for adequate servers and the low cost of the components for work stations. An extension similar to that performed in developing the ISA could be implemented to utilize the 80386's capabilities. However, this type of extension would have certain disadvantages. With the advent of the LAN concept and the high performance requirements of the server and of video graphics work stations used in computer-aided design and animation work, the need for very high data transfer rates became critical. An extension similar to that performed in developing the ISA would not provide this capability, even if slightly shorter standard cycle times were provided, because this would still leave the performance below desired levels.
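The address ranges cited above for the 8088, 80286 and 80386 follow directly from the number of address lines, since n address lines can select 2^n distinct byte locations. The following is a minimal sketch of that arithmetic; the function name `addressable_bytes` is hypothetical, and it assumes that 1 Mbyte means 2^20 bytes and 1 Gbyte means 2^30 bytes.

```python
MBYTE = 2 ** 20  # assumption: 1 Mbyte = 2^20 bytes
GBYTE = 2 ** 30  # assumption: 1 Gbyte = 2^30 bytes


def addressable_bytes(address_lines: int) -> int:
    """Return the number of byte locations selectable with the given address lines."""
    if address_lines < 0:
        raise ValueError("address_lines must be non-negative")
    return 2 ** address_lines


# Address line counts as stated in the text for each microprocessor.
ADDRESS_LINES = {"8088": 20, "80286": 24, "80386": 32}

# Directly addressable memory, in bytes, for each microprocessor.
ADDRESS_SPACE = {cpu: addressable_bytes(n) for cpu, n in ADDRESS_LINES.items()}
```

Under these assumptions, `ADDRESS_SPACE["80286"] // MBYTE` evaluates to 16 and `ADDRESS_SPACE["80386"] // GBYTE` evaluates to 4, matching the figures given above.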
With the increased performance of computer systems, it became apparent that mass storage subsystems, such as fixed disk drives, played an increasingly important role in the transfer of data to and from the computer system. In the past few years, a new trend in storage subsystems has emerged for improving data transfer performance, capacity and reliability. This is generally known as a disk array subsystem. One key reason for wanting to build a disk array subsystem is to create a logical device that has a very high data transfer rate. This may be accomplished by "ganging" multiple standard disk drives together and transferring data to or from these drives to the system memory. If n drives are ganged together, then the effective data transfer rate can be increased by as much as n times. This technique, called "striping," originated in the supercomputing environment where the transfer of large amounts of data to and from secondary storage is a frequent requirement. With this approach, the n physical drives would become a single logical device, which may be implemented either through software or hardware.
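The striping approach described above can be illustrated with a minimal sketch. It assumes a simple round-robin layout in which consecutive logical blocks are placed on consecutive drives; the text does not specify a particular layout, and the function names `map_logical_block` and `stripe` are hypothetical.

```python
from typing import List, Tuple


def map_logical_block(logical_block: int, n_drives: int) -> Tuple[int, int]:
    """Map a logical block number to (drive index, block offset on that drive).

    Assumes round-robin placement: block 0 on drive 0, block 1 on drive 1, etc.
    """
    if n_drives <= 0:
        raise ValueError("n_drives must be positive")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    return logical_block % n_drives, logical_block // n_drives


def stripe(data: bytes, n_drives: int, block_size: int) -> List[List[bytes]]:
    """Split data into fixed-size blocks and distribute them across n drives."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    drives: List[List[bytes]] = [[] for _ in range(n_drives)]
    for start in range(0, len(data), block_size):
        drive, _offset = map_logical_block(start // block_size, n_drives)
        drives[drive].append(data[start:start + block_size])
    return drives
```

Because each group of n consecutive blocks lands on n different drives, those blocks can in principle be transferred in parallel, which is the source of the transfer rate improvement.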
Two data redundancy techniques have generally been used to restore data in the event of a catastrophic drive failure. One technique is that of a mirrored drive. A mirrored drive in effect creates a redundant data drive for each data drive. A write to a disk array utilizing the mirrored drive fault tolerance technique will result in a write to the primary data disk and a write to its mirror drive. This technique results in a minimum loss of performance in the disk array. However, there exist certain disadvantages to the use of mirrored drive fault tolerance techniques. The primary disadvantage is that this technique uses 50% of the total data storage available for redundancy purposes. This results in a relatively high cost per unit of available storage.
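The mirrored drive behavior described above can be sketched as follows. This is an illustrative model only, with in-memory dictionaries standing in for drives; the class name `MirroredPair` and its methods are hypothetical and are not taken from any particular controller.

```python
from typing import Dict


class MirroredPair:
    """Toy model of one data drive and its mirror drive."""

    def __init__(self) -> None:
        self.primary: Dict[int, bytes] = {}
        self.mirror: Dict[int, bytes] = {}
        self.primary_failed = False

    def write(self, block: int, data: bytes) -> None:
        # Every write goes to both drives, so half of the raw capacity
        # is consumed by redundancy.
        if not self.primary_failed:
            self.primary[block] = data
        self.mirror[block] = data

    def read(self, block: int) -> bytes:
        # Reads are served from the mirror when the primary has failed.
        if self.primary_failed:
            return self.mirror[block]
        return self.primary[block]

    def fail_primary(self) -> None:
        # Simulate a catastrophic failure of the primary drive.
        self.primary.clear()
        self.primary_failed = True
```

In this model the data remains readable after `fail_primary()` is called, at the cost of storing every block twice.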
Another technique is the use of a parity scheme which reads data blocks being written to various drives within the array and uses a known exclusive OR (XOR) technique to create parity information which is written to a reserved or parity drive in the array. The advantage of this technique is that it may be used to minimize the amount of data storage dedicated to redundancy and data recovery purposes when compared with mirror techniques. In an 8 drive array, the parity technique would call for one drive to be used for parity information; 12.5% of total storage is dedicated to redundancy as compared to 50% using the mirror technique. The use of the parity drive technique decreases the cost of data storage. However, there exist a number of disadvantages to the use of the parity fault tolerance mode. Primary among these disadvantages is the loss of performance within the disk array, as the parity drive must be updated each time a data drive is updated. The data must undergo the XOR process in order to write to the parity drive, in addition to the data being written to the data drives.
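The XOR parity scheme described above can be sketched briefly. The example assumes that all blocks in a stripe have equal length and that a single drive is lost at a time; the function names `xor_blocks`, `rebuild_block` and `parity_overhead` are hypothetical and chosen for illustration.

```python
from typing import Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """Return the bytewise XOR of equal-length blocks (the parity block)."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    result = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def rebuild_block(surviving: Sequence[bytes], parity: bytes) -> bytes:
    """Recover the single missing data block from the survivors and the parity."""
    return xor_blocks(list(surviving) + [parity])


def parity_overhead(n_drives: int) -> float:
    """Fraction of total storage used for redundancy with one parity drive."""
    if n_drives < 2:
        raise ValueError("a parity array needs at least two drives")
    return 1 / n_drives
```

Because XOR is its own inverse, combining the parity block with the surviving data blocks yields the missing block. For an 8 drive array, `parity_overhead(8)` gives 0.125, the 12.5% figure cited above.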
The use of the system processor to perform XOR parity information generation requires that the drive data go from the drives to a transfer buffer, to the system processor local memory to create the XOR parity information and that the parity information be written back to the drive via the transfer buffer. As a result, the host system processor encounters significant overhead in managing the generation of the XOR parity. The use of the local processor within the disk array controller also encounters many of the same problems that a system processor would. The drive data must again go from the drives to a transfer buffer to local processor memory to generate the XOR parity information and then back to the parity drive via the transfer buffer.
Related to this field of data error correction is U.S. Pat. No. 4,775,978 for a data error correction system.
A number of reference articles on the design of disk arrays have been published in recent years. These include "Some Design Issues of Disk Arrays" by Spencer Ng, April 1989 IEEE; "Disk Array Systems" by Wes E. Meador, April 1989 IEEE; and "A Case for Redundant Arrays of Inexpensive Disks (RAID)" by D. Patterson, G. Gibson and R. Katz, Report No. UCB/CSD 87/391, December 1987, Computer Science Division, University of California, Berkeley, Calif.
In the past when a drive has failed and has been replaced, it has been necessary to request special commands and operations to restore the data to the disk. Many times these operations require the dedication of the computer system such that it is not available to system users during the rebuild process. Both of these situations create transparency problems when recovering lost data.