The present invention generally relates to high availability disk subsystems, and more specifically to mirroring data in such a high availability disk subsystem.
Mirroring is the duplication of data on two or more disk drives. It is done primarily for data integrity protection. Originally, files were duplicated. This feature was available in the early 1970""s on the Burroughs (now Unisys) 6700 series of computers. Later, entire disk drives were mirrored. Each disk write is duplicated to two or more disks. Reads from the disk(s) can be from either disk. This provided performance benefits in read operations and a small decrease of performances in write operations. Then, if there is a failure of one of the disks, reading can continue uninterrupted from the other disk. This significantly increases the data integrity and availability, and has an important ingredient of on-line transaction systems for a number of years. Availability was further enhanced by utilizing different disk controllers for the mirrored disk drives. Thus, a failure of a mirrored disk drive, its disk controller, the host adapter, or any of the cabling, would not prevent immediate access to the data.
Originally, mirroring was done at the host computer level. The host computer would issue individual write requests to each of the mirrored disks. The host computer would also keep track of which disks are currently available. Software to synchronize mirrored disks or files would also be host computer based.
More recently, mirroring at the disk controller level has been introduced. This eliminates the need for the host computer to handle mirrored disks or files any differently than it does other disks or files. Currently, there are commercial disk subsystems available that support mirroring functions. Typically mirroring works by writing data to each of the mirrored disk drives before acknowledging the write to the computer that issued the write request. Typically again, one disk controller will control the writing of data to both disk drives.
In a high availability disk subsystem, the two (or more) mirrored disks could be controlled by different disk controllers. Thus, mirroring is done in these cases by having the first (xe2x80x9cPrimaryxe2x80x9d) disk controller write data to a first (xe2x80x9cPrimaryxe2x80x9d) disk. Meanwhile the primary disk controller transmits the data to be written to the second (xe2x80x9cSecondaryxe2x80x9d) disk controller. The secondary disk controller then writes the data to a second (xe2x80x9cSecondaryxe2x80x9d) disk drive. Upon completion of this write, the secondary disk controller transmits an acknowledgement to the primary disk controller, which in turn transmits an acknowledgement to the computer when the two disk drives are written.
Some high availability disk subsystems use the same controller to write the data to the two disk drives (xe2x80x9cPrimaryxe2x80x9d and xe2x80x9cSecondaryxe2x80x9d) but, to be protected against failures, the data to be written is saved also in the cache of the second controller. In case of failure of the first controller during the write operations, the second controller takes the task and resume the operations.
Traditionally, the primary purpose for mirroring databases or disk drives was for data integrity protection and secondary for high availability. Mirroring is also known as Redundant Array of Independent (or Inexpensive) Disks (xe2x80x9cRAIDxe2x80x9d) level 1 or RAID level 0/1. Data is immediately available when any single element in a disk subsystem is lost, whether the lost element is a disk controller, a disk drive, or cabling between them and a computer. Other types of data integrity protection, such as RAID levels 3 or 5 have been more popular in recent years due to their lower usage of disk space. However, mirroring is becoming much more attractive due to its simpler design as the cost per megabyte of disk space continues to drop year after year. It is also becoming used in RAID level 1/0 that provides mirroring in conjunction with striping for very high reliability/high performance applications.
Mirroring has other benefits too. It can be used as a tool to facilitate copies of different states of a database. This is accomplished by creating a mirror of a database. After the mirror is operational, it is broken at a particular moment. In this way, the secondary copy is the image of the database at a defined instant in time. A third copy of the database can be created at a different instant in time, etc.
Mirroring can also be utilized as a tool to facilitate backup on tapes. This is accomplished by creating a mirror of a disk or database. When the mirror is operational, it is broken at a specified moment in time. In this way, the disk copy is available to be replicated separated on a tape or a tape library, at the data throughput required. At the end of this operation, the mirror is reestablished, and after the resynchronization phase, a new backup can be initialized.
However, many of these features have some limitations in the prior art because the creation or the resynchronization of a mirror has a negative impact on the performance of the disk subsystem because a full replication of disk volumes is required to create or resynchronize the mirror.
Another problem with mirroring is that the two mirrored disks have to be reasonably close together. With current Fibre Channel technology, the two disks drives must be within 10 km (standard value) of each other. This technology limitation in length is needed for data integrity protection and due also to the fact that storage interfaces work in real time. One problem here is that it would often be preferable to separate the disk controllers and their associated disk drives by longer distances in order to minimize the possibility that a common disaster could take out both.
Another current problem with mirroring is that the primary disk controller typically maintains data transmitted to the secondary disk controller in non volatile memory until acknowledged (battery protected cache). This typically consumes significant amounts of valuable non volatile memory space that RAID disk controllers could utilize for caching.
Another problem that arises in high availability disk subsystems is in backing up and checkpointing data, especially in the form of on-line databases. In the prior art, checkpointing and backing up on-line databases typically adversely impacts performance during the time that the backup is being taken. One reason for this is that user accesses to the database must be overlapped with backup accesses to it. Also, it is difficult to get a completely consistent snap-shot copy of an on-line database without shutting down access to the database while the copy is being made. It would thus also be advantageous to provide a mechanism for checkpointing, snap shotting, and backing up files, most particularly online databases, without negatively impacting performance during the time that the backup is being taken.
For these reasons, and for other reasons that will become apparent in this disclosure, an improved method of mirroring disks and databases is beneficial.
FIG. 1 is a block diagram illustrating a General Purpose Computer 20 in a data processing system. The General Purpose Computer 20 has a Computer Processor 22, and Memory 24, connected by a Bus 26. Memory 24 is a relatively high speed machine readable medium and includes Volatile Memories such as DRAM, and SRAM, and Non-Volatile Memories such as, ROM, FLASH, EPROM, and EEPROM. Also connected to the Bus are Secondary Storage 30, External Storage 32, output devices such as a monitor 34, input devices such as a keyboard 36 (with mouse 37), and printers 38. Secondary Storage 30 includes machine-readable media such as hard disk drives (or DASD) and disk sub-systems. External Storage 32 includes machine-readable media such as floppy disks, removable hard drives, magnetic tapes, CD-ROM, and even other computers, possibly connected via a communications line 28. The distinction drawn here between Secondary Storage 30 and External Storage 32 is primarily for convenience in describing the invention. As such, it should be appreciated that there is substantial functional overlap between these elements. Computer software such as data base management software, operating systems, and user programs can be stored in a Computer Software Storage Medium, such as memory 24, Secondary Storage 30, and External Storage 32. Executable versions of computer software 33, can be read from a Non-Volatile Storage Medium such as External Storage 32, Secondary Storage 30, and Non-Volatile Memory and loaded for execution directly into Volatile Memory, executed directly out of Non-Volatile Memory, or stored on the Secondary Storage 30 prior to loading into Volatile Memory for execution.
FIG. 2 is a block diagram illustrating a data processing system 20 having a computer 21 and a disk subsystem 40 with two RAID disk controllers 53, 54 to communicate with and control four disk drives 61, 62, 63, 64. The disk subsystem 40 comprises two RAID disk controllers 53, 54 coupled to four disk drives 61, 62, 63, 64. The computer 21 is bidirectionally coupled 44 to each of the two RAID disk controllers 53.54. Each of the two RAID disk controllers 53, 54 is bidirectionally coupled to 45 and controls each of the four disk drives 61, 62, 63, 64. The two RAID disk controllers 53, 54 are also bidirectionally coupled together 46. This is illustrative of a sample high availability disk subsystem.
In the preferred embodiment, the coupling between the computer 21 and RAID disk controller 53, 54, comprises one or more channels. Two common types of channels 44 in use today are SCSI and Fibre Channel. Other types of computer systems, interconnections, and channels are within the scope of this invention. Note that in a typical case, the computer 21 will typically be multiply connected 44 to each of the two RAID disk controllers 53, 54. Note also that large scale data processing systems will typically include more than one disk subsystem and more than four disks. Additionally, each such RAID disk controller 53, 54 will often communicate with and control more than the four disk drives 61, 62, 63, 64 shown. Two common types of connections in use today are SCSI and Fibre Channel. However, other types of connections are also within the scope of this invention.