1. Field of the Invention
This invention relates generally to computer digital storage systems, and in particular to computer systems using a plurality of disk drives.
2. Description of the Related Technology
Personal computers have gained substantial popularity among individual users for both business and home use. Personal computers are now being utilized for jobs heretofore performed by mainframe computers and minicomputers. The rapidly growing popularity of personal computers may, in part, be attributed to the substantial improvement in both their speed of operation and their random access memory (RAM) capacity.
Presently, microcomputer processors may operate at 33 MHz clock rates and utilize 32 bit data and address buses to access two to 256 megabytes of RAM. In general, RAM speed has kept pace with processor operational speeds. However, bulk data storage utilizing magnetic disks has not. To maintain computer system performance, larger main memories are being used to buffer frequent disk access requirements for data retrieval. This may be a solution for applications that will fit entirely in the main memory of the computer and for which memory volatility is not important. However, applications such as transaction processing, which have a high rate of random requests for small amounts of data, or large simulations requiring massive amounts of data in excess of the main memory capacity, create serious system performance limitations when using presently available disk storage technology.
Disk storage technology relies on the performance obtainable from mechanical devices. A disk storage device comprises at least one magnetic oxide coated platter and at least one read and write head. An electric motor is connected to a spindle that causes the platter to rotate at approximately 3600 RPM. The read/write head "floats" just above the surface of the disk platter oxide coating and moves back and forth across the face of the platter perpendicular to its rotational axis. The head moves in predefined incremental distances that correspond to tracks. The disk platter is subdivided into a number of tracks that form concentric imaginary circles on the platter face. Each track consists of a number of sectors that further divide the track into contiguously joined arcs forming a 360 degree circle. These sectors pass under the head as the disk platter rotates. A sector contains multiple bytes of data. A byte of data consists of eight bits of binary information.
Data is stored as multiples of eight bit bytes in a disk sector. For example, a sector may contain 512 bytes of data; however, the sector may also contain more or fewer bytes depending on the disk system and its application. In those systems that utilize sectors of 512 bytes, if fewer than 512 bytes of data need be stored, then blank or dummy data is added to make up a full sector of data. If more than 512 bytes of data are stored, then additional sectors are used.
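The sector padding rule described above may be illustrated by the following minimal sketch. The function name `pad_to_sectors` and the use of zero bytes as the "blank or dummy data" are assumptions made for illustration only; the 512 byte sector size is taken from the example in the text.

```python
SECTOR_SIZE = 512  # bytes per sector, as in the example above


def pad_to_sectors(data: bytes, sector_size: int = SECTOR_SIZE) -> list:
    """Split data into full sectors, padding the final sector with zero bytes.

    Zero bytes stand in for the "blank or dummy data"; the actual fill
    value used by a given disk system is an assumption here.
    """
    sectors = []
    for offset in range(0, len(data), sector_size):
        chunk = data[offset:offset + sector_size]
        # ljust pads only when the chunk is shorter than a full sector
        sectors.append(chunk.ljust(sector_size, b"\x00"))
    return sectors


# 700 bytes need two sectors: one full, one holding 188 bytes plus padding
sectors = pad_to_sectors(b"a" * 700)
```

In this sketch every returned sector is exactly one sector long, so a caller never has to handle a partial sector.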
Storage capacity of disk systems varies widely depending on platter size, number of platters, type of track and sector formatting, and the precision of head positioning mechanisms. Mass production of disk systems for personal computers has created low cost and high performance disks having data storage capacities of a hundred megabytes or more. Larger capacity disk systems, utilized in minicomputer and large mainframe computer systems, typically have capacities of several thousand megabytes or more. The cost per megabyte of a thousand megabyte disk is more than twice the cost per megabyte of the mass produced disks used in personal computers. Individual disks of either type are equivalent in reliability. However, power consumption and size are much lower per megabyte for the small disk systems.
In choosing a disk system for computer applications requiring thousands of megabytes of data storage, one must evaluate two different approaches for implementation of a suitable disk system. The first and traditional approach utilizes one or more large capacity (thousands of megabytes) disks; the second utilizes a large number of smaller capacity disks. Redundant Arrays of Inexpensive Disks ("RAID"), based on magnetic disk technology developed for personal computers, offer an attractive alternative to the large capacity disks. Such arrays offer improved performance, lower power consumption and lower incremental costs for additional capacity than do the large capacity disk systems.
The problem with using a large number of small capacity drives is that disk system reliability degrades to an unacceptable mean time between failures because of the large number of drives that make up the complete disk system. The computer industry has reduced the seriousness of possible frequent failures of multiple disks by using a parity checking and data correcting system designed to operate during all data read and write operations of the disk system. Parity is used to improve the reliability and integrity of disk system data storage by determining whether data has been corrupted, and in some cases it may be used to correct the corrupted data. For example, data bytes contained in sector 1 of disk A may be combined with data bytes contained in corresponding sector 1 of disk B by calculating the exclusive OR (XOR) between each bit of all corresponding bytes and storing the results as corresponding parity bytes in a third disk used exclusively for parity. The contents of more than two data disks may be used in calculating parity.
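The XOR parity calculation described above may be sketched as follows. The function names `xor_parity` and `rebuild_missing` are hypothetical. The rebuild step relies on the general property that XOR is its own inverse, and it assumes that the identity of the failed or corrupted disk is already known; the text does not specify how that determination is made.

```python
from functools import reduce


def xor_parity(sectors):
    """Return the bytewise XOR of equally sized sectors.

    sectors: a list of bytes objects, one per data disk, all the same length.
    """
    length = len(sectors[0])
    if any(len(s) != length for s in sectors):
        raise ValueError("all sectors must be the same size")
    # zip(*sectors) yields, for each byte position, one byte from every disk
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*sectors))


def rebuild_missing(surviving_sectors, parity_sector):
    """Recover one missing data sector from the survivors and the parity."""
    return xor_parity(list(surviving_sectors) + [parity_sector])


disk_a = bytes([0b1010, 0xFF])
disk_b = bytes([0b0110, 0x0F])
parity = xor_parity([disk_a, disk_b])        # bytes([0b1100, 0xF0])
recovered_b = rebuild_missing([disk_a], parity)
```

Because the same routine serves both to generate parity and to rebuild a missing sector, the sketch extends to more than two data disks without change.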
Normally, the computer system calculates disk data parity before writing the data to the disks. However, when writing or modifying small amounts of data the disk controller must first read the disks for previously stored corresponding data not being modified in order to calculate a new parity based on the new data to be written and the existing unmodified data. The new calculated parity is written to the parity disk.
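The small write case described above may be sketched as follows, under the assumption that a stripe is represented as a list of equally sized sectors, one per data disk. The names `xor_bytes` and `small_write_parity` are hypothetical, and the list comprehension stands in for the disk reads that the controller must perform before the new parity can be calculated.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equally sized byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write_parity(stored_sectors, modified_index, new_sector):
    """Compute new parity when only one data sector is rewritten.

    stored_sectors: data sectors currently on the disks (one per data disk).
    modified_index: index of the disk whose sector is being replaced.
    new_sector:     the new data for that disk.
    """
    # These sectors are not being modified, so they must first be read back
    unmodified = [s for i, s in enumerate(stored_sectors) if i != modified_index]
    parity = new_sector
    for sector in unmodified:
        parity = xor_bytes(parity, sector)
    return parity


stored = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
new_parity = small_write_parity(stored, 1, b"\xff\x00")
```

The extra reads in this sketch are the source of the additional delay for small writes: the parity cannot be computed until the unmodified data has been retrieved.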
A disk normally rotates at a speed of 3600 RPM. At this rotational speed the head passes over a particular sector every 16.67 milliseconds. Thus, the longest time required to align the head with the sector of interest would be less than or equal to 16.67 milliseconds. This delay is called rotational latency time. Data storage and retrieval latency time may be decreased by interleaving the read and write operations of a multiple disk system. Interleaving means that data is read from or written to alternate disks.
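The 16.67 millisecond figure follows directly from the rotational speed, as the short calculation below illustrates. The function name `revolution_time_ms` is hypothetical.

```python
def revolution_time_ms(rpm: float) -> float:
    """Time for one full platter revolution, in milliseconds."""
    # 60,000 milliseconds per minute divided by revolutions per minute
    return 60_000.0 / rpm


# Worst case rotational latency is one full revolution
worst_case_ms = revolution_time_ms(3600)
print(f"{worst_case_ms:.2f} ms")
```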
When interleaving data operations between multiple disks, a first block of data is written to or read from disk A, a second block of data is handled by disk B, a third block of data is handled by disk C, a fourth block of data is handled by disk D, etc. After all data disks are so utilized, the interleaving cycle starts anew with the first disk A. Interleaving reduces the time required to transfer blocks of data because disk operations may be performed during the rotational latency times of the multiple disk system. For example, writing 10,240 bytes of data onto two disks, each disk having 512 byte sectors, would require the use of 10 sectors from each disk. The first block of data containing 512 bytes is written to or read from the first disk. The next block of data is written to or read from the second disk, and subsequent blocks of data containing 512 bytes each are alternately transferred to or from each disk.
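The interleaving pattern described above amounts to a round robin assignment of blocks to disks, which may be sketched as follows. The function names `locate_block` and `interleave` are hypothetical, and the sketch assumes that the data length is a multiple of the sector size.

```python
def locate_block(block_index: int, num_disks: int):
    """Map a block number to (disk number, sector number on that disk)."""
    return block_index % num_disks, block_index // num_disks


def interleave(data: bytes, num_disks: int, sector_size: int = 512):
    """Distribute sector sized blocks across disks in round robin order."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), sector_size):
        disk, _sector = locate_block(offset // sector_size, num_disks)
        disks[disk].append(data[offset:offset + sector_size])
    return disks


# The example from the text: 10,240 bytes on two disks with 512 byte sectors
disks = interleave(bytes(10_240), num_disks=2)
sectors_per_disk = [len(d) for d in disks]   # 10 sectors on each disk
```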
When data blocks in excess of 512 bytes are to be written to the disks, the disk controller may buffer an amount of data limited only by the amount of buffer or cache memory available in the controller. The disk controller writes this buffered data to the appropriate sectors of each disk. The controller must wait for the correct sectors to align with the write head before data can be transferred. When transferring data to a disk, a latency time of one revolution of the disk platter may result if the sector to be read or written has just passed the read/write head.
When using multiple disks in a disk storage system, synchronization of rotational speed and sector position may be accomplished by phase-lock-loop control, which is well known in the art of disk systems. Phase-lock-loop control of the rotational speed and position allows rotational alignment of the corresponding sectors of the drives. Thus, sector 5 of disk 1 may be read or written at the same time as sector 5 of disk 2. Disk synchronization averages rotational latency to one half a revolution, or 8.33 milliseconds, instead of the possibility of a full revolution latency of 16.67 milliseconds.
Rotational synchronization of all disks also allows simultaneous read or write operations of data and parity for a stripe of data. A stripe comprises the corresponding sectors of data and parity contained on all disks of the system. Thus, from the above example, stripe 5 comprises the data found on sector 5 of each data disk and the sector 5 parity from the parity disk.
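The stripe concept may be illustrated by the following sketch, in which each disk is modeled as a list of sectors indexed by sector number. The names `get_stripe` and `stripe_is_consistent` are hypothetical, and the consistency check assumes XOR parity over equally sized sectors.

```python
def get_stripe(data_disks, parity_disk, n):
    """Return (data sectors, parity sector) making up stripe n."""
    return [disk[n] for disk in data_disks], parity_disk[n]


def stripe_is_consistent(data_sectors, parity_sector):
    """Check that the stored parity equals the XOR of the data sectors."""
    computed = bytearray(len(parity_sector))
    for sector in data_sectors:
        for i, byte in enumerate(sector):
            computed[i] ^= byte
    return bytes(computed) == parity_sector


disk_a = [b"\x01", b"\x02"]
disk_b = [b"\x03", b"\x04"]
parity_disk = [b"\x02", b"\x06"]

data, parity = get_stripe([disk_a, disk_b], parity_disk, 1)
ok = stripe_is_consistent(data, parity)
```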
When less than a full stripe of data is to be stored, the disk controller must read the data in the stripe not being modified so that the unmodified stripe data and the new data may be utilized to calculate the new stripe parity. However, the corresponding parity disk sector has already passed by the disk head during this read and parity calculation operation. Thus, the new stripe parity cannot be written to the corresponding parity disk sector until the next disk revolution or 16.67 milliseconds after the parity is calculated.
The time latency of a synchronized multiple disk system write operation averages only one half revolution, or 8.33 milliseconds, whereas writing parity in a read-modify-write operation requires an additional revolution, or 16.67 milliseconds. This additional parity write latency time is unacceptable. Present disk systems using the above techniques simply ignore this poor write performance, hoping that the excessive write time latency will be made up by good read performance.
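The penalty may be put in rough numerical terms as follows. This is a simplified model that merely adds the average half revolution latency to the additional full revolution required for the parity write; it ignores seek time, transfer time and parity calculation time, and the function names are hypothetical.

```python
def revolution_ms(rpm: float) -> float:
    """Time for one platter revolution, in milliseconds."""
    return 60_000.0 / rpm


def average_write_latency_ms(rpm: float) -> float:
    """Average rotational latency of a synchronized write: half a revolution."""
    return revolution_ms(rpm) / 2


def read_modify_write_latency_ms(rpm: float) -> float:
    """Half a revolution on average, plus one more revolution to write parity."""
    return average_write_latency_ms(rpm) + revolution_ms(rpm)


full_stripe = average_write_latency_ms(3600)       # about 8.33 ms
partial_stripe = read_modify_write_latency_ms(3600)  # about 25 ms
```

Under these assumptions, a partial stripe write incurs roughly three times the rotational latency of a full stripe write.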