Present-day hunger for data and data storage has given rise to computing complexes in which multiple data processing systems have access to a single data warehouse that is often implemented by a complex of disk storage units. The growth of Internet use has fed this hunger, and added the requirement that the data be continuously available. In order to achieve this latter requirement, many database complexes and data warehouses resort to such techniques as "mirroring" (i.e., using redundant storage to maintain a copy of everything written to the main storage element), error correction of various types, and the like. Redundant arrays of independent (or inexpensive) disks (RAID) are one example. Certain RAID configurations ("levels") use data striping (spreading out blocks of each file across multiple disks) in order to protect the data, correcting errors when encountered, but redundancy is not used. This improves performance, but does not deliver fault tolerance. Other RAID levels (e.g., level 1) provide disk mirroring to add data redundancy and thereby fault tolerance.
While these techniques operate well to provide a measure of fault tolerance and, therefore, some continuous availability of the stored data, they can be overloaded when facilities employing these techniques must respond to a large volume of requests for data.
Thus, although continuous availability is now a feature of many database complexes and data warehouse configurations, they still can present a performance impediment by limiting the number of accesses that can be made at any particular moment in time.
The present invention is directed to a disk storage system with a storage control unit capable of receiving and simultaneously responding to multiple input/output (I/O) read requests from multiple users of the storage system.
Broadly, the invention is a disk storage system in which the storage control unit operates to control data transfers (i.e., reads and writes) between a number of host systems and a physical storage formed by a number of disk storage units. The storage control unit is preferably constructed to include multiple processor units (i.e., microprocessors), providing a platform that allows multiple processes to handle a number of simultaneous data transfers between the physical storage and the host systems. The control unit includes memory in which are maintained data structures that implement "logical" storage, comprising a number of logical storage units to which I/O requests, both reads and writes, are made by the host systems. Each logical storage unit has a designated corresponding physical storage area in the physical storage. Data is written to a predetermined one of the logical storage units (the "master" logical storage unit), and to its corresponding physical storage area. That data is also copied to the other logical storage units (the "slave" logical storage units), and through them to their corresponding physical storage areas. Thereby, multiple copies of the data are made available.
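The write path described above can be illustrated with a minimal sketch. It is not the control unit's actual data structures: the names LogicalUnit, StorageControlUnit, and physical_area are hypothetical, and each physical storage area is modeled as a simple in-memory dictionary keyed by block number.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalUnit:
    """Hypothetical logical storage unit with one designated physical area."""
    name: str
    # Assumption: the physical storage area is modeled as block number -> data.
    physical_area: dict = field(default_factory=dict)


class StorageControlUnit:
    """Hypothetical control unit holding one master and several slave units."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)

    def write(self, block, data):
        # Data is written to the master logical unit and its physical area...
        self.master.physical_area[block] = data
        # ...and is then copied to every slave logical unit's physical area.
        for slave in self.slaves:
            slave.physical_area[block] = data

    def units(self):
        # All logical units that hold a copy and can therefore serve reads.
        return [self.master, *self.slaves]


scu = StorageControlUnit(LogicalUnit("LU0"), [LogicalUnit("LU1"), LogicalUnit("LU2")])
scu.write(7, b"payload")
```

After the write, every logical unit returned by units() holds the same block, which is what allows a later read to be served from any of the copies.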
According to one embodiment of the invention, I/O read requests from the host systems are received and assigned to one of the logical storage units and, thereby, to the corresponding physical disk storage. Selection of a logical storage unit for assignment of an I/O read request is preferably made in a manner that distributes I/O read requests among the logical storage units and, thereby, the corresponding physical storage areas. For example, selection may be made on a round-robin basis, or any other basis that achieves a desired distribution among the logical storage units. In this manner, read requests are distributed over the areas of physical storage containing the multiple copies of the data requested.
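As an illustration of the round-robin selection mentioned above, the following sketch cycles through the logical storage units in a fixed order. The class name RoundRobinSelector and the string identifiers for the logical units are assumptions made for the example only.

```python
from itertools import cycle


class RoundRobinSelector:
    """Hypothetical selector that assigns read requests in rotating order."""

    def __init__(self, logical_units):
        units = list(logical_units)
        if not units:
            raise ValueError("at least one logical storage unit is required")
        # cycle() repeats the units endlessly in their original order.
        self._rotation = cycle(units)

    def assign(self, read_request):
        # The request content does not influence the choice in this scheme;
        # each call simply returns the next logical unit in the rotation.
        return next(self._rotation)


selector = RoundRobinSelector(["LU0", "LU1", "LU2"])
assignments = [selector.assign(f"read-{i}") for i in range(5)]
```

Because the choice ignores the current load, this scheme spreads requests evenly only when requests cost roughly the same to serve.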
In an alternate embodiment of the invention, I/O read requests are not immediately assigned to a logical unit. Rather, all the logical units are mapped to their matching physical disks maintaining the copies of the data of the I/O read request, and those physical disk storage areas are examined for selection. For example, the physical disk with the smallest number of pending requests may be selected, and the I/O read request is assigned to the logical storage unit corresponding to that selected physical storage.
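The load-based selection of this embodiment might be sketched as follows. The function name select_least_loaded and the two dictionaries are hypothetical: unit_to_disk stands in for the mapping from logical units to physical disks, and pending stands in for whatever per-disk count of outstanding requests the control unit keeps.

```python
def select_least_loaded(unit_to_disk, pending):
    """Return the logical unit whose physical disk has the fewest pending requests.

    unit_to_disk: hypothetical mapping of logical unit name -> physical disk name.
    pending: hypothetical mapping of physical disk name -> pending request count.
    """
    if not unit_to_disk:
        raise ValueError("at least one logical storage unit is required")
    # Every logical unit is mapped to its disk and the disks are compared;
    # a disk with no entry in `pending` is treated as idle (count 0).
    return min(unit_to_disk, key=lambda unit: pending.get(unit_to_disk[unit], 0))


unit_to_disk = {"LU0": "disk-a", "LU1": "disk-b", "LU2": "disk-c"}
pending = {"disk-a": 4, "disk-b": 1, "disk-c": 6}
chosen = select_least_loaded(unit_to_disk, pending)  # "LU1" for these counts
```

When two disks tie, min() returns the first candidate in the mapping's iteration order; a real controller could break ties differently.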
In a further embodiment of the invention, when an I/O read request is received, a number of the logical units, fewer than all, are mapped to their corresponding physical storage containing the requested data. The physical disks making up that physical storage are then reviewed to select, for example, the one with a small backlog of pending requests, and the I/O read request is assigned to the logical storage unit corresponding to the selected physical storage.
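A sketch of this subset-based variant follows. The text above does not say how the subset of logical units is chosen, so random sampling is used here purely as an assumption; the function name select_from_subset, the parameter k, and the two dictionaries are likewise hypothetical.

```python
import random


def select_from_subset(unit_to_disk, pending, k, rng=None):
    """Examine only k logical units and return the least loaded of them.

    unit_to_disk: hypothetical mapping of logical unit name -> physical disk name.
    pending: hypothetical mapping of physical disk name -> pending request count.
    k: number of logical units to examine (fewer than all, per the embodiment).
    rng: optional random.Random instance, injectable for repeatable runs.
    """
    units = list(unit_to_disk)
    if not 1 <= k <= len(units):
        raise ValueError("k must be between 1 and the number of logical units")
    rng = rng or random.Random()
    # Assumption: the subset is drawn at random without replacement.
    candidates = rng.sample(units, k)
    # Only the disks behind the sampled units are compared.
    return min(candidates, key=lambda unit: pending.get(unit_to_disk[unit], 0))


unit_to_disk = {"LU0": "disk-a", "LU1": "disk-b", "LU2": "disk-c"}
pending = {"disk-a": 5, "disk-b": 1, "disk-c": 9}
chosen = select_from_subset(unit_to_disk, pending, k=2)
```

Examining fewer disks lowers the cost of each selection, at the price of sometimes missing the globally least-loaded disk. With the counts shown and k=2, the most heavily loaded unit is never chosen, since any pair of candidates contains a less loaded alternative.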
A number of advantages should now be evident to those skilled in this art. Rather than forming a bottleneck by having only a single data stream from a storage facility to multiple hosts, the storage system of the present invention provides multiple, parallel data paths between multiple processors and physical storage. This, in turn, provides almost instantaneous access to data. In addition, fault tolerance and continuous availability are provided.
These and other advantages and aspects of the invention will become apparent to those skilled in the art upon reading the following description of the specific embodiments, which should be taken in conjunction with the accompanying drawings.