1. Field of the Invention
This invention generally relates to data processing systems and more specifically to data storage facilities for use in such data processing systems.
2. Description of Related Art
Early data processing systems comprised a single processor, random access memory and a data storage facility in the form of a single magnetic disk drive. Such systems are still in wide use by small businesses and individuals and as terminals or nodes in a network. The capacities of the single magnetic disk drive associated with such systems are now into the hundred-gigabyte (i.e., 100*10^9 bytes) range. However, there are many applications in which even these increased capacities no longer are sufficient.
Increased storage capacities required by multi-processing systems with multiple access and increased database sizes have been realized by the development of data storage facilities with disk array storage devices. Concurrently with this development, a need has also arisen to attain redundancy in the data for data integrity purposes. Consequently there now are many applications that require disk storage facilities having terabyte (i.e., 10^12 bytes) and even multiple terabyte storage capacities.
Disk array storage devices have become available from the assignee of this invention and others with such capacities. These systems include a connection to a host system that may include one or more processors and random access memory. Data transfer requests, which include data read and data write requests, are received in an interface or host adapter in the data storage facility and processed into commands that the data storage facility recognizes. These systems use cache memory to enhance operations. A cache memory serves as an intermediate data repository between the physical disk drives and the host systems. Cache memories can reduce the time a data storage facility requires to complete a data read or write operation by returning requested data or by receiving data being sent to the data storage facility.
Such data storage facilities are generally characterized by having a single bus structure that interconnects the physical disk drives, the cache memory and the host adapter. All data commands and all data transfers must pass over this single path. As pressure for greater data storage capacity and higher transfer rates continues to grow, the single data path can become a bottleneck. To overcome this bottleneck, some data processing systems now incorporate multiple independent disk array storage devices connected to a single host system. Others incorporate multiple disk array storage devices with multiple host systems.
As these data storage facilities have evolved, so have a number of important characteristics or functional specifications, particularly data redundancy and data coherency. Data redundancy addresses two potential problems. Redundancy at a site overcomes a problem of equipment failure. For example, if data redundancy at a site is achieved by mirroring, two or more separate physical disk drives replicate data. If one of those disk drives fails, the data is available at another physical disk drive. Replicating a disk array storage device at a geographically remote site and storing a copy of the data at each site can also achieve data redundancy. This type of data redundancy overcomes the problem of data loss due to destruction of the equipment at one site because the data at the other site is generally preserved.
Data coherency assures that the data at different locations within one or more disk storage facilities is synchronized temporally. That is, if data in a set is stored across two or more separate data storage facilities, at any given instant the data in any one data storage facility should be coherent with the data in the other storage facility. Data could become non-coherent, for example, if a pathway from a host to one of the data storage facilities were to be interrupted without promptly terminating transfers to another related data storage facility.
Generally, a customer initially purchases a disk array storage device with a base data storage facility supplied with a number of magnetic disk drives that provide an initial storage capacity. Often this number of drives is less than the maximum number that the device can support. An incremental increase in the total storage capacity can be achieved merely by adding one or more magnetic disk drives to the existing disk array storage device, generally at an incremental cost. However, when it becomes necessary to expand the capacity beyond the maximum capacity of the disk array storage device, it may become necessary to purchase a new base disk array storage device. The cost of this new base disk array storage device, even with a minimal storage capacity, will be greater than the incremental costs incurred by merely adding magnetic disk drives to the existing disk array storage device. The customer may also incur further programming and reconfiguration costs to integrate the new disk array storage device with the existing disk array storage device.
In many applications, additional capacity is concomitant with a need for greater throughput. However, all the read and write operations for such a disk array storage device continue to involve a single cache memory. Although the cache memory might be expanded, its throughput, measured in the possible number of accesses per unit time, does not increase. In these situations, the capacity increases, but at a reduction in performance as greater rates of read and write operations are encountered. As a result, scaling such disk array storage devices becomes difficult. When such performance problems are anticipated, the usual approach is to add an entirely separate disk array storage device to the data processing system and then to deal with the coordination and coherency issues that may arise.
What is needed is a data storage facility that achieves all the foregoing specifications. That is, what is needed is a data storage facility that provides full redundancy with no single point of failure in the system. Such a data storage facility should be scalable both in terms of the number of host systems that can connect to it and in terms of its total capacity. The data storage facility should provide a fully redundant distributed cache memory to provide load balancing and fault tolerance for handling data in the cache memory. Such a facility should be constructed from readily available components with common features for manufacturing and cost efficiency and for limiting the need for spare components necessary to ensure reliability. Still further, the facility should operate with throughput that is relatively independent of actual storage capacity and the number of host systems connected to that data storage facility.
Therefore it is an object of this invention to provide a high-performance, distributed cache data storage facility that is scalable to large data storage capacities.
Another object of this invention is to provide a distributed cache, scalable data storage facility that is fully redundant.
Still another object of this invention is to provide a distributed cache, scalable data storage facility that can be scaled both with respect to the number of host systems it serves and the capacity of the storage facility.
Still another object of this invention is to provide a distributed cache, scalable data storage facility that is constructed of readily available components having a common design, for manufacturing and cost efficiency and for reliability.
In accordance with this invention a data storage facility operates with a plurality of data processors, each of which can issue a host request for performing a data transfer with the data storage facility. The data storage facility comprises a plurality of persistent data storage locations at unique addresses in a common address space and control logic for transferring data to and from the addressed locations. A plurality of processor-controlled data handling nodes respond to a host data transfer request for identifying a specific data storage location. The processor-controlled data handling nodes also include cache memory storage at cache memory locations for that data identified in the host request. Processor-controlled cache tag controller nodes maintain cache tags that identify a specific cache memory location for a data storage location. A first multi-path connection interconnects the data handling and cache tag controller nodes. A second multi-path connection interconnects the plurality of the storage locations and cache memory locations.
In accordance with another aspect of this invention, a data storage facility operates in response to host requests from one or more data processors. The data storage facility includes I/O nodes and cache nodes. The cache nodes comprise cache memory locations. A cache tag controller node contains status information about each entry in the cache memory locations. An I/O node responds to a host request by converting an address in the host request into an address for a specific storage location in the plurality of data storage locations. The cache tag controller converts the address for the data storage location into the address of a cache tag location and a cache memory location. A first multi-path connection interconnects the I/O, cache and cache tag controller nodes. A second multi-path connection interconnects the plurality of the storage locations and cache nodes.
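The two address conversions described in this aspect can be illustrated with a minimal sketch. The text does not specify how either conversion is computed, so everything concrete here is an assumption made for illustration: the names `HostRequest`, `io_node_translate` and `tag_controller_translate`, the fixed device size, the number of cache tag controller nodes, and the modulo-based mapping are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

# All sizes and mapping rules below are illustrative assumptions,
# not values taken from the description.
BLOCKS_PER_DEVICE = 1_000    # hypothetical size of each logical device, in blocks
NUM_TAG_CONTROLLERS = 4      # hypothetical number of cache tag controller nodes
TAGS_PER_CONTROLLER = 256    # hypothetical number of tag locations per controller


@dataclass(frozen=True)
class HostRequest:
    device: int  # logical device named by the host
    block: int   # block offset within that device


def io_node_translate(req: HostRequest) -> int:
    """I/O node step: host address -> address of a specific storage location."""
    if req.device < 0 or not 0 <= req.block < BLOCKS_PER_DEVICE:
        raise ValueError("host address out of range")
    return req.device * BLOCKS_PER_DEVICE + req.block


def tag_controller_translate(storage_addr: int) -> Tuple[int, int]:
    """Cache tag controller step: storage address -> (controller, tag location)."""
    controller = storage_addr % NUM_TAG_CONTROLLERS
    tag_location = (storage_addr // NUM_TAG_CONTROLLERS) % TAGS_PER_CONTROLLER
    return controller, tag_location
```

Under these assumed parameters, a request for block 5 of device 2 maps to storage address 2005, which in turn maps to tag location 245 on controller 1. Spreading consecutive addresses across controllers is one possible way to balance load; the description itself does not commit to any particular distribution.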
In accordance with still another aspect of this invention, data transfers occur with a data storage facility in response to a host request generated by a data processor. The data storage facility has a first plurality of persistent data storage locations. The facility establishes a second plurality of cache memory locations and cache tag locations, the cache tag locations being adapted to store cache tags with status information about a corresponding cache memory location. The facility responds to an I/O request by converting its address into an address for a specific location, among the first plurality of data storage locations, in a common address space of the data storage facility. The data storage facility also converts this common address space address into an address for a cache tag location. The cache tag is tested to determine the presence of a cache memory location that corresponds to the location in the host request. A transfer of data with the corresponding cache memory location is initiated for predetermined values of the corresponding status information.
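The tag test and the conditional transfer in this aspect can be sketched for the read case as follows. This is a simplified illustration under stated assumptions, not the claimed method: the `Facility`, `CacheTag` and `TagStatus` names are hypothetical, only two status values are modeled, tags are keyed directly by common address rather than by a separately computed tag location, and cache eviction is omitted.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional, Tuple


class TagStatus(Enum):
    INVALID = 0  # assumed status: no cache memory location holds this data
    VALID = 1    # assumed status: a cache memory location holds a current copy


@dataclass
class CacheTag:
    status: TagStatus = TagStatus.INVALID
    cache_slot: Optional[int] = None  # index of the corresponding cache location


class Facility:
    """Hypothetical single-process model of the read path."""

    def __init__(self, backing: Dict[int, bytes]) -> None:
        self.backing = backing                # persistent storage, by common address
        self.tags: Dict[int, CacheTag] = {}   # cache tags, keyed by common address
        self.cache: Dict[int, bytes] = {}     # cache memory locations, by slot
        self._next_slot = 0                   # naive allocator, no eviction

    def read(self, addr: int) -> Tuple[bytes, bool]:
        """Return (data, hit) for a common address space address."""
        tag = self.tags.setdefault(addr, CacheTag())
        # Test the cache tag: transfer from cache only for the VALID status.
        if tag.status is TagStatus.VALID and tag.cache_slot is not None:
            return self.cache[tag.cache_slot], True
        # Otherwise stage the data from persistent storage into a new slot.
        data = self.backing[addr]
        slot = self._next_slot
        self._next_slot += 1
        self.cache[slot] = data
        tag.status = TagStatus.VALID
        tag.cache_slot = slot
        return data, False
```

In this sketch, the first read of an address misses and stages the data into cache memory, and a second read of the same address is served from the cache memory location recorded in the tag.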