This invention pertains generally to an input/output processing structure and method for computer systems having a plurality of processing resources, and more particularly to a multiple level cache structure and multiple level caching method that distributes input/output processing loads including caching operations between the plurality of processors to provide higher performance input/output processing, especially in a server environment.
In some conventional I/O processing systems, such as those made by Mylex Corporation of Fremont, Calif. (and Boulder, Colorado), a first processor is generally dedicated to running the application code, while a second processor is used as a dedicated XOR engine. The XOR processor (XOR engine) performs the exclusive-or ("XOR") calculation associated with parity computations.
In Redundant Array of Independent Disks (RAID) terminology, a RAID stripe is made up of all the cache lines which will be stored on the data disks plus the parity disk. A data stripe includes all of the cache lines which are stored on the data disks, excluding the parity disk. To compute parity, all of the cache lines which make up a data stripe are XORed together. There are numerous known alternative XOR configurations and XOR processors or engines that accomplish this XOR operation. (See, for example, "Error Control Systems for Digital Communication and Storage," Stephen B. Wicker, Prentice Hall, Englewood Cliffs, N.J. 07632, 1995, ISBN 0-13-200809-2, herein incorporated by reference, for theoretical foundations of error detecting and error correcting coding schemes, including XOR computations.)
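By way of illustration only, the parity computation described above can be sketched in a few lines of Python. The function name `xor_parity`, the 3+1 stripe, and the 4-byte cache lines are assumptions chosen for brevity; they are not taken from any particular controller.

```python
from functools import reduce


def xor_parity(cache_lines):
    """XOR equal-length cache lines byte by byte and return the result."""
    if not cache_lines:
        raise ValueError("a data stripe needs at least one cache line")
    length = len(cache_lines[0])
    if any(len(line) != length for line in cache_lines):
        raise ValueError("all cache lines in a stripe must have the same length")
    # zip(*cache_lines) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*cache_lines))


# Hypothetical data stripe of three 4-byte cache lines (a 3+1 group).
data_stripe = [
    bytes([0x0F, 0xAA, 0x00, 0x01]),
    bytes([0xF0, 0x55, 0xFF, 0x02]),
    bytes([0x33, 0x0F, 0x3C, 0x04]),
]
parity = xor_parity(data_stripe)

# XORing the surviving data lines with the parity line rebuilds a lost line.
rebuilt = xor_parity([data_stripe[0], data_stripe[2], parity])
```

Because XOR is its own inverse, the same routine that generates parity also reconstructs a missing cache line from the remaining lines and the parity line.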
In the case of storage controllers for RAID storage systems, the application code is the RAID code, which is responsible for managing the data movement from the host interface to the disk storage. This conventional architecture is shown in FIG. 1, which presents a high-level architectural diagram of a controller system 101 (for example, a Mylex Corporation DAC960SF, hereinafter also referred to as the "SF" system). In a system 101 design such as this, all of the host's data is cached in RAM 110, which is coupled via a memory interface 112 to the XOR Processor 108. We refer to this XOR processor RAM as the cache memory 110. The RAM associated with Application processor 102 via a second memory interface 106 is referred to as the Control Store 104.
System 101 also includes a primary bus 114, such as a PCI bus, interfacing between the XOR processor 108 and Application processor 102 and host interface side component Fiber Chips 122, 124. Note that in the FIG. 1 system, the XOR processor and Application processor may be the same unit. System 101 further includes a secondary bus 116, such as a PCI bus, interfacing the XOR processor 108 and Application processor 102 with disk side components SCSI Chips 126, 128, 130, 132, 134, 136. The SCSI chips provide support for data storage subsystems such as individual disk drives 138 or RAID disk subsystems 140.
The problem with attempting to design a high-bandwidth controller using this architectural model is that the system 101 becomes bottlenecked at the interface 112 between the XOR processor 108 and the cache memory 110, which has heretofore been limited to a speed of about 133 MB per second. Even if and when faster memory interfaces are developed, the architecture remains limited, because all cached data must cross this single interface several times.
We now describe a typical conventional RAID controller configuration relative to system 101, illustrated in FIG. 1. If one considers a standard RAID 5 type write operation to a 7+1 group, one can calculate the theoretical maximum bandwidth of the system 101 controller design. The "7+1" refers to seven data disks and a single disk allocated to storing parity information. For bandwidth limited write operations, we assume for purposes of this calculation that an entire RAID stripe is written at one time. Thus, for 7n writes from the host (that is, for a complete stripe write operation, n being the size of the cache data line), 7n stores are done into the cache RAM 110, 7n reads are performed from the cache RAM 110 to generate parity, a single n write is performed into the cache RAM 110 for the parity data, and finally 8n reads are performed from the cache RAM 110 to write the host and parity data to disk on the disk side 120. So, a single stripe write from the host actually requires 23n memory operations (7n data stores + 1n parity store + 7n reads + 8n reads) across the XOR memory interface 112.
For a 133 MB per second conventional interface handling 23n memory operations, the value of n is about 5.78 MB/second, which limits the total bandwidth for the host interface to about 40.48 MB/second. The host interface bandwidth is 7n, that is, the rate of host writes that drives the memory interface 112 to its full 23n usage. This number assumes a theoretical 100 percent efficiency for the memory interface 112, while typical actual maximum throughput is about 19 MB/second, yielding a real-world efficiency of only about 47 percent. This level of system throughput may not be adequate for state-of-the-art systems in the future.
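The arithmetic above can be reproduced with the following short Python sketch, offered only as a check on the figures. The function names are illustrative, and generalizing from 7 data disks to an arbitrary count is an assumption that follows the same operation tally used in the text.

```python
def stripe_write_memory_ops(data_disks):
    """Cache-memory operations per full-stripe write, in units of n."""
    host_stores = data_disks      # host data stored into the cache RAM
    parity_reads = data_disks     # data read back to generate parity
    parity_store = 1              # parity written into the cache RAM
    disk_reads = data_disks + 1   # data plus parity read out toward the disks
    return host_stores + parity_reads + parity_store + disk_reads


def max_host_bandwidth(interface_mb_s, data_disks):
    """Theoretical host bandwidth (MB/s) at 100 percent interface efficiency."""
    n = interface_mb_s / stripe_write_memory_ops(data_disks)
    return data_disks * n


ops = stripe_write_memory_ops(7)    # memory operations for a 7+1 group
n = 133 / ops                       # share of the interface per unit n, MB/s
host = max_host_bandwidth(133, 7)   # theoretical host-side limit, MB/s
efficiency = 19 / host              # observed 19 MB/s against the limit

print(f"{ops}n operations, n = {n:.2f} MB/s, host limit = {host:.2f} MB/s, "
      f"efficiency = {efficiency:.0%}")
```

For a 7+1 group this yields 23n operations, n of about 5.78 MB/second, a host limit of about 40.48 MB/second, and an efficiency of about 47 percent, matching the values stated above.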
Therefore, there remains a need for a solution to the data management problem in the controller to address bandwidth, throughput, and other limitations in architecture and operational method.
These and other problems and limitations are solved by the structure and method of the invention.
This invention provides a multiple level cache structure and multiple level caching method that distributes input/output processing loads including caching operations between the plurality of processors to provide higher performance input/output processing, especially in a server environment. In one aspect, the invention provides a method of achieving optimal data throughput by taking full advantage of multiple processing resources (either processors or controllers, or a combination of the two) in a system. In a second aspect, the invention provides a method for managing the allocation of the data caches in such a way as to optimize the host access time and parity generation. In a third aspect, the invention provides a cache allocation for RAID stripes guaranteed to provide the fastest access times for the exclusive-OR (XOR) engine by ensuring that all cache lines are allocated from the same cache level. In a fourth aspect, the invention provides for the allocation of cache lines for RAID levels which do not require parity generation, such cache lines being allocated in such manner as to maximize utilization of the memory bandwidth to the host interface. In a fifth aspect, the invention provides parity generation which is optimized for the use of whichever processor is least utilized at the time the cache lines are allocated, thereby providing for dynamic load balancing amongst the multiple processing resources available in the system. In a sixth aspect, the invention provides an inventive cache line descriptor which includes enhancements over other conventional approaches for maintaining information about which cache data pool the cache line resides within. In a seventh aspect, the invention provides an inventive cache line descriptor which includes enhancements to allow for movement of cache data from one cache level to another.
In an eighth aspect, the invention provides a cache line descriptor that includes enhancements for tracking the cache within which RAID stripe cache line siblings reside. System, apparatus, and methods to support these aspects alone and in combination are also provided.
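Purely as a hypothetical sketch, and not as the descriptor layout or allocation procedure of the invention, the third, fifth, sixth, and eighth aspects might be pictured in Python as follows. Every name here (`CacheLineDescriptor`, `allocate_stripe`, the field names, and the `utilization` map) is an assumption introduced for illustration, as is the simplification that each cache level is attached to exactly one processor.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CacheLineDescriptor:
    """Hypothetical descriptor; field names are illustrative only."""
    line_id: int
    cache_level: int                  # cache data pool holding this line
    stripe_id: Optional[int] = None   # RAID stripe the line belongs to, if any
    # Maps each sibling line in the same stripe to the cache level it resides in.
    sibling_levels: Dict[int, int] = field(default_factory=dict)


def allocate_stripe(stripe_id: int, line_ids: List[int],
                    utilization: Dict[int, float]) -> List[CacheLineDescriptor]:
    """Place every line of a stripe in the cache level of the least utilized processor.

    Assumption: `utilization` maps a cache level to the load (0.0 to 1.0)
    of the single processor attached to that level.
    """
    level = min(utilization, key=utilization.get)
    return [
        CacheLineDescriptor(
            line_id=lid,
            cache_level=level,
            stripe_id=stripe_id,
            sibling_levels={other: level for other in line_ids if other != lid},
        )
        for lid in line_ids
    ]


# Hypothetical two-level system in which the level-1 processor is less busy.
descriptors = allocate_stripe(stripe_id=0, line_ids=[10, 11, 12],
                              utilization={0: 0.80, 1: 0.35})
```

In this sketch all three lines land in cache level 1, so an XOR engine attached to that level could read the whole stripe without crossing to another cache, and each descriptor records where its siblings reside.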