1. The Field of the Invention
The invention relates to conducting efficient communications within computing systems and networks. More specifically, the invention relates to efficient messaging among redundant RAID controllers.
2. The Relevant Art
Networking has increased the need for messaging within computing environments. Data is often stored remotely and accessed by multiple computers and other electronic devices via electronic networks. A well-known technique to lower access latency and increase transfer rates is to locally store or “cache” frequently accessed data within fast local memory, thus reducing the load on relatively slow transmission channels, links, and storage devices. Caching facilitates faster access speeds by temporarily storing the data of interest on the local system or device.
Caching often results in data records and files, or portions thereof, being distributed in disparate locations. Properly updating cached data records and files is problematic; the problem is known in the art as maintaining cache coherency. Maintaining cache coherency typically involves tracking and monitoring the various cached versions in a central register or database and sending update messages to update old data at the various disparate locations. Tracking, monitoring, and updating have traditionally been expensive in that considerable processing cycles and/or specialized circuitry are required to maintain cache coherency and conduct messaging related to configuration, housekeeping, and error recovery operations.
RAID systems (i.e., systems using Redundant Arrays of Independent Disks) are used to store large quantities of data within computer and storage networks. RAID systems are designed to be fault resistant and fault tolerant by distributing data among redundant arrays of independent disks, usually with some form of error coding. RAID controllers are typically required to receive messages containing access requests and data from a host, acknowledge reception of the requests, and perform the requested transaction. To prevent a weak link within RAID systems, RAID controllers often operate in a dual active configuration where the controllers are paired and take over for each other in the event that one of the controllers fails.
Mirroring is a specific form of caching that is often conducted to maintain redundant copies and thereby facilitate recovering from system errors and failures. Mirroring is particularly desirable in active standby RAID controllers in that a standby controller must have a copy of certain segments of a failed controller's data to successfully recover from a failure and ensure that all write requests are successfully completed.
Mirroring is often an expensive and time-consuming operation. Mirroring requires extensive coordination in that update messages must be generated, received, acknowledged, and processed for every data element that is updated within a cache. The time needed to generate, receive, acknowledge, and process update messages increases a RAID system's vulnerability to unrecoverable errors.
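By way of illustration only, the generate, receive, acknowledge, and process cycle described above may be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular RAID controller: the names UpdateMessage, Controller, and mirrored_write, the use of sequence numbers to match acknowledgements, and the in-memory dictionaries standing in for controller caches are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class UpdateMessage:
    seq: int      # hypothetical sequence number used to match acknowledgements
    block: int    # hypothetical cache block address
    data: bytes   # new contents of the block


@dataclass
class Controller:
    name: str
    cache: dict = field(default_factory=dict)     # block -> data
    next_seq: int = 0
    unacked: dict = field(default_factory=dict)   # seq -> UpdateMessage

    def write(self, block: int, data: bytes) -> UpdateMessage:
        """Update the local cache and generate an update message."""
        self.cache[block] = data
        msg = UpdateMessage(self.next_seq, block, data)
        self.unacked[msg.seq] = msg   # tracked until the partner acknowledges
        self.next_seq += 1
        return msg

    def receive(self, msg: UpdateMessage) -> int:
        """Process a partner's update and return its sequence number as the ack."""
        self.cache[msg.block] = msg.data
        return msg.seq

    def acknowledge(self, seq: int) -> None:
        """Record that the partner has mirrored update number seq."""
        self.unacked.pop(seq, None)


def mirrored_write(primary: Controller, partner: Controller,
                   block: int, data: bytes) -> bool:
    """Run one full mirroring cycle; return True if nothing is left unacknowledged."""
    msg = primary.write(block, data)   # generate
    ack = partner.receive(msg)         # receive and process
    primary.acknowledge(ack)           # acknowledge
    return not primary.unacked
```

In this sketch, any entry remaining in unacked represents an update held by only one controller. The longer such entries remain outstanding, the longer the data is exposed to loss if that controller fails, which is the vulnerability noted above.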
In addition to update messages, additional messaging is required to conduct configuration, housekeeping, and error recovery operations related to data redundancy. What is needed is a low-cost, high-speed method, apparatus, and system for conducting efficient and effective messaging in distributed computer systems. Furthermore, what is particularly needed is a method and apparatus to track the transmission and reception of messages in distributed computer systems and thereby increase the throughput, efficiency, and reliability of message communications. Such an apparatus and method are particularly needed in redundant RAID controllers.