This invention relates generally to data storage systems, and more particularly to data storage systems having redundancy arrangements to protect against total system failure in the event of a failure in a component or subassembly of the storage system.
As is known in the art, large host computers and servers (collectively referred to herein as “host computer/servers”) require large capacity data storage systems. These large computer/servers generally include data processors, which perform many operations on data introduced to the host computer/server through peripherals including the data storage system. The results of these operations are output to peripherals, including the storage system.
One type of data storage system is a magnetic disk storage system. Here a bank of disk drives and the host computer/server are coupled together through an interface. The interface includes “front-end” or host computer/server controllers (or directors) and “back-end” or disk controllers (or directors). The interface operates the controllers (or directors) in such a way that they are transparent to the host computer/server. That is, data is stored in, and retrieved from, the bank of disk drives in such a way that the host computer/server merely thinks it is operating with its own local disk drive. One such system is described in U.S. Pat. No. 5,206,939, entitled “System and Method for Disk Mapping and Data Retrieval”, inventors Moshe Yanai, Natan Vishlitzky, Bruno Alterescu and Daniel Castel, issued Apr. 27, 1993, and assigned to the same assignee as the present invention.
As described in such U.S. Patent, the interface may also include, in addition to the host computer/server controllers (or directors) and disk controllers (or directors), addressable cache memories. The cache memory is a semiconductor memory and is provided to rapidly store data from the host computer/server before storage in the disk drives, and, on the other hand, store data from the disk drives prior to being sent to the host computer/server. The cache memory being a semiconductor memory, as distinguished from a magnetic memory as in the case of the disk drives, is much faster than the disk drives in reading and writing data.
The host computer/server controllers, disk controllers and cache memory are interconnected through a backplane printed circuit board. More particularly, disk controllers are mounted on disk controller printed circuit boards. The host computer/server controllers are mounted on host computer/server controller printed circuit boards. And, cache memories are mounted on cache memory printed circuit boards. The disk directors, host computer/server directors, and cache memory printed circuit boards plug into the backplane printed circuit board. In order to provide data integrity in case of a failure in a director, the backplane printed circuit board has a pair of buses. One set of the disk directors is connected to one bus and another set of the disk directors is connected to the other bus. Likewise, one set of the host computer/server directors is connected to one bus and another set of the host computer/server directors is connected to the other bus. The cache memories are connected to both buses. Each one of the buses provides data, address and control information.
The arrangement is shown schematically in FIG. 1. Thus, the use of two buses B1, B2 provides a degree of redundancy to protect against a total system failure in the event that the controllers or disk drives connected to one bus fail. Further, the use of two buses increases the data transfer bandwidth of the system compared to a system having a single bus. Thus, in operation, when the host computer/server 12 wishes to store data, the host computer/server 12 issues a write request to one of the front-end directors 14 (i.e., host computer/server directors) to perform a write command. One of the front-end directors 14 replies to the request and asks the host computer/server 12 for the data. After the request has passed to the requesting one of the front-end directors 14, the director 14 determines the size of the data and reserves space in the cache memory 18 to store the request. The front-end director 14 then produces control signals on one of the address memory buses B1, B2 connected to such front-end director 14 to enable the transfer to the cache memory 18. The host computer/server 12 then transfers the data to the front-end director 14. The front-end director 14 then advises the host computer/server 12 that the transfer is complete. The front-end director 14 looks up in a Table, not shown, stored in the cache memory 18 to determine which one of the back-end directors 20 (i.e., disk directors) is to handle this request. The Table maps the host computer/server 12 addresses into an address in the bank 22 of disk drives. The front-end director 14 then puts a notification in a “mail box” (not shown and stored in the cache memory 18) for the back-end director 20 which is to handle the request, giving the amount of the data and the disk address for the data. Other back-end directors 20 poll the cache memory 18 when they are idle to check their “mail boxes”.
If the polled “mail box” indicates a transfer is to be made, the back-end director 20 processes the request, addresses the disk drive in the bank 22, reads the data from the cache memory 18 and writes it into the addresses of a disk drive in the bank 22.
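The write path described above may be illustrated, purely as a sketch, by the following Python model. The names `CacheMemory`, `front_end_write` and `back_end_poll`, the dictionary layout of the Table, and the director identifiers are assumptions made for illustration and do not appear in the referenced system.

```python
from dataclasses import dataclass, field


@dataclass
class CacheMemory:
    """Toy stand-in for cache memory 18 (hypothetical layout)."""
    slots: dict = field(default_factory=dict)       # host address -> staged data
    mail_boxes: dict = field(default_factory=dict)  # back-end director id -> notes
    table: dict = field(default_factory=dict)       # host address -> (director id, disk address)


def front_end_write(cache, host_address, data):
    """Front-end director: stage the data in cache, then post a mail box note."""
    cache.slots[host_address] = data
    director_id, disk_address = cache.table[host_address]
    note = {"cache_address": host_address, "size": len(data), "disk_address": disk_address}
    cache.mail_boxes.setdefault(director_id, []).append(note)


def back_end_poll(cache, director_id, disk):
    """Back-end director: when idle, drain its mail box and write staged data to disk."""
    handled = 0
    for note in cache.mail_boxes.pop(director_id, []):
        disk[note["disk_address"]] = cache.slots[note["cache_address"]]
        handled += 1
    return handled
```

In this model a director whose mail box is empty simply finds nothing to do on a poll, which is the idle polling the passage describes.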
When data is to be read from a disk drive in bank 22 to the host computer/server 12, the system operates in a reciprocal manner. More particularly, during a read operation, a read request is instituted by the host computer/server 12 for data at specified memory locations (i.e., a requested data block). One of the front-end directors 14 receives the read request and examines the cache memory 18 to determine whether the requested data block is stored in the cache memory 18. If the requested data block is in the cache memory 18, the requested data block is read from the cache memory 18 and is sent to the host computer/server 12. If the front-end director 14 determines that the requested data block is not in the cache memory 18 (i.e., a so-called “cache miss”), the director 14 writes a note in the cache memory 18 (i.e., the “mail box”) that it needs to receive the requested data block. The back-end directors 20 poll the cache memory 18 to determine whether there is an action to be taken (i.e., a read operation of the requested block of data). The one of the back-end directors 20 which polls the cache memory 18 mail box and detects a read operation reads the requested data block and initiates storage of such requested data block in the cache memory 18. When the requested data block is completely written into the cache memory 18, a read complete indication is placed in the “mail box” in the cache memory 18. It is to be noted that the front-end directors 14 are polling the cache memory 18 for read complete indications. When one of the polling front-end directors 14 detects a read complete indication, such front-end director 14 completes the transfer of the requested data which is now stored in the cache memory 18 to the host computer/server 12.
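The read path, including the cache miss case, may be sketched in the same manner. The function names, the tuple format of the mail box entries, and the use of plain dictionaries for the cache and the disk are hypothetical and serve only to make the sequence of polls concrete.

```python
def front_end_read(cache, mail_box, block_id):
    """Front-end director: return the block on a cache hit; on a miss, post a request."""
    if block_id in cache:
        return cache[block_id]
    mail_box.append(("read_request", block_id))
    return None


def back_end_service_reads(cache, mail_box, disk):
    """Back-end director: on polling, copy requested blocks into cache and flag completion."""
    for message in [m for m in mail_box if m[0] == "read_request"]:
        mail_box.remove(message)
        block_id = message[1]
        cache[block_id] = disk[block_id]
        mail_box.append(("read_complete", block_id))


def front_end_poll(cache, mail_box):
    """Front-end director: on polling, collect blocks whose read has completed."""
    delivered = []
    for message in [m for m in mail_box if m[0] == "read_complete"]:
        mail_box.remove(message)
        delivered.append(cache[message[1]])
    return delivered
```

A miss therefore costs two polling rounds, one by a back-end director and one by a front-end director, before the host computer/server receives the block.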
The use of mailboxes and polling requires time to transfer data between the host computer/server 12 and the bank 22 of disk drives thus reducing the operating bandwidth of the interface.
In accordance with the present invention, a method is provided for preventing erroneous data from being stored in a memory, such DATA comprising a series of data words terminating in a Cyclic Redundancy Check (CRC). The method includes: checking the CRC of the data words while delaying the DATA from passing to an output; corrupting the delayed data words if such checking determines a CRC error, such corruption of one of the data words being performed before the data words pass to said output; detecting whether such data word at the output is corrupt; and inhibiting storage of such data words in the memory if such one of the data words at the output is detected as being corrupt.
In one embodiment, the corrupting comprises corrupting a parity byte of such data words.
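The four steps of the method (check, corrupt, detect, inhibit) may be sketched as follows. This is a minimal illustration under stated assumptions: the CRC is taken to be the 32-bit CRC computed by Python's `zlib.crc32`, each data word is one byte with a single parity bit computed as the XOR of its data bits, and every delayed word has its parity complemented on a CRC error. None of these choices is specified by the method itself, and the names `guard` and `store_if_clean` are hypothetical.

```python
import zlib


def parity_bit(word):
    """Parity bit for a one-byte word: XOR of its data bits (assumed convention)."""
    return bin(word).count("1") & 1


def guard(data_words, received_crc):
    """Check the CRC while the words are held back; corrupt their parity on a CRC error."""
    delayed = [(w, parity_bit(w)) for w in data_words]  # words waiting in the delay
    crc_error = zlib.crc32(bytes(data_words)) != received_crc
    if crc_error:
        # Corrupt the words before they reach the output by complementing the parity bit.
        delayed = [(w, p ^ 1) for w, p in delayed]
    return delayed


def store_if_clean(words_at_output, memory):
    """Detect corrupt words at the output and inhibit storage if any is found."""
    if any(parity_bit(w) != p for w, p in words_at_output):
        return False  # storage inhibited
    memory.extend(w for w, _ in words_at_output)
    return True
```

The memory side needs no knowledge of the CRC in this arrangement: an ordinary parity check at the output is enough to keep data with a failed CRC out of the memory.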
In accordance with another feature of the invention, a system is provided for preventing erroneous data from being stored in a memory, such DATA comprising a series of data words terminating in a Cyclic Redundancy Check (CRC). The system includes: a source of DATA, such DATA comprising a series of bytes each byte having a parity bit, such series of bytes terminating in a Cyclic Redundancy Check (CRC) portion associated with the series of bytes of the DATA; a source of the CRC portion; a CRC checker fed by the series of bytes of the DATA and the source of the CRC portion, for determining a CRC from the series of bytes and for comparing such determined CRC with the CRC fed by the CRC source; a delay fed by the series of bytes and the parity bits thereof; a selector having a first input thereof fed by the parity bits and a second input thereof fed by the complement of such parity bits, such selector coupling the first input thereof to an output of such selector when the determined CRC is the same as the CRC fed by the CRC source and for coupling the second input thereof to the output when the determined CRC is different from the CRC fed by the CRC source, the output of the selector providing an appended parity bit for the data bytes after such data bytes pass through the delay; and, a detector and control logic for detecting whether such data word at the output is corrupt and for inhibiting storage of such data words in the memory if such one of the data words at the output is detected as being corrupt.
In accordance with another feature of the invention, a system is provided for preventing erroneous data from being stored in a memory, such DATA comprising a series of data words terminating in a Cyclic Redundancy Check (CRC). The system includes: a source of DATA, such DATA comprising a series of data words, each data word having a parity bit, each data word in the series being associated with a clock pulse, such series of data words terminating in a Cyclic Redundancy Check (CRC) portion associated with the series of bytes of the DATA, such CRC portion comprising a predetermined number of CRC words, each one of such CRC words being associated with one of the clock pulses; a source of the CRC portion; a CRC checker fed by the series of data words and the source of the CRC portion, for determining a CRC from the series of data words and for comparing such determined CRC with the CRC fed by the CRC source; a delay fed by the series of DATA, such delay delaying the DATA by at least the number of CRC words; a selector having a first input thereof fed by the parity bits and a second input thereof fed by the complement of such parity bits, such selector coupling the first input thereof to an output of such selector when the determined CRC is the same as the CRC fed by the CRC source and for coupling the second input thereof to the output when the determined CRC is different from the CRC fed by the CRC source, the output of the selector providing an appended parity bit for the data words after such DATA has passed through the delay; and a detector and control logic for detecting whether such data word at the output is corrupt and for inhibiting storage of such data words in the memory if such one of the data words at the output is detected as being corrupt.
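The requirement that the delay be at least as long as the number of CRC words may be illustrated by a clocked simulation. The sketch below assumes one-byte words, a 32-bit CRC from `zlib.crc32` carried as four trailing CRC words, and a CRC comparison that becomes valid on the clock pulse of the last CRC word. These parameters and the name `run_pipeline` are hypothetical.

```python
import zlib
from collections import deque

CRC_WORDS = 4  # assumed: 32-bit CRC carried as four one-byte CRC words


def parity_bit(word):
    """Parity bit for a one-byte word: XOR of its data bits (assumed convention)."""
    return bin(word).count("1") & 1


def run_pipeline(stream, n_data, delay_depth=CRC_WORDS):
    """Clock the stream through the delay; return (word, parity) for data words at the output."""
    delay = deque()
    running = 0        # CRC determined from the data words seen so far
    received = 0       # CRC assembled from the trailing CRC words
    crc_error = False  # selector control; valid once the last CRC word has arrived
    outputs = []
    total = len(stream)
    for clock in range(total + delay_depth):
        if clock < total:
            word = stream[clock]
            if clock < n_data:
                running = zlib.crc32(bytes([word]), running)
            else:
                received = (received << 8) | word
            if clock == total - 1:
                crc_error = running != received
            delay.append((clock, word))
        if clock >= delay_depth:
            index, word = delay.popleft()  # the word that entered delay_depth clocks ago
            # Selector: true parity when the CRCs match, complemented parity otherwise.
            parity = parity_bit(word) ^ (1 if crc_error else 0)
            if index < n_data:
                outputs.append((word, parity))
    return outputs
```

With a delay of four words, the last data word is still at the output when the comparison completes, so a CRC error reaches the detector as a parity error on that word. With a delay of only three words, every data word has already left the delay with good parity and the error goes undetected.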
In one embodiment, a second selector is provided. The second selector has a first input fed by the DATA and a second input fed by the output of the first-mentioned selector, such second selector coupling either the first input thereof or the second input thereof to an output of the second selector selectively in accordance with a control signal fed to such second selector.
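The two selectors may be modeled as a pair of two-input multiplexers. In the sketch below the control signal of the second selector is interpreted as a bypass that passes the original parity unchanged. That interpretation, together with the names `select`, `output_parity` and `bypass`, is an assumption made for illustration only.

```python
def select(first_input, second_input, choose_second):
    """Two-input selector: couple one of the two inputs to the output."""
    return second_input if choose_second else first_input


def output_parity(parity, crc_error, bypass):
    """Parity bit presented at the output for one delayed word."""
    # First selector: parity, or its complement when the CRC check failed.
    checked = select(parity, parity ^ 1, crc_error)
    # Second selector: original parity from the DATA, or the first selector's output.
    return select(parity, checked, not bypass)
```

Under this reading, asserting `bypass` disables the parity corruption, while deasserting it lets a CRC error propagate to the detector.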
The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.