I. Field of the Invention
The present invention is a system and method for transferring data between asynchronous devices. More specifically, this invention relates to a system and method for transferring data between a single channel device and multiple asynchronous storage devices simultaneously, and for performing such transfer with improved error detection and fault isolation.
II. Related Art
In recent years, there has been an enormous increase in the speed and general capabilities of computers. This has been driven largely by the desire to solve larger and more complex problems. In view of the size of some of these problems and the amounts of raw data that they require, the speed and power of computer peripherals such as mass-storage devices have become very important in order to take full advantage of a powerful computer.
The speed of mass-storage devices has traditionally lagged significantly behind the speed of state-of-the-art computers. This is largely due to the fact that, unlike the computational portion of a computer (referred to as the "host computer"), mass-storage devices contain moving parts. Consequently, there is a continuing need for faster data storage devices which can effectively work in conjunction with today's high speed computers.
At any given point in time, state-of-the-art storage devices are capable of some finite rate of throughput. Thus, the effective speed of the host is often limited in certain respects to whatever this rate of throughput happens to be. In addition, state-of-the-art storage devices tend to be very expensive.
To help alleviate these problems, the concept of integrally connecting multiple storage devices was developed. This concept is described further with regard to FIG. 1.
Referring now to FIG. 1, a storage facility 108 is shown to comprise a channel unit 102 and n storage devices 104. When the storage facility 108 receives data from a host computer 106 for storage (that is, a "write" is requested), the data is sent to the channel unit 102. Either some other device (not shown) or the channel unit 102 itself stripes (that is, splits up) the data into sections or groups. For example, each section might be 2 bytes in length. When the data is sent to the storage devices 104, storage device 1 might, for example, receive the first section, storage device 2 the second, etc., until storage device n receives the nth section. This process then repeats itself, with storage device 1 receiving the (n+1)th section, etc.
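The round-robin striping described above can be illustrated with a minimal sketch. It is offered for explanation only and is not taken from the described facility: the function name `stripe`, its parameters, and the use of in-memory byte strings to stand in for the storage devices 104 are all assumptions made for the example.

```python
def stripe(data: bytes, n_devices: int, section_size: int = 2) -> list[bytes]:
    """Split `data` into fixed-size sections and deal them out round-robin.

    Hypothetical helper: lane i of the result stands in for the data that
    storage device i+1 would receive. The default section size of 2 bytes
    follows the example given in the text.
    """
    if n_devices < 1 or section_size < 1:
        raise ValueError("n_devices and section_size must be positive")

    lanes = [bytearray() for _ in range(n_devices)]
    for index, start in enumerate(range(0, len(data), section_size)):
        # Section 0 goes to device 1, section 1 to device 2, and so on;
        # section n wraps back around to device 1.
        lanes[index % n_devices] += data[start:start + section_size]
    return [bytes(lane) for lane in lanes]


if __name__ == "__main__":
    # Five 2-byte sections spread over three devices.
    for device, lane in enumerate(stripe(b"AABBCCDDEE", 3), start=1):
        print(f"storage device {device}: {lane!r}")
```

With three devices, the fourth section ("DD") lands on storage device 1 again, which corresponds to the (n+1)th section in the description above.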
The data is sent to the storage devices 104 from the channel unit 102 across a bus 110. The bus 110 is made up of individual bus links (n links) having one end at one of the storage devices 104, and the other at the channel unit 102. The striped data sent at any given time on the bus 110 (that is, the sum of the data on all n links at a given time) is referred to as a "word." In addition to data, this bus 110 can typically facilitate the transfer of control signals as well.
When the host computer 106 requests that the stored data be retrieved from the storage devices 104 (that is, a "read" is requested), the storage devices 104 of the storage facility 108 send the stored data back to the channel unit 102 across the bus 110. The data is then interleaved together by some means, and sent to the host computer 106. Of course, in order for this to be done efficiently, the data from the individual storage devices 104 should be received by the channel unit 102 concurrently (that is, all the components of the "word" should arrive at the channel unit 102 at once).
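The interleaving step on a read can be sketched in the same illustrative spirit. The function `interleave` and its arguments are hypothetical, and the sketch assumes that every lane is already fully available in memory. It therefore shows only the reordering of the data, not the timing problem of receiving all components of a "word" concurrently.

```python
def interleave(lanes: list[bytes], section_size: int = 2) -> bytes:
    """Rebuild the original byte stream from per-device lanes.

    Hypothetical helper: lanes[i] stands in for the data returned by
    storage device i+1. Each pass of the outer loop gathers one section
    from every lane, which corresponds to one "word" on the bus.
    """
    if section_size < 1:
        raise ValueError("section_size must be positive")

    out = bytearray()
    offset = 0
    while True:
        took_any = False
        for lane in lanes:
            section = lane[offset:offset + section_size]
            if section:
                out += section
                took_any = True
        if not took_any:
            break  # every lane is exhausted
        offset += section_size
    return bytes(out)


if __name__ == "__main__":
    # Lanes as three devices might hold them after a striped write.
    print(interleave([b"AADD", b"BBEE", b"CC"]))
```

The final word here is incomplete because the third lane has run out of data. The sketch simply skips an exhausted lane rather than modelling how a real channel unit would handle that case.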
Since the host computer 106 interfaces only with the channel unit 102 and not with the individual storage devices 104, the use of multiple storage devices 104 is transparent to the host computer. What the host computer 106 will notice, however, is a tremendous increase in throughput over the use of a single storage device 104.
In the type of devices contemplated by storage facility 108 of FIG. 1, each of the storage devices 104 typically contains a microprocessor, memory, logic and software to control the operation of the storage device 104. Therefore, each of the storage devices 104 contains an oscillator to drive the clocks for the microprocessor and logic. This use of multiple oscillators allows for greater flexibility in design, since components are not required to run at the same speed (or multiple thereof) as a master oscillator. Thus, problems regarding precise synchronizing of phase between devices are not encountered.
In addition to each of the storage devices 104 containing its own oscillator, the channel unit 102 is also contemplated to contain its own oscillator to drive the clocks of its microprocessor and logic. Thus, not only are the storage devices 104 operating asynchronously with regard to each other, they are also operating asynchronously with regard to the channel unit 102. In such an environment, all of the data transfers between the storage devices 104 and channel unit 102 take place over multiple asynchronous interfaces.
Use of asynchronous interfaces, however, is not without its own problems. In order for data to be transferred from the storage devices 104 to the channel unit 102 concurrently in an efficient manner, the bus or busses involved in this transfer must be able to accommodate the asynchronicity of the devices discussed above. The problem with sending data from the storage devices 104 to the channel unit 102 over an asynchronous interface is that the storage devices 104 need some indication of when to transfer the data. Since the storage devices 104 do not continuously communicate with each other or with other devices, some communication mechanism is necessary to enable all of the storage devices 104 to send data concurrently with one another. In addition, the asynchronicity of the system also makes it difficult for the channel unit 102 to know when it can sample the bus and be assured that the sampled data is valid.
Typically, the nature of asynchronous transfers requires that the asynchronous devices on both sides of a data transfer establish synchronism with each other. Problems of metastability removal (that is, removal of the occurrence of a condition in devices such as flip flops where these devices are unable to maintain a specific state) make this task more difficult.
The use of multiple storage devices 104 generally as described above also can cause problems relating to error detection and fault isolation. This is true especially where cheaper, less reliable storage devices are used. If one device begins to malfunction, it is important to detect it early, and to detect whether the error came from one of the storage devices 104, or the channel unit 102.
Thus, what is needed is a way to receive striped data simultaneously from multiple asynchronous storage devices 104 in an efficient manner, and to provide an interface generally between a channel unit 102 and multiple storage devices 104 having improved error detection and fault isolation.