In data storage systems, such as disk drive based systems, there is an inherent latency associated with write (and read) operations. The latency results from the time required to physically position the write head over the appropriate area of the recording medium within the disk drive. This delay is typically on the order of 10 milliseconds and amounts to an unacceptable performance degradation for many applications. One known solution to this latency is to provide a write cache memory for temporarily storing the write data prior to transcription to the disk drive.
Remote or mirrored storage systems are a type of storage system used in transactional database applications, as well as other applications. A mirrored storage system includes a primary storage site and a remote or mirrored storage site. The primary storage site receives data from a storage consumer, for example, a server or mainframe computer, and the data is transcribed by a controller to a primary storage device, for example, a disk drive. The remote storage site is coupled to the primary storage site through a communication link. The remote storage site includes a remote storage device and a controller. The controller receives a copy of the data from the primary storage site and transcribes the data to the remote storage device. The remote storage device allows the data to be restored if the primary storage site becomes inoperable.
In a conventional transactional database system, the transactions are processed sequentially. Before the storage consumer can process a second transaction, e.g. a data storage request, acknowledgement of the previous data transcription must be received. In a mirrored storage system, this means acknowledgement from the primary storage site and also from the remote storage site. This guarantees that the data is securely stored even if either the primary site or the remote site is destroyed.
In a remote mirrored system where the primary site and the remote site are connected by a long communication link, there can be a substantial delay for the data to be transmitted from the primary site to the remote site, and for the acknowledgement to be transmitted back to the primary site after the data has been transcribed at the remote site. Such delay can severely degrade the performance of the entire transaction processing system. For example, if the remote site is 1,000 km away from the primary site, and the communication link is an uninterrupted optical fiber link, the speed of light inside the optical fiber imposes a transmission delay of approximately 5 milliseconds for transmission of the data to the remote site and an additional 5 milliseconds for the acknowledgement to be returned from the remote site, resulting in a total delay of at least 10 milliseconds. If the storage consumer, e.g. a server, must wait for the acknowledgement before processing subsequent transactions, then the storage consumer can process at most 100 transactions per second, which is slow by today's server performance standards. This situation is exacerbated by additional delays introduced by the various switching equipment encountered in the communication link.
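The arithmetic in the example above can be reproduced with a short sketch. It assumes a propagation speed of roughly 200,000 km/s in optical fiber, a figure that is not stated in the text but is consistent with the 5 millisecond one-way delay it gives for 1,000 km. The function names are illustrative only.

```python
# Assumption: light propagates at about 200,000 km/s in optical fiber
# (roughly two thirds of its speed in vacuum).
FIBER_SPEED_KM_PER_S = 200_000.0


def round_trip_delay_s(distance_km: float) -> float:
    """Time for data to reach the remote site plus time for the
    acknowledgement to return, ignoring switching equipment."""
    one_way_s = distance_km / FIBER_SPEED_KM_PER_S
    return 2.0 * one_way_s


def max_sequential_tps(distance_km: float) -> float:
    """Upper bound on transactions per second when each transaction
    must wait for the remote acknowledgement of the previous one."""
    return 1.0 / round_trip_delay_s(distance_km)


if __name__ == "__main__":
    distance_km = 1_000.0
    print(f"round trip: {round_trip_delay_s(distance_km) * 1000:.1f} ms")
    print(f"upper bound: {max_sequential_tps(distance_km):.0f} transactions/s")
```

Because the bound is inversely proportional to distance, doubling the separation between the sites halves the achievable sequential transaction rate, before any switching delay is counted.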
The distance between the primary storage site and the remote storage site is integral to the safety factor offered by the mirrored storage system, in that the greater the distance, the less likely it is that a single event could incapacitate or destroy both the primary storage site and the remote or mirrored storage site. Therefore, reducing the distance to the mirrored storage site is not a preferred solution for reducing the delay. Also, the use of a simple cache as discussed above does not remove the latency effect without partly defeating the security intended by a mirrored storage system.
Caching systems are used to hide latency for remote mirroring only in cases where high performance is paramount and the risk of data loss can be tolerated. In such configurations, the controller with cache acts as a proxy for the remote mirror system and spoofs (“fakes”) the acknowledgement that would normally be sent from the remote mirror. Data can be lost if the primary site is destroyed or incapacitated before the data has reached, and been transcribed to, the remote mirror.
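The data-loss window of such a spoofing configuration can be illustrated with a minimal in-memory sketch. All class and method names here (RemoteMirror, SpoofingCacheController, flush_one, lose_primary_site) are hypothetical and chosen for illustration; they do not describe any particular product or the system of this disclosure.

```python
from collections import deque


class RemoteMirror:
    """Stand-in for the remote storage site (hypothetical)."""

    def __init__(self):
        self.stored = []

    def write(self, block):
        self.stored.append(block)
        return True  # genuine acknowledgement from the remote site


class SpoofingCacheController:
    """Primary-site controller that acknowledges a write as soon as it
    is cached, before the remote mirror has received it (hypothetical)."""

    def __init__(self, remote):
        self.remote = remote
        self.pending = deque()  # cached blocks not yet sent to the mirror

    def write(self, block):
        self.pending.append(block)
        return True  # spoofed acknowledgement: remote has NOT confirmed

    def flush_one(self):
        """Forward the oldest cached block to the remote mirror."""
        if self.pending:
            self.remote.write(self.pending.popleft())

    def lose_primary_site(self):
        """Simulate destruction of the primary site; returns the blocks
        that were acknowledged but never reached the mirror."""
        lost = list(self.pending)
        self.pending.clear()
        return lost


def demo():
    remote = RemoteMirror()
    controller = SpoofingCacheController(remote)
    controller.write("txn-1")
    controller.write("txn-2")       # both acknowledged immediately
    controller.flush_one()          # only txn-1 reaches the mirror
    lost = controller.lose_primary_site()
    return remote.stored, lost
```

In this sketch, a call to demo() leaves "txn-1" on the mirror while "txn-2", although already acknowledged to the storage consumer, is lost with the primary site.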
Accordingly, there remains a need for a system that can hide the effect of latency in systems, such as those having long telecommunication links, where the data sender requires acknowledgement of correct transmission to the data recipient, while at the same time minimizing the risk of data loss.