The information age has enabled organizations to absorb, produce and analyze massive volumes of data. Nowadays, information in the form of digital data has become part of the core of many organizations' operations. Consequently, data is presently one of the most valuable assets of many organizations in a variety of fields, and in some cases is considered to be the key asset of the organization.
The events of Sep. 11, 2001 exposed the vulnerability of data systems and the precious data stored therein to disasters such as terrorist attacks and various unexpected natural occurrences which could cause massive damage and destruction to facilities housing data storage and processing systems. The survivability and recoverability of data systems following a terrorist attack or other disaster have thus become a major concern of organizations around the world. It has become a necessity for organizations which are reliant upon the data stored in their data systems to ensure the survivability and the recoverability of that data, such that the organization can quickly and efficiently recover from any event resulting in massive damage to its data systems.
In order to mitigate massive data loss due to damage or other malfunction at a primary data storage server or system, it is common to back up the primary data storage server or system of an organization. For a backup system to avoid suffering the same data loss as the primary server, the backup system may be distributed and geographically removed from the primary server to ensure that any event which may damage the primary server is not likely to also affect the integrity of the backup system.
It has been suggested to transmit the data stored in the primary storage system to a secondary storage system, commonly referred to as a mirror server or system. The primary storage system and the mirror storage system may be located at different geographical locations (i.e. remote from one another), such that any event resulting in physical damage or operational failure of the primary storage system is not likely to damage or cause the operational failure of the backup/mirror storage system. This backup technique is commonly dubbed remote mirroring.
Since data storage/processing systems are dynamic, such that new data is regularly written to and read from these systems via write transactions and read transactions, backup or mirroring systems for these data storage and processing systems usually operate substantially in real-time. The use of substantially real-time data mirroring or backup systems is required to ensure that the version of the data backed up at the instant of a failure is as up to date as possible with respect to the data stored in the primary server.
An important feature of the remote mirroring system is an “order-preserving” mechanism. An order-preserving mechanism ensures that any two host-ordered transactions A and B (host-ordered in the sense that B is not initiated until A is acknowledged) are processed in the system exactly in the order in which the host initiated them, and that no situation may arise in which B has been completed while A has not.
An example of an order-preserving mechanism and of a non-order-preserving mechanism is illustrated hereinafter. Consider a system comprising a host connected to a storage system, where the latter processes requests sent by the former. Each request is “completed” when a response, or acknowledgement, is returned to the host. In the non-order-preserving case, the host sends a write request A to the system and then a write request B before A has been completed and acknowledged. This means that the host does not establish a strict order with a clear demand that one of the two transactions be completed first. The storage system can therefore treat those transactions accordingly, without having to follow a strict order relation between them. In the order-preserving case, transaction B is not initiated by the host until A has been acknowledged.
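The distinction between host-ordered and non-ordered transactions may be illustrated by the following minimal sketch. The `Transaction` record, its field names, and the `host_ordered` helper are hypothetical and are introduced here only for illustration; times are assumed to be measured on the host's own clock.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    # Hypothetical record; all times are taken on the host's clock.
    name: str
    initiated: float     # time at which the host sent the request
    acknowledged: float  # time at which the host received the acknowledgement


def host_ordered(a: Transaction, b: Transaction) -> bool:
    """Return True if b was initiated only after a had been acknowledged."""
    return b.initiated >= a.acknowledged


a = Transaction("A", initiated=0.0, acknowledged=2.0)
b_ordered = Transaction("B", initiated=3.0, acknowledged=4.0)  # sent after A's ack
b_overlap = Transaction("B", initiated=1.0, acknowledged=4.0)  # sent while A pending

print(host_ordered(a, b_ordered))  # the host established an order A -> B
print(host_ordered(a, b_overlap))  # no order was established by the host
```

Only in the first case is the storage system obliged to guarantee that B is never completed while A is not.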
One of the main reasons for introducing order-preserving mechanisms is that in case of damage or other malfunction, such a mechanism may prevent inconsistent situations arising in the system. In this way, the host may be able to consistently recover and restart the processes where they were interrupted.
A remote mirroring storage system may comprise local and remote storage elements (i.e. primary and secondary, master and slave etc.), where the remote element acts as a remote mirror for data volumes of the local one. In such a system an order-preserving mechanism is necessary as a guarantee that requests sent by the host in an ordered fashion will not lead to inconsistent situations in the secondary system in case of damage or malfunction.
The prior art contemplates two main approaches to remote mirroring, each of which handles the problem of “order-preservation” and consistency differently.
Synchronous approach: the local system does not acknowledge a request before the request has been sent to, fully processed by, and acknowledged by the remote system as well. Under the synchronous approach, order preservation is achieved inherently. For example, if write request B is received at the local system only after A has been acknowledged by that system, then B will never be processed in the remote system before A has been completed and acknowledged there, since A was not acknowledged in the local system before it was acknowledged in the remote one. One disadvantage of the synchronous approach is the latency of the transaction at the primary device, as the latter needs to wait until the transaction is sent to the secondary device, and then processed and acknowledged by it. Still, the system can process several non-ordered requests in parallel, and therefore the overall throughput is not generally affected. The synchronous approach to remote mirroring therefore does not prevent the scalability of the system.
Asynchronous approach: the primary device acknowledges the transaction without waiting for a response from the secondary device, or the local system gives its response before it even transmits the transaction to the remote one. Asynchronous approaches were introduced in order to solve the transaction latency problem, which is particularly critical when the remote system is physically distant. One disadvantage is that they do not inherently solve the “order-preserving” problem.
An example of an asynchronous mirroring methodology may be referred to as “Individual Remote Transmission,” according to which a host writes to the local system, which acknowledges the transaction once it has completed it, and at the same time places the transaction in a queue for transmission to the remote or secondary device. The transactions in the queue are processed individually: each transaction is completed in the remote system and acknowledged before the next transaction is initiated. This approach trivially solves the latency problem in the local system, and at the same time yields the required “order-preservation”. Clearly, if either the local system or the link fails, whatever data situation is produced in the remote system corresponds to the order originally established by the host. However, the approach has evident disadvantages: as there is no parallel processing of transactions in the remote device, the overall performance of the system is affected and the system is not scalable.
Another example of an asynchronous approach methodology is the “Point in Time Transmission”. This approach is based on the principle that the system is able to reproduce the current state of the system, or of part of the system, at a time of its choosing. For instance, at a given time t0 the local system creates a copy of the current state of volume V0. Creating such a copy consumes a certain amount of system resources, and the copy is ready at a later time, say, t1. When the copy is ready, the local system starts transmitting the entire copy of volume V0 to the remote storage element. This operation can either be completed or fail only for the entire volume. Thus, it ensures a coherent image on the secondary device (the image that existed in volume V0 at time t0). This is the main virtue of the approach. In addition, it allows parallel processing of requests and it is thus scalable. On the downside, the time lag between two successive, consistent pictures created at the remote may be relatively long, and thus the data lost in case of damage or malfunction may be relatively large. The consistent picture preserved in the secondary device may be significantly different from the current one at the primary at the time of damage or malfunction.
Yet another example of an asynchronous approach methodology is the “Time-Stamped Transmission”. As in the “Individual Remote Transmission” methodology, asynchronous approaches may turn out to be non-scalable, since they may not process the remote transactions in a fully parallel way. In many such implementations, transactions already acknowledged by the primary device are “chunked” together and sent to be processed by the secondary device, thus limiting the total throughput in the secondary device. This limitation can be overcome by enlarging the size of the chunks (i.e. “Individual Remote Transmission”), but this has the effect of enlarging the time lag between chunks, which in a case of damage or malfunction leads to an increase in the amount of lost data. Another disadvantage is that in order to transmit “point in time copies”, it is necessary to create these copies in the local system, which also consumes a considerable amount of time and storage resources. A good compromise can be reached by assigning a timestamp to every transaction in the local system and transmitting the transactions, either individually or in chunks, to the remote system. At any rate, in the remote system, the transactions are processed in an order corresponding to the timestamps. Thus, in case of damage or malfunction in the remote machine or in one of the links, no inconsistency is created in the remote system.
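The difference between the two approaches lies in the point at which the local system acknowledges the host. The following is a minimal in-memory sketch of that difference only; the `Volume` and `MirroredPair` classes and their methods are hypothetical, and the direct call to `remote.apply` merely stands in for sending the transaction over a link and waiting for the remote acknowledgement.

```python
from collections import deque


class Volume:
    """Hypothetical storage volume: a mapping from block address to data."""

    def __init__(self):
        self.blocks = {}

    def apply(self, address, data):
        self.blocks[address] = data


class MirroredPair:
    """Hypothetical local/remote pair used to contrast the two approaches."""

    def __init__(self):
        self.local = Volume()
        self.remote = Volume()
        self.pending = deque()  # transactions acknowledged but not yet mirrored

    def write_sync(self, address, data):
        # Synchronous: the host is acknowledged only after the remote is done.
        self.local.apply(address, data)
        self.remote.apply(address, data)  # stands in for send + remote ack
        return "ack"

    def write_async(self, address, data):
        # Asynchronous: the host is acknowledged before the remote is updated.
        self.local.apply(address, data)
        self.pending.append((address, data))
        return "ack"

    def drain(self):
        # Background transmission of pending transactions, in queue order.
        while self.pending:
            self.remote.apply(*self.pending.popleft())


pair = MirroredPair()
pair.write_sync(0, "a")
pair.write_async(1, "b")
remote_before_drain = dict(pair.remote.blocks)  # the async write is not there yet
pair.drain()
```

In the sketch, a failure occurring between `write_async` and `drain` would leave the remote volume without a write that the host already considers complete, which is the exposure that asynchronous approaches accept in exchange for lower latency.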
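The order-preserving property of this methodology can be stated as follows: at any moment, the transactions completed at the remote form a prefix of the sequence established by the host. A minimal sketch under simplifying assumptions is given below; the function name, the `link_fails_after` parameter, and the use of plain strings as transactions are hypothetical.

```python
from collections import deque


def transmit_individually(queue, remote_log, link_fails_after=None):
    """Send queued transactions one at a time, in FIFO order.

    Each transaction is completed at the remote (appended to remote_log)
    before the next one is taken from the queue. If link_fails_after is
    given, the link is assumed to fail after that many transmissions.
    Returns the number of transactions transmitted.
    """
    sent = 0
    while queue:
        if link_fails_after is not None and sent == link_fails_after:
            break  # simulated link failure; remaining transactions stay queued
        remote_log.append(queue.popleft())
        sent += 1
    return sent


host_order = ["A", "B", "C", "D"]
queue = deque(host_order)
remote_log = []
transmit_individually(queue, remote_log, link_fails_after=2)
```

After the simulated failure the remote holds A and B only, which is consistent with the host's order; the strictly serial loop is also the reason the methodology does not scale.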
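The all-or-nothing character of this methodology, and the data exposed between two successive copies, may be sketched as follows. Volumes are modeled as plain dictionaries, and the helper names `take_snapshot` and `transmit_snapshot` as well as the `link_ok` flag are hypothetical.

```python
import copy


def take_snapshot(volume):
    """Return an independent copy of the volume's state at this moment."""
    return copy.deepcopy(volume)


def transmit_snapshot(snapshot, remote, link_ok=True):
    """Replace the remote image with the whole snapshot, or leave it untouched.

    The transfer succeeds or fails for the entire volume, never partially.
    """
    if not link_ok:
        return remote
    return dict(snapshot)


local = {0: "a0", 1: "b0"}
remote = {}

snap_t0 = take_snapshot(local)   # state of the volume at time t0
local[1] = "b1"                  # the host keeps writing after t0
remote = transmit_snapshot(snap_t0, remote)

# Writes present locally but absent from the last consistent remote image.
lost_on_failure = {k: v for k, v in local.items() if remote.get(k) != v}
```

The remote image is coherent (it equals the volume as it existed at t0), but every write made after t0 would be lost if the primary failed before the next copy was transmitted.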
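The remote-side processing described above may be sketched as follows, assuming that all timestamps are assigned by a single clock in the local system and that chunks themselves arrive in order. The tuple layout `(timestamp, address, data)` and the function `apply_chunk` are hypothetical.

```python
def apply_chunk(remote, chunk):
    """Apply the transactions of one chunk in timestamp order.

    Each transaction is a (timestamp, address, data) tuple. The order in
    which transactions appear inside the chunk is irrelevant; only the
    timestamps decide the order of application.
    """
    for _timestamp, address, data in sorted(chunk, key=lambda txn: txn[0]):
        remote[address] = data
    return remote


# Two writes to the same address arrive out of order within the chunk.
chunk = [(2, 0, "new"), (1, 0, "old"), (3, 1, "x")]
remote = apply_chunk({}, chunk)
```

Because the write stamped 1 is applied before the write stamped 2, address 0 ends up holding the later value regardless of the arrival order.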
Establishing a global timestamp, simultaneously used in all components of the system, involves a synchronization process that is initiated by a selected component and must be acknowledged by all other components. Such processes are naturally supported by operating systems such as Linux. They always involve, however, a broadcast operation and a certain, non-zero response time. Thus, they can conveniently be used for administration and monitoring tasks such as creating log files. They cannot be used, however, as the basis of an order-preserving mechanism in which absolute synchronization is a must.
Indeed, assume that a host sends a request A to the system via interface node IF1, and that once this request has been completed it sends a second request B via interface node IF2. Assume that IF1 assigns a timestamp tA to the completion of transaction A and that IF2 assigns a timestamp tB to the initiation of transaction B. If time synchronization between IF1 and IF2 were perfect, the timestamps could be used within the storage system as an indication of priority in the internal processing of transactions. However, at any point in time there may exist a non-zero time lag D between the time as measured in IF1 and the time as measured in IF2. If T denotes the time actually elapsed in the host between receiving the acknowledgment for request A and issuing request B, and if T<D (which is certainly a possibility), then a situation may be obtained in which tA>tB. If the system used timestamps as its criterion for deciding whether two transactions are ordered, it would not see A and B as ordered, and inconsistency might arise in a situation of remote mirroring. In fact, the situation may be even worse if the processing of B turns out to be completed before the processing of A has started.
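The arithmetic of this scenario can be worked through in a short sketch. It assumes, for illustration only, that the clock of IF2 lags the clock of IF1 by exactly D and that times are integer microseconds; the function and parameter names are hypothetical.

```python
def interface_timestamps(ack_time_a, host_delay, skew):
    """Return (tA, tB) as stamped by IF1 and IF2 respectively.

    ack_time_a: time on IF1's clock at which A is completed
    host_delay: T, the real time the host waits before issuing B
    skew:       D, the amount by which IF2's clock lags IF1's clock
    """
    t_a = ack_time_a                      # IF1 stamps the completion of A
    t_b = ack_time_a + host_delay - skew  # IF2 stamps the initiation of B
    return t_a, t_b


# Case T < D: B is issued 2000 us after A's ack, but IF2 lags by 5000 us.
t_a, t_b = interface_timestamps(1_000_000, host_delay=2_000, skew=5_000)
inverted = t_a > t_b   # B appears to precede A

# Case T > D: the host's delay exceeds the skew, so the order is preserved.
t_a2, t_b2 = interface_timestamps(1_000_000, host_delay=10_000, skew=5_000)
preserved = t_a2 < t_b2
```

In the first case tB is 997000 while tA is 1000000, so a system ordering by timestamp alone would treat B as the earlier transaction even though the host issued it strictly after A was acknowledged.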
Therefore, although several methodologies, systems and circuits for providing remote data server mirroring are known and have been implemented, enhancements and improvements to existing server mirroring solutions are needed.