In many business enterprises today, reliance on information processing systems is growing rapidly. As enterprises become more reliant on information systems and the vast quantities of data stored therein (on the scale of terabytes), the losses caused by disruptions and outages become potentially more disastrous. For this reason, many technologies have been developed to protect against information system failures. Such technologies include storage management, data protection, and application clustering at the local level.
Local protection, however, is inadequate on its own. Loss of an entire data center or information processing facility would greatly impact the business it supports; thus, protection at a higher level is necessary. Data replication includes technology designed to maintain a duplicate data set on a completely independent storage system, possibly at a different geographical location from the primary data set. In many systems, the duplicate data set is updated automatically as the primary data set is updated.
There are different known forms of data replication. In a synchronous replication system, the system ensures that a write update has actually been posted to the secondary data store as well as the primary before the write operation completes at the application level. As a result, the duplicate data set is continuously up-to-date; however, application performance may suffer because each update requires a “round trip” over the network to the duplicate.
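The synchronous write path described above can be sketched as follows. This is a minimal illustration only: the class name `SyncReplicatedStore`, the in-memory dictionaries standing in for the two data stores, and the helper `_send_to_secondary` (which stands in for the network round trip) are all hypothetical and are not taken from any particular replication product.

```python
class SyncReplicatedStore:
    """Hypothetical synchronous replication: a write completes only after
    both the primary and the secondary copies have been updated."""

    def __init__(self):
        self.primary = {}    # stands in for the primary data set
        self.secondary = {}  # stands in for the remote duplicate data set

    def _send_to_secondary(self, key, value):
        # Placeholder for the network round trip to the secondary site.
        # A real system would block here until the remote side acknowledges.
        self.secondary[key] = value
        return True  # acknowledgement from the secondary

    def write(self, key, value):
        self.primary[key] = value
        acknowledged = self._send_to_secondary(key, value)
        if not acknowledged:
            raise IOError("secondary did not acknowledge the write")
        # Only now is the write reported as complete to the application.
        return True


store = SyncReplicatedStore()
store.write("order-1", b"payload")
print(store.primary == store.secondary)
```

Because `write` does not return until the secondary has acknowledged, the two copies never diverge, at the cost of one round trip per update.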
In an asynchronous replication system, application updates are written to the primary data set and queued for forwarding to the secondary data set as bandwidth allows. Unlike in synchronous replication, the writing application does not suffer response time degradation, as it does not wait for the “round trip” to complete. Near real-time updates are available, though during an outage at the primary data set, transactions that are queued but not yet forwarded may be lost.
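The asynchronous behavior, including the window in which queued updates can be lost, can be sketched in the same style. The class name `AsyncReplicatedStore`, the `pending` queue, and the `drain` method with its `max_updates` parameter are hypothetical names chosen for illustration; `max_updates` is a stand-in for limited bandwidth.

```python
from collections import deque


class AsyncReplicatedStore:
    """Hypothetical asynchronous replication: writes return immediately and
    are forwarded to the secondary later, as bandwidth allows."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self.pending = deque()  # updates queued for forwarding

    def write(self, key, value):
        # The application is not delayed by any network round trip.
        self.primary[key] = value
        self.pending.append((key, value))

    def drain(self, max_updates=None):
        """Forward up to max_updates queued updates; return how many were sent."""
        sent = 0
        while self.pending and (max_updates is None or sent < max_updates):
            key, value = self.pending.popleft()
            self.secondary[key] = value
            sent += 1
        return sent


store = AsyncReplicatedStore()
for i in range(3):
    store.write(i, i * 10)
store.drain(max_updates=2)
# One update is still queued; if the primary failed now, it would be lost.
print(len(store.pending))
```

Any update still in `pending` when the primary site fails exists only at the primary, which is the exposure the paragraph above describes.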
In addition to the need for data replication as described above, in many cases there is also a need for conversion of the data. Types of conversion that might be necessary include conversion from Big Endian to Little Endian and vice versa, byte size conversion, and character set conversion.
For example, endianness conversion, or byte order conversion, may be necessary. When data is represented by multiple bytes, there is no unique way of ordering the bytes in memory, so the order is subject to a convention called endianness. Some CPUs handle numbers in a format known as big endian. In big endian format, the most significant byte is stored at the lowest memory address. Alternatively, some CPUs handle numbers in a format known as little endian. Little endian format places the least significant byte at the lowest memory address. When a big endian machine and a little endian machine (e.g., a primary data store and its secondary data store back-up) attempt to communicate through reads and writes to each other, the data must be re-formatted to be accessible by the other machine. This conversion between big endian and little endian may be referred to as byte reversal.
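As an illustration of byte reversal, the sketch below packs one 32-bit value in both byte orders using Python's standard `struct` module and then converts a buffer of fixed-width fields from one order to the other. The helper name `byte_reverse` and its `width` parameter are hypothetical; the sketch assumes the buffer holds only fields of a single known width.

```python
import struct

value = 0x12345678

# Big endian: most significant byte (0x12) at the lowest address.
big = struct.pack(">I", value)
# Little endian: least significant byte (0x78) at the lowest address.
little = struct.pack("<I", value)


def byte_reverse(raw: bytes, width: int) -> bytes:
    """Reverse the byte order of each width-byte field in raw."""
    if width <= 0 or len(raw) % width != 0:
        raise ValueError("buffer length must be a multiple of the field width")
    return b"".join(
        raw[i:i + width][::-1] for i in range(0, len(raw), width)
    )


# Reversing the big endian bytes yields the little endian representation.
print(byte_reverse(big, 4) == little)
```

Note that the reversal must be applied per field: reversing an entire buffer that holds several values would also reverse the order of the values themselves.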
Another example of potentially necessary conversion is byte size conversion. Different operating systems and applications may represent data in different sizes, such as 32-bit and 64-bit, that are not directly compatible with one another. For example, 64-bit binary data cannot be used directly by 32-bit applications. A 64-bit application can be compiled and linked on a 32-bit system but generally cannot be run on it. In order to use the data interchangeably, 32-bit and 64-bit data must be converted from one size to the other when moving between different applications. Such a conversion is referred to as byte size conversion.
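A size conversion of this kind can be sketched for the simple case of a buffer of signed integers. The function names `widen_32_to_64` and `narrow_64_to_32` are hypothetical, and the sketch assumes the buffer contains only signed integers in a known byte order; real data would typically mix field types and need a layout description.

```python
import struct


def widen_32_to_64(raw: bytes, byteorder: str = "<") -> bytes:
    """Re-encode a buffer of signed 32-bit integers as signed 64-bit integers."""
    count, remainder = divmod(len(raw), 4)
    if remainder:
        raise ValueError("buffer length must be a multiple of 4 bytes")
    values = struct.unpack(f"{byteorder}{count}i", raw)
    return struct.pack(f"{byteorder}{count}q", *values)


def narrow_64_to_32(raw: bytes, byteorder: str = "<") -> bytes:
    """Re-encode signed 64-bit integers as 32-bit, refusing to truncate."""
    count, remainder = divmod(len(raw), 8)
    if remainder:
        raise ValueError("buffer length must be a multiple of 8 bytes")
    values = struct.unpack(f"{byteorder}{count}q", raw)
    for v in values:
        if not -2**31 <= v < 2**31:
            raise OverflowError(f"{v} does not fit in 32 bits")
    return struct.pack(f"{byteorder}{count}i", *values)


raw32 = struct.pack("<2i", -1, 70000)
raw64 = widen_32_to_64(raw32)
print(len(raw32), len(raw64))
```

Widening is always safe, while narrowing can lose information; this sketch raises an error rather than silently truncating, which is one possible design choice among several.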
Still another example of a type of conversion that may be needed is character set conversion. A character set is the group of unique symbols used for display and printing. Single-byte character sets, commonly used for languages written in the Latin alphabet, contain up to 256 symbols, which is the number of distinct values one byte can hold. Given that there are many different character sets for different languages and different computing platforms, the need may arise to convert data from one character set to another.
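Character set conversion can be illustrated with Python's built-in codecs, which decode bytes from a source character set into text and encode that text into a target character set. The function name `convert_charset` is hypothetical; the codec names `latin-1`, `utf-8`, `cp037` (an EBCDIC code page), and `ascii` are standard Python codec names chosen here only as examples.

```python
def convert_charset(raw: bytes, source: str, target: str) -> bytes:
    """Convert raw from the source character set to the target one."""
    text = raw.decode(source)   # bytes -> abstract characters
    return text.encode(target)  # abstract characters -> bytes


# In Latin-1, the character 'é' is the single byte 0xE9;
# in UTF-8 the same character is the two bytes 0xC3 0xA9.
latin1_bytes = b"caf\xe9"
utf8_bytes = convert_charset(latin1_bytes, "latin-1", "utf-8")
print(utf8_bytes == b"caf\xc3\xa9")

# An EBCDIC-encoded string converted for use on an ASCII platform.
ebcdic_bytes = "HELLO".encode("cp037")
print(convert_charset(ebcdic_bytes, "cp037", "ascii"))
```

If the target character set has no symbol for a character in the source data, `encode` raises `UnicodeEncodeError`, so a practical converter would also need a policy for unmappable characters.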