File systems, applications, and database systems frequently use some multiple of the storage device block size when writing data to storage devices. A “block” is the smallest unit read and written by a storage device. Typically, disk storage devices use a block size of 512 bytes. The groups of blocks used by applications are called “pages” or “extents” and are the smallest unit of data that an application writes to storage devices. When a failure interrupts the flow of data to the storage devices, an incomplete “page” is sometimes written to the storage device. Such failures include the abnormal termination of the operating system (a “crash”) for local storage devices, and power failures in the computer system or the storage network device.
Copying data between local and remote storage subsystems is a widely used method to protect data stores against storage subsystem failures and catastrophic events. Many clustering applications rely on remote mirroring technology to prevent the loss of data during a failure at a production site. Additionally, many clustering applications rely on local backup to prevent data loss during a failure at portions of a production site. It is crucial to customers that these storage technologies are reliable and do not introduce errors or inconsistencies in the data.
Mirroring and local backup are generally implemented at the storage subsystem block level, while databases and other applications write data at the page level. The extent size for file systems and the page size for applications to be backed up are usually some multiple of the storage device block size. Problems can occur when the page size is greater than the underlying storage device block size. Because of this mismatch, a failure can occur when only part of a page has been written to the storage device.
For example, a page that includes several disk blocks is transmitted to the storage system in one or more write requests that may be separated in time. If the transmission medium, such as a Fibre Channel link, is broken after the first disk block is transmitted but before the last disk block is transmitted, the page will be inconsistent at the storage system. The first part of the stored page contains the new data while the rest of the stored page still contains the original data. If the break in the transmission medium is the result of a power failure, operating system “crash,” or disaster, it can render the database useless or “unrecoverable”.
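The partial-page scenario described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the storage device is modeled as an in-memory byte array, the page size of eight blocks is chosen arbitrarily, and `write_page` and its `fail_after_blocks` parameter are hypothetical names that do not correspond to any real storage interface.

```python
BLOCK_SIZE = 512            # bytes; the typical disk block size cited in the text
PAGE_SIZE = 8 * BLOCK_SIZE  # assumed page size: 8 blocks (4096 bytes)


def write_page(device, offset, page, fail_after_blocks=None):
    """Copy `page` onto `device` one block at a time.

    If `fail_after_blocks` is given, the write stops after that many blocks,
    which stands in for a crash, power failure, or broken link.
    Returns the number of blocks actually written.
    """
    assert len(page) % BLOCK_SIZE == 0, "page must be a whole number of blocks"
    total_blocks = len(page) // BLOCK_SIZE
    written = 0
    for i in range(total_blocks):
        if fail_after_blocks is not None and written >= fail_after_blocks:
            break  # simulated failure: remaining blocks never reach the device
        start = i * BLOCK_SIZE
        device[offset + start:offset + start + BLOCK_SIZE] = page[start:start + BLOCK_SIZE]
        written += 1
    return written


# The device initially holds the original page (all "A" bytes).
device = bytearray(b"A" * PAGE_SIZE)
new_page = b"B" * PAGE_SIZE

# The failure occurs after 3 of the 8 blocks have been written.
blocks_written = write_page(device, 0, new_page, fail_after_blocks=3)
torn = bytes(device)

# The stored page now begins with new data and ends with original data.
new_part = torn[:blocks_written * BLOCK_SIZE]
old_part = torn[blocks_written * BLOCK_SIZE:]
```

After the simulated failure, the stored page is neither the original page nor the new page, which is the inconsistency that block-level mirroring or backup would then faithfully copy.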
Some relational database systems terminate with an error if mismatched page sections are detected. In some database systems and applications, the mismatched page section goes undetected. In that case, the database or application has hidden data inconsistencies.
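One way a mismatched page section could be detected is with a per-page checksum. The following is a minimal sketch, not the mechanism of any particular database system: the page layout (a 32-byte SHA-256 digest followed by the payload), the four-block page size, and the names `seal_page` and `page_is_consistent` are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 512                     # bytes per storage block
PAGE_SIZE = 4 * BLOCK_SIZE           # assumed page size: 4 blocks
DIGEST_SIZE = 32                     # length of a SHA-256 digest in bytes
PAYLOAD_SIZE = PAGE_SIZE - DIGEST_SIZE


def seal_page(payload):
    """Build a page as digest + payload (hypothetical layout)."""
    assert len(payload) == PAYLOAD_SIZE
    return hashlib.sha256(payload).digest() + payload


def page_is_consistent(page):
    """Return True if the stored digest matches the stored payload."""
    stored_digest = page[:DIGEST_SIZE]
    payload = page[DIGEST_SIZE:]
    return hashlib.sha256(payload).digest() == stored_digest


old_page = seal_page(b"\x01" * PAYLOAD_SIZE)
new_page = seal_page(b"\x02" * PAYLOAD_SIZE)

# Simulate a partial write: only the first block of the new page reached the
# device, so the remaining three blocks still hold the original page.
torn_page = new_page[:BLOCK_SIZE] + old_page[BLOCK_SIZE:]

old_ok = page_is_consistent(old_page)
new_ok = page_is_consistent(new_page)
torn_ok = page_is_consistent(torn_page)
```

A system that performs such a check can report the error instead of silently using the page. A system that does not perform any check would read the mixed page as valid data, which corresponds to the hidden inconsistency described above.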
The only way to recover the lost data is to retrieve a backup copy of the database or file system from some alternate media, such as magnetic tape. Many customers employ storage subsystem point-in-time copy or remote mirroring to achieve a low Recovery Time Objective (RTO), that is, to get the system back up quickly. Customers also employ storage subsystem remote mirroring to achieve a low Recovery Point Objective (RPO), that is, to minimize the data lost as a result of a failure. If tape or other backup media are required to recover lost data, the RTO and RPO benefits of these technologies are lost.
It is desirable to have a data backup technology that overcomes problems due to mismatching of data page and storage device block sizes in storage subsystems. It is further desirable to have a technology that ensures that the data recorded on a storage device media is consistent. It is further desirable to have a technology that can guarantee a database system is recoverable on a local or remote storage subsystem.