1. Field of Invention
The present invention relates in general to the digital data processing field and, in particular, to block data storage (i.e., data storage organized and accessed via blocks of fixed size). More particularly, the present invention relates to a mechanism for the allocation, organization and utilization of high performance block storage metadata.
2. Background Art
In the latter half of the twentieth century, there began a phenomenon known as the information revolution. While the information revolution is a historical development broader in scope than any one event or machine, no single device has come to represent the information revolution more than the digital electronic computer. The development of computer systems has surely been a revolution. Each year, computer systems grow faster, store more data, and provide more applications to their users.
A modern computer system typically comprises at least one central processing unit (CPU) and supporting hardware, such as communications buses and memory, necessary to store, retrieve and transfer information. It also includes hardware necessary to communicate with the outside world, such as input/output controllers or storage controllers, and devices attached thereto such as keyboards, monitors, tape drives, disk drives, communication lines coupled to a network, etc. The CPU or CPUs are the heart of the system. They execute the instructions which comprise a computer program and direct the operation of the other system components.
The overall speed of a computer system is typically improved by increasing parallelism, and specifically, by employing multiple CPUs (also referred to as processors). The modest cost of individual processors packaged on integrated circuit chips has made multiprocessor systems practical, although such multiple processors add more layers of complexity to a system.
From the standpoint of the computer's hardware, most systems operate in fundamentally the same manner. Processors are capable of performing very simple operations, such as arithmetic, logical comparisons, and movement of data from one location to another. But each operation is performed very quickly. Sophisticated software at multiple levels directs a computer to perform massive numbers of these simple operations, enabling the computer to perform complex tasks. What is perceived by the user as a new or improved capability of a computer system is made possible by performing essentially the same set of very simple operations, using software having enhanced function, along with faster hardware.
The overall value or worth of a computer system depends largely upon how well the computer system stores, manipulates and analyzes data. When a computer system performs these operations, data are typically organized and accessed via blocks of fixed size. For many types of data streams, the term “block” is applied to chunks of the data stream having various fixed sizes. The typical formatting of a magnetic disk (e.g., a hard disk, floppy diskette, etc.), for example, provides a block size of 512 bytes.
FIG. 1 is a schematic diagram illustrating an example data structure for a conventional sequence 100 (referred to as a “page”) of fixed-size blocks 102 (e.g., 512 bytes). Typically, for performance reasons no metadata is associated with any particular one of the blocks 102 or the page 100 unless such metadata is written within the blocks 102 by an application. Metadata is information describing, or instructions regarding, the associated data blocks. Although there has been recognition in the digital data processing field of the need for high performance block storage metadata to enable new applications, such as data integrity protection, attempts to address this need have met with limited success. Two notable attempts to address this need for high performance block storage metadata are Oracle's Hardware Assisted Resilient Data (HARD) architecture and the T10 End-to-End Data Protection architecture.
The T10 End-to-End (ETE) Data Protection architecture is described in various documents of the T10 technical committee of the InterNational Committee for Information Technology Standards (INCITS), such as T10/03-110r0, T10/03-111r0 and T10/03-176r0. As discussed in more detail below, two important drawbacks of the current T10 ETE Data Protection architecture are: 1) no protection is provided against “stale data”; and 2) very limited space is provided for metadata.
FIG. 2 is a schematic diagram illustrating an example data structure for a conventional sequence 200 (referred to as a “page”) of fixed-size blocks 202 in accordance with the current T10 ETE Data Protection architecture. Each fixed-size block 202 includes a data block 210 (e.g., 512 bytes) and a T10 footer 212 (8 bytes). Each T10 footer 212 consists of three fields, i.e., a Ref Tag field 220 (4 bytes), a Meta Tag field 222 (2 bytes), and a Guard field 224 (2 bytes). The Ref Tag field 220 is a four byte value that holds information identifying within some context the particular data block 210 with which that particular Ref Tag field 220 is associated. Typically, the first transmitted Ref Tag field 220 contains the least significant four bytes of the logical block address (LBA) field of the command associated with the data being transmitted. During a multi-block operation, each subsequent Ref Tag field 220 is incremented by one. The Meta Tag field 222 is a two byte value that is typically held fixed within the context of a single command. The Meta Tag field 222 is generally only meaningful to an application. For example, the Meta Tag field 222 may be a value indicating a logical unit number in a Redundant Array of Inexpensive/Independent Disks (RAID) system. The Guard field 224 is a two byte value computed using the data block 210 with which that particular Guard field 224 is associated. Typically, the Guard field 224 contains the cyclic redundancy check (CRC) of the contents of the data block 210 or, alternatively, may be checksum-based.
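The footer layout described above can be sketched as follows. This is a minimal illustration only, not the normative T10 encoding. The field order simply follows the order in which the fields are listed in the text, the big-endian byte order is an assumption, and the CRC-16 polynomial 0x8BB7 is assumed for the Guard field. The helper names (crc16, build_footer, protect, verify) are hypothetical.

```python
import struct

BLOCK_SIZE = 512   # data block 210
FOOTER_SIZE = 8    # T10 footer 212: Ref Tag (4) + Meta Tag (2) + Guard (2)
FOOTER_FMT = ">IHH"  # assumed big-endian packing of the three fields


def crc16(data, poly=0x8BB7):
    """Bitwise CRC-16, MSB first, initial value 0 (polynomial is an assumption)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ poly) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc


def build_footer(data, ref_tag, meta_tag):
    """Pack Ref Tag, Meta Tag and Guard for one 512-byte data block."""
    return struct.pack(FOOTER_FMT, ref_tag & 0xFFFFFFFF, meta_tag & 0xFFFF, crc16(data))


def protect(data, start_lba, meta_tag):
    """Split data into 512-byte blocks and append an 8-byte footer to each.

    The first Ref Tag holds the least significant four bytes of the LBA;
    each subsequent Ref Tag is incremented by one. The Meta Tag is held
    fixed for the whole (multi-block) operation.
    """
    if len(data) % BLOCK_SIZE:
        raise ValueError("data length must be a multiple of the block size")
    blocks = []
    for offset in range(0, len(data), BLOCK_SIZE):
        chunk = data[offset:offset + BLOCK_SIZE]
        ref_tag = start_lba + offset // BLOCK_SIZE
        blocks.append(chunk + build_footer(chunk, ref_tag, meta_tag))
    return blocks


def verify(block, expected_lba):
    """Check the Ref Tag against the expected LBA and the Guard against the data."""
    data, footer = block[:BLOCK_SIZE], block[BLOCK_SIZE:]
    ref_tag, _meta_tag, guard = struct.unpack(FOOTER_FMT, footer)
    return ref_tag == (expected_lba & 0xFFFFFFFF) and guard == crc16(data)
```

In this sketch each protected block is 520 bytes long, and a corrupted data byte causes verify to fail because the recomputed Guard no longer matches the stored one.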
It is important to note that under the current T10 ETE Data Protection architecture, metadata is associated with a particular data block 202 but not the page 200. The T10 metadata that is provided under this approach has limited usefulness. The important drawbacks of the current T10 ETE Data Protection architecture mentioned above (i.e., no protection against “stale data”, and very limited space for metadata) find their origin in the limited usefulness of the metadata that is provided under this scheme. First, the current T10 approach allows only 2 bytes (i.e., counting only the Meta Tag field 222) or, at best, a maximum of 6 bytes (i.e., counting both the Ref Tag field 220 and the Meta Tag field 222) for general purpose metadata space, which is not sufficient for general purposes. Second, the current T10 approach does not protect against a form of data corruption known as “stale data”, which is the previous data in a block after data written over that block was lost, e.g., in transit, from write cache, etc. Since the T10 metadata is within the footer 212, stale data appears valid and is therefore undetectable as corrupted.
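The stale data problem can be illustrated with a short sketch. It assumes a simplified in-memory "disk" (a dictionary keyed by LBA) and uses Python's binascii.crc_hqx (a 16-bit CRC-CCITT) as a stand-in for the Guard computation; the names make_block and footer_ok are hypothetical. The point it shows is that a footer stored with the block travels with the old data, so the old block still verifies after a newer write is lost.

```python
import binascii
import struct

BLOCK_SIZE = 512


def make_block(data, lba):
    """Append an 8-byte footer (Ref Tag, Meta Tag, Guard) to a data block."""
    guard = binascii.crc_hqx(data, 0)  # stand-in 16-bit CRC, not the T10 polynomial
    return data + struct.pack(">IHH", lba & 0xFFFFFFFF, 0, guard)


def footer_ok(block, lba):
    """Per-block check: Ref Tag matches the LBA and Guard matches the data."""
    data = block[:BLOCK_SIZE]
    ref_tag, _meta_tag, guard = struct.unpack(">IHH", block[BLOCK_SIZE:])
    return ref_tag == (lba & 0xFFFFFFFF) and guard == binascii.crc_hqx(data, 0)


disk = {}                                        # hypothetical storage: LBA -> block
disk[42] = make_block(b"A" * BLOCK_SIZE, 42)     # original contents of block 42

new_block = make_block(b"B" * BLOCK_SIZE, 42)    # overwrite is prepared...
write_lost = True                                # ...but is lost, e.g., in transit
if not write_lost:
    disk[42] = new_block

stale = disk[42]
stale_passes = footer_ok(stale, 42)              # old data with its own old footer
is_stale = stale != new_block                    # yet it is not the latest data
```

Here stale_passes is true even though is_stale is also true: nothing inside the block itself reveals that a newer version was supposed to replace it.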
To address the latter one of these drawbacks (i.e., very limited space for metadata), it is known to include an Unguarded Data Field (UDF) (e.g., 4, 8, 12, . . . , 32 bytes) between the data block 210 and the T10 footer 212. Such a UDF is a multiple byte (e.g., four byte multiples) value that indicates identification information. For example, the UDF may be a software stamp to indicate creator, data type, creation and/or last modification date, data path, or other identification information. See, for example, document T10/03-110r1 of the T10 technical committee of the INCITS. Again, it is important to note that under this enhanced version of the current ETE Data Protection architecture, metadata is associated with a particular data block 202 but not the page 200. Moreover, the metadata that is provided under this enhanced scheme does not address the other drawback, i.e., no protection is provided against “stale data”.
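The arithmetic of this extended layout can be sketched as follows. The function name extended_layout is hypothetical, and treating 4 to 32 bytes in four-byte multiples as the only valid UDF lengths is an assumption drawn from the example sizes given above, not a statement of the T10 proposal's actual limits.

```python
DATA_SIZE = 512    # data block 210
FOOTER_SIZE = 8    # T10 footer 212


def extended_layout(udf_len):
    """Return byte offsets for a block laid out as data | UDF | footer."""
    # Assumed constraint: UDF length is a multiple of four between 4 and 32 bytes.
    if udf_len % 4 != 0 or not 4 <= udf_len <= 32:
        raise ValueError("UDF length must be a multiple of 4 between 4 and 32")
    footer_offset = DATA_SIZE + udf_len
    return {
        "udf_offset": DATA_SIZE,           # UDF starts right after the data block
        "footer_offset": footer_offset,    # footer follows the UDF
        "total_size": footer_offset + FOOTER_SIZE,
    }
```

Under these assumptions, even the largest UDF adds only 32 bytes of per-block metadata, and that metadata still belongs to an individual block rather than to the page.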
Oracle's Hardware Assisted Resilient Data (HARD) architecture is a proprietary architecture the sole purpose of which is to protect Oracle databases from undetected data corruption. Two important drawbacks of Oracle's HARD architecture are: 1) it is proprietary; and 2) as discussed in more detail below, it is limited to only Oracle database applications.
Oracle's HARD architecture is applicable only to Oracle database applications. The HARD architecture does not provide a general purpose architecture for block storage metadata and cannot be used for applications beyond protecting Oracle databases. The HARD architecture is implemented in the server at the application level, where protection metadata is contained within standard blocks, not block extensions. An Oracle database application has functionality to add check data to each data block when issuing write requests. This check data is examined and validated at each read operation of the Oracle database application. To prevent corrupted data blocks generated in the database-to-storage system infrastructure from being written onto the storage disk, it is known to integrate Oracle's data integrity checksum algorithms at the microchip and microcode level into storage systems, such as the Hitachi Freedom Storage Lightning 9900 V Series System. This provides additional verification of the Oracle check data on the storage system side. See, for example, David Broom and Rasha Hasaneen, Technical White Paper entitled “Protecting Oracle Data Integrity: Hitachi Database Validator Technology and the Hitachi Freedom Storage Lightning 9900 V Series Systems”, Hitachi Data Systems, January, 2003. {www.hds.com/assets/pdf/wp127—01_dbvalidator.pdf}. Nonetheless, even when Oracle's HARD architecture is enhanced by integrating Oracle's data integrity checksum algorithms into storage systems, the HARD architecture is still applicable only to Oracle database applications.
Therefore, a need exists for an enhanced mechanism for the allocation, organization and utilization of high performance block storage metadata.