Many businesses rely on large-scale data processing systems for storing and processing business data. FIG. 1 illustrates (in block diagram form) relevant components of a data processing system 10. Data processing system 10 and the description thereof should not be considered prior art to the invention described or claimed herein.
Data processing system 10 includes a host node 12 coupled to data storage systems 14–18. The term coupled should not be limited to what is shown within FIG. 1. Two devices (e.g., host node 12 and data storage system 14) may be coupled together directly or indirectly via a third device. Although data storage systems 14 and 16 appear to be coupled in series with host node 12, the present invention should not be limited thereto. Data storage systems 16 and 18 may be coupled in parallel with host node 12.
Data storage systems 14–18 include data memories 20–24, respectively. Alternatively, data memories 20–24 may be included within a single data storage system. Each of the data memories 20–24 may take form in one or more dynamic or static random access memories, one or more arrays of magnetic or optical data storage disks, or combinations thereof. Data memories 20–24 should not be limited to the foregoing hardware components; rather, data memories 20–24 may take form in any hardware, software, or combination of hardware and software in which data may be persistently stored and accessed. Data memories 20–24 may take form in a complex construction of several hardware components operating under the direction of software. The data memories may take form in mirrored hardware. It is further noted that the present invention may find use with many types of redundancy/reliability systems. For example, the present invention may be used with Redundant Array of Independent Disks (RAID) systems. Moreover, the present invention should not be limited to use in connection with the host node of a data storage network. The present invention may find use in a storage switch or in any of many distinct appliances that can be used with a data storage system.
Data memory 20 stores data of a primary data volume. The primary data volume is the working volume of data processing system 10. Data memories 22 and 24 store or may be configured to store data of separate data volumes. For purposes of explanation, data memories 22 and 24 will be described as storing data of first and second data volumes, respectively. The first and second data volumes may be point-in-time (PIT) copies of the primary data volume or modified PIT (MPIT) copies of the primary data volume. A PIT copy, as its name implies, is a copy of the primary data volume created at some point-in-time. The first and second data volumes can be used to backup the primary data volume. The first and second data volumes can also be used to facilitate analysis of primary volume data without modifying data of the primary data volume.
As will be more fully described below, the first and second data volumes can be virtual or real. The first data volume is virtual when some data of the first data volume is stored in memory 20 (or 24). The first data volume is real when all data of the first data volume is stored in memory 22. Likewise, the second data volume is virtual when some data of the second data volume is stored in memory 20 (or 22). The second data volume is real when all data of the second data volume is stored in memory 24. A virtual data volume can be converted to a real data volume via a background data copying process performed by host node 12. In the background copying process, for example, data of the first virtual volume is copied from memory 20 to memory 22 until all data of the first data volume is stored in memory 22.
FIG. 2 represents (in block diagram form) a logical structure of data memories 20–24. Each memory includes nmax memory blocks into which data can be stored. Each memory block shown in FIG. 2 represents one to an arbitrarily large number of regions in physical memory that store data. The physical regions of a block need not be contiguous to each other. However, the physical regions are viewed as logically contiguous by a data management system executing on host node 12. Further, it is noted that any or all of the memories 20–24 may have more than nmax memory blocks. For purposes of explanation, each block of data memory 20 stores data of the primary data volume. For purposes of explanation, nmax memory blocks of memories 22 and 24 can be allocated by host node 12 for storing data of the first and second data volumes, respectively. Corresponding memory blocks in data memories 20–24 can be equal in size. Thus, memory block 1 of data memory 20 can be equal in size to memory block 1 of data memories 22 and 24. Each of the memory blocks within data memory 20 may be equal in size to each other. Alternatively, the memory blocks in data memory 20 may vary in size.
Host node 12 may take form in a computer system (e.g., a server computer system) that processes requests from client computer systems (not shown). To respond to the requests, host node 12 may be required to process data of the primary data volume. Host node 12 generates read or write-data transactions that access memory 20 in response to receiving requests from client computer systems. Host node 12 is also capable of accessing memory 22 or 24 through read or write-data transactions.
Host node 12 includes a data storage management system (not shown) that takes form in software instructions executing on one or more processors (not shown) within host node 12. The data management system may include a file system and a system for managing the distribution of data of a volume across several memory devices. Volume Manager™ provided by VERITAS Software Corporation of Mountain View, Calif. is an exemplary system for managing the distribution of volume data across memory devices. Volume and disk management products from other software companies also provide a system for managing the distribution of volume data across memory devices. Hardware RAID adapter cards and RAID firmware built into computer systems likewise provide this function.
The first and second volumes can be virtual PIT or MPIT copies of the primary data volume. Host node 12 can create a first virtual volume according to the methods described in copending U.S. patent application Ser. No. 10/143,059 entitled “Method and Apparatus for Creating a Virtual Data Copy” which is incorporated herein by reference in its entirety.
When host node 12 creates the first virtual volume, host node 12 creates a pair of valid/modified (VM) maps such as maps 30 and 32 represented in FIG. 3. Maps 30 and 32 correspond to memories 20 and 22, respectively. FIG. 3 also shows a VM map 34 which will be more fully described below. VM maps 30–34 may be persistently stored in memory of host node 12 or elsewhere. VM maps 30 and 32 include nmax entries of two bits each in the embodiment shown. Each entry of VM map 30 corresponds to a respective block of memory 20, while each entry of VM map 32 corresponds to a respective block of data memory 22. In an alternative embodiment, each entry of VM map 30 may correspond to a respective group of blocks in memory 20, while each entry of VM map 32 may correspond to a respective group of blocks in memory 22.
The first and second bits in each entry are designated Vn and Mn, respectively. Vn in each entry, depending on its state, indicates whether its corresponding block n in memory contains valid data. For example, when set to logical 1, V2 of VM map 30 indicates that block 2 of memory 20 contains valid primary volume data, and when set to logical 0, V2 of VM map 30 indicates that block 2 of memory 20 contains no valid primary volume data. It is noted that when Vn is set to logical 0, its corresponding memory block may contain data, but this data is not considered valid. V2 of VM map 32, when set to logical 1, indicates that block 2 of memory 22 contains valid data of the first volume (e.g., the first virtual PIT copy). V2 of VM map 32, when set to logical 0, indicates that block 2 of memory 22 does not contain valid data.
Mn in each entry, depending on its state, indicates whether data within block n of the corresponding memory has been modified. For example, when set to logical 1, M3 of VM map 30 indicates that block 3 of memory 20 contains data that was modified via a write-data transaction since creation of the first virtual volume. When set to logical 0, M3 of VM map 30 indicates that block 3 of memory 20 contains unmodified data. Likewise, M3 in VM map 32, when set to logical 1, indicates that block 3 in memory 22 contains data that was modified via a write-data transaction since creation of the first virtual volume. When set to logical 0, M3 of VM map 32 indicates that block 3 of memory 22 contains unmodified data.
When VM maps 30 and 32 are first created, each entry of VM map 32 is set to logical 0, thus indicating that memory 22 contains no valid or modified data. For purposes of explanation, it is presumed that each block of data memory 20 contains valid data of the primary volume. Accordingly, Vn of each entry in VM map 30 is initially set to logical 1. Lastly, Mn of each entry in VM maps 30 and 32 is initially set to logical 0. Host node 12 can change the state of one or more bits in any map entry using single or separate I/O operations at the memory address that stores the map entry.
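The initial state of VM maps 30 and 32 described above can be sketched as follows (the Python representation, the dictionary keys, and the block count are illustrative only and are not part of the described system):

```python
def create_vm_maps(nmax):
    """Create VM maps 30 and 32 at the time the first virtual
    volume is created.  Each entry holds a valid (V) bit and a
    modified (M) bit for its corresponding memory block."""
    # Map 30 (primary volume): every block is presumed to hold
    # valid, unmodified primary-volume data, so V=1 and M=0.
    map30 = [{"V": 1, "M": 0} for _ in range(nmax)]
    # Map 32 (first virtual volume): memory 22 holds no valid or
    # modified data yet, so every bit starts at logical 0.
    map32 = [{"V": 0, "M": 0} for _ in range(nmax)]
    return map30, map32
```

Each entry is an independent object, so one entry's bits can be changed in isolation, mirroring the single or separate I/O operations described above.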
After VM maps 30 and 32 are initiated, host node 12 may run a background process to copy data of memory 20 to memory 22 on a block-by-block basis (one or more blocks at a time). Eventually, this background process will transform the first virtual volume into a first real volume. However, before the background copying process is started or completed, host node 12 can modify data of the primary data volume or the first virtual volume.
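The background copying process can be sketched as follows (a simplified, single-threaded sketch; the memories are modeled as Python lists, and all names are illustrative):

```python
def background_copy(mem20, mem22, map32):
    """Transform the first virtual volume into a real volume by
    copying each primary-volume block into memory 22.  Blocks whose
    V bit is already set (e.g., copied earlier in response to a
    write-data transaction) are skipped."""
    for n in range(len(mem20)):
        if map32[n]["V"] == 0:
            mem22[n] = mem20[n]   # copy block n of memory 20
            map32[n]["V"] = 1     # block n of memory 22 is now valid
```

Once every V bit of VM map 32 is set to logical 1, all data of the first volume resides in memory 22 and the first volume is real.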
FIG. 4 illustrates relevant operational aspects of modifying data of the primary data volume via a write-data transaction after creation of the first virtual PIT copy thereof. In FIG. 4, step 40, host node 12 generates a write-data transaction for modifying data in block n of memory 20. This write-data transaction can be generated in response to receiving a request from a client computer system. For purposes of explanation, the phrase "modifying data" includes writing new data. In response to generating the write-data transaction, host node 12 accesses VM map 32 to determine whether the data contents of block n in memory 20 have been copied to block n of data memory 22. More particularly, host node 12 accesses VM map 32 to determine whether Vn is set to logical 1 as shown in step 42. Block n of memory 22 will contain valid data (i.e., Vn of VM map 32 is set to logical 1) if the contents of block n in memory 20 were previously copied to block n of memory 22 by the background copying process mentioned above, or in response to a previous write-data transaction to modify data of block n in memory 20. If Vn of VM map 32 is set to logical 0, then the process continues to step 44 where, as shown, host node 12 copies the contents of block n in memory 20 to block n in memory 22. Thereafter, in step 46, host node 12 sets Vn of VM map 32 to logical 1. It is noted that the order of steps 44 and 46 can be reversed in an alternative embodiment. In this alternative embodiment, however, if a crash occurs after the step of setting Vn in VM map 32 to logical 1 but before data of block n in memory 20 is copied to block n of memory 22, then VM map 32 may indicate that block n of data memory 22 contains valid data when, in fact, block n of memory 22 contains no data at all. Host node 12 may be configured to check for and correct such inconsistencies between VM map 32 and data memory 22 when host node 12 recovers from the crash.
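The copy-on-write sequence of FIG. 4 can be sketched as follows, with the steps taken in the order described above (the function name and list-based memories are illustrative only):

```python
def write_primary(n, new_data, mem20, mem22, map30, map32):
    """Modify block n of the primary volume, first preserving the
    original data for the first virtual PIT copy (FIG. 4)."""
    if map32[n]["V"] == 0:       # step 42: block n not yet copied?
        mem22[n] = mem20[n]      # step 44: copy block n to memory 22
        map32[n]["V"] = 1        # step 46: mark the copy valid
    map30[n]["M"] = 1            # step 50: primary block n modified
    mem20[n] = new_data          # apply the write-data transaction
```

Note that the old data of block n reaches memory 22 before the primary copy is overwritten, so the first volume continues to reflect the point-in-time contents of the primary volume.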
After step 46, host node 12 sets Mn in VM map 30 to logical 1 as shown in step 50. Before the process ends, host node 12 modifies data in block n of memory 20 in accordance with the write-data transaction generated in step 40, as shown in step 52. It is noted that the order of steps 50 and 52 may be reversed in an alternative embodiment.
FIG. 5 shows the state of VM maps 30 and 32 of FIG. 3 after execution of two write-data transactions for modifying data of the primary data volume and after data is copied from the first two blocks of memory 20 to respective blocks of memory 22 via the background copying process. FIG. 5 shows that the first volume is still in the virtual state since one or more Vn bits of map 32 are set to logical 0.
As noted above, before the background copying process begins or completes, the first virtual volume can be modified in accordance with write-data transactions generated by host node 12. FIG. 6 illustrates relevant operational aspects of modifying data of the first virtual volume via a write-data transaction. In FIG. 6, host node 12 generates a write-data transaction for modifying data of the first volume. Block n of memory 22 is the location where the data to be modified is stored or should be stored. In response to generating the write-data transaction, host node 12 accesses VM map 32 to determine whether block n of memory 22 contains valid data. If block n of memory 22 does not contain valid data (i.e., Vn of VM map 32 is set to logical 0), then the process proceeds to step 64 where host node 12 copies data of block n of memory 20 to block n of memory 22. Thereafter, in step 66, host node 12 sets Vn of VM map 32 to logical 1. It is noted that steps 64 and 66 can be reversed in order in an alternative embodiment. After Vn is set to logical 1, or in response to host node 12 determining in step 62 that Vn is set to logical 1, host node 12 sets Mn of VM map 32 to logical 1 in step 70. Before the process in FIG. 6 ends, host node 12 modifies data in block n of memory 22 as shown in step 72. It is noted that in an alternative embodiment, steps 70 and 72 may be reversed.
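The FIG. 6 sequence can be sketched in the same illustrative style (names and list-based memories are not part of the described system):

```python
def write_virtual(n, new_data, mem20, mem22, map32):
    """Modify block n of the first virtual volume (FIG. 6)."""
    if map32[n]["V"] == 0:       # step 62: valid data in memory 22?
        mem22[n] = mem20[n]      # step 64: copy block n from memory 20
        map32[n]["V"] = 1        # step 66: mark block n valid
    map32[n]["M"] = 1            # step 70: mark block n modified
    mem22[n] = new_data          # step 72: apply the write
```

The difference from the FIG. 4 flow is that here both the V and M bits of VM map 32 are set, and the write lands in memory 22 while the primary volume is left untouched.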
FIG. 7 illustrates the state of VM maps 30 and 32 of FIG. 5 after execution of at least one write-data transaction for modifying data of the first virtual PIT copy. FIG. 7 shows that the first volume is in a virtual state since one or more Vn bits of VM map 32 are set to logical 0.
The primary data volume can be restored or synchronized to the contents of the first volume in response to host node 12 receiving a restore command. The restore method includes overwriting data of block n in memory 20 with data of block n of memory 22 for each block n of memory 20 that contains data that differs from data in block n of memory 22. U.S. patent application Ser. No. 10/254,753 entitled “Method and Apparatus for Restoring a Corrupted Data Volume” which is incorporated herein by reference in its entirety illustrates one method for restoring a primary data volume to the contents of a virtual volume.
In one embodiment of the restore method, host node 12 adds a third bit Rn to each entry of VM map 30. FIG. 7 shows VM map 30 with Rn bits added thereto. The state of Rn determines whether its corresponding block n in memory 20 is to be restored, i.e., overwritten with data of block n in memory 22. Rn is initially set to logical 0 if Mn in VM map 30 or 32 is set to logical 1, thus indicating that block n in memory 20 is to be overwritten with data of block n of memory 22. The state of Rn is switched from logical 0 to logical 1 after data of block n of memory 20 is restored, i.e., overwritten with data of block n of memory 22. Rn is initially set to logical 1 if Mn in both VM maps 30 and 32 is set to logical 0, thus indicating that block n in memory 20 need not be overwritten with data of block n in memory 22. Host node 12 can access block n via read or write-data transactions after the state of Rn is switched to logical 1. The restore method is nearly instantaneous in that host node 12 can access the primary data volume via read or write-data transactions soon after a restore command is issued. In other words, host node 12 need not wait until all Rn bits are switched to logical 1.
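The Rn bookkeeping described above can be sketched as follows (a sketch of one possible implementation; the function names and list-based memories are illustrative only):

```python
def begin_restore(map30, map32):
    """Initialize the R bits of VM map 30: R=0 marks a block whose
    primary copy must be overwritten (either copy was modified since
    the PIT copy was created); R=1 marks a block already in sync."""
    for e30, e32 in zip(map30, map32):
        differs = e30["M"] == 1 or e32["M"] == 1
        e30["R"] = 0 if differs else 1

def restore_block(n, mem20, mem22, map30):
    """Overwrite primary block n with data of block n in memory 22,
    then switch Rn to logical 1 so the block may be accessed again."""
    if map30[n]["R"] == 0:
        mem20[n] = mem22[n]
        map30[n]["R"] = 1
```

Because each block becomes accessible as soon as its own Rn bit is switched to logical 1, read and write access to the primary volume can resume while the remaining blocks are still being restored.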
Often, it is desirable to sequentially create new virtual PIT copies of the primary data volume during the day. Any one of these virtual PIT copies can be used to restore the primary volume should the primary volume experience data corruption. When the primary data volume is being restored to the contents of one of the virtual PIT copies, host node 12 may start to create a new virtual PIT copy according to its schedule of creating virtual PIT copies. However, before host node 12 can create the new virtual PIT copy, the process to restore the primary data volume to the contents of one of the virtual PIT copies must complete (i.e., all Rn bits of VM map 30 must be set to logical 1). The operation to restore the primary data volume to the state of the virtual copy may take a substantial amount of time, thus delaying the creation of the new virtual PIT copy of the primary volume.