Many businesses use redundant copies to enable fast recovery from loss of data. Still other businesses create a point-in-time (PIT) copy of critical data to guard against corruption or inaccessibility of the critical data. Methods for creating PIT copies are well known in the art. One method involves copying data to a magnetic data storage tape via a tape drive using well known backup data storage systems. Other methods of creating a PIT copy involve copying data to a hard disk. Creating a PIT copy on hard disk is preferable to creating a PIT copy on a magnetic data storage tape since data is more quickly read from a hard disk than from a magnetic data storage tape.
FIG. 1 illustrates relevant components of an exemplary data processing system 10 that creates a PIT copy. More particularly, FIG. 1 shows data processing system 10 having a primary node (e.g., a server computer system) 12 coupled to data storage systems 14 and 16 via data link 18. Data storage systems 14 and 16 include memories 20 and 22, respectively. Each of memories 20 and 22 includes a plurality of hard disks for storing data, it being understood that the term “memory” should not be limited thereto. The hard disks of memory 20 store a data volume V, while the hard disks of memory 22 store a PIT copy of volume V. Primary node 12 can access data in either data volume V or its PIT copy using input/output (IO) transactions transmitted via data link 18.
Primary node 12 may include a data storage management system (not shown) that takes form in software instructions executing on one or more processors (not shown). The data storage management system includes, in one embodiment, a file system, a system for creating data volumes, and a system for managing the distribution of data of a volume across one or more memory devices. FIG. 2 shows (in block diagram form) data volume V and its PIT copy. Data volume V and its PIT copy are each shown having nmax blocks of data. Data of each block is stored in one or more hard disks allocated thereto. Corresponding data blocks in volume V and its PIT copy are equal in size. Thus, for example, data block 1 of volume V is equal in size to data block 1 of the PIT copy. Each of the data blocks in volume V may be equal in size to each other. Alternatively, the data blocks in volume V may vary in size.
As noted above, primary node 12 creates the PIT backup copy of volume V to guard against data corruption or data inaccessibility. When the PIT copy is first created, the hard disks allocated to store the PIT copy may contain no data. Creating a PIT backup copy of volume V is a procedure well known in the art. In essence, the procedure includes primary node 12 copying data from memory allocated to each data block of volume V to memory allocated to store data of a corresponding block of the PIT copy in a block by block process until the entire data content of volume V is copied to the PIT copy. After the PIT copy is created, primary node 12 logs each IO transaction that modifies data of volume V.
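The block by block copy and the subsequent logging described above can be sketched as follows. This is a minimal illustration only, assuming each volume is modeled as an in-memory list with one bytes object per block; the names Volume, create_pit_copy, and write_block are hypothetical and do not appear in the figures.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """Hypothetical stand-in for a volume: one bytes object per data block."""
    blocks: list


def create_pit_copy(volume):
    """Copy each block of `volume` to the corresponding block of a new PIT copy."""
    # The memory allocated to the PIT copy initially contains no data.
    pit = Volume(blocks=[b""] * len(volume.blocks))
    for n, data in enumerate(volume.blocks):
        pit.blocks[n] = bytes(data)
    return pit


def write_block(volume, log, n, data):
    """Apply a modifying IO transaction to `volume` and log it (assumed log format)."""
    log.append((n, data))
    volume.blocks[n] = data


v = Volume(blocks=[b"a", b"b", b"c"])
pit = create_pit_copy(v)          # PIT copy now matches volume V
log = []                          # transactions logged after the PIT copy exists
write_block(v, log, 1, b"B")      # modifies V only; the PIT copy is unchanged
```

In this sketch the log records only the block index and the new data of each write, which is one plausible reading of "logs each IO transaction that modifies data of volume V."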
Subsequent to the creation of the PIT copy, data within volume V may become inadvertently corrupted as a result of erroneous software and/or hardware behavior. Moreover, data within volume V may become inaccessible as a result of software and/or hardware failure. Primary node 12 can use the PIT copy to correct corrupted or inaccessible data in volume V. The process for correcting corrupted or inaccessible data, however, can be lengthy. To illustrate, suppose the hard disk memory space allocated to store data of block 2 of volume V becomes inaccessible as a result of hardware failure sometime after creation of the PIT copy. If primary node 12 attempts to access data of block 2, data storage system 14 will generate an IO error. When the error is generated, primary node 12 might create a third volume (designated R in FIG. 2) using unused hard disk memory spaces, copy the data contents of the PIT copy to volume R in a block by block fashion, and modify the contents of volume R according to the transactions logged by primary node 12. When this process is complete, volume R should be identical in data content to volume V, which means that the memory allocated to block 2 in volume V should contain data identical to the data contained in memory allocated to block 2 of volume R. Memory allocated to block 2 in volume R, however, is accessible. Primary node 12 may then allocate new (and accessible) memory space to block 2 of volume V. Once the new memory is allocated, primary node 12 copies data from memory allocated to block 2 of volume R to the newly allocated memory for block 2 of volume V. After copying data to the newly allocated memory, primary node 12 can access block 2 data using an IO transaction.
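The recovery sequence described above can be sketched as follows. This is a minimal illustration under stated assumptions: volumes are modeled as lists with one bytes object per block, the transaction log is a list of (block index, data) pairs, and an inaccessible block is represented by None. The function names rebuild_volume and repair_block are hypothetical.

```python
def rebuild_volume(pit_blocks, log):
    """Build volume R: copy the PIT copy block by block, then replay logged writes."""
    r_blocks = [bytes(b) for b in pit_blocks]
    for n, data in log:
        r_blocks[n] = data
    return r_blocks


def repair_block(v_blocks, r_blocks, n):
    """Copy block n of volume R into newly allocated memory for block n of volume V."""
    v_blocks[n] = bytes(r_blocks[n])


pit_blocks = [b"a", b"b", b"c"]        # PIT copy of volume V
log = [(1, b"B"), (2, b"C")]           # writes made to V after the PIT copy
v_blocks = [b"a", None, b"C"]          # block 2 (index 1) has become inaccessible

r_blocks = rebuild_volume(pit_blocks, log)
repair_block(v_blocks, r_blocks, 1)    # block 2 of V is accessible again
```

Note that rebuild_volume copies every block of the PIT copy and replays the entire log even though only one block is needed, which illustrates why the process can be lengthy.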