This invention relates to data storage in a computerized storage unit, such as a storage array in a storage area network (SAN). More particularly, the present invention relates to improved management of stored data in the storage unit using alternating "shadow" directories for updating directories in a logical volume in which data accesses are performed primarily to add records to the database.
Current computerized data storage systems typically contain data within logical volumes formed on one or more storage devices or arrays of storage devices. For a logical volume, storage management software typically allocates and manages an amount of storage space that "logically" appears to be a single "volume," file or database, but physically may be several files or "sub" volumes distributed across several storage devices and/or arrays of storage devices.
In addition to the data, the logical volumes also typically contain directories for the data. Typically, a directory hierarchy is used with a root directory and one or more subdirectories arranged in multiple levels. Each directory in the hierarchy is typically contained in a "block" of storage space and includes pointers either to other storage blocks containing additional directories ("directory storage blocks") or to storage blocks containing the data ("data storage blocks"). Therefore, when a data storage block is accessed in the logical volume (e.g. to read data from or write data to the data storage block), it is the directories in the directory storage blocks that point to the location of the data storage block, so software can access the data storage block.
The "active" data and directories for the logical volumes with which the software is operating are typically kept in a main memory, or RAM, of the storage devices or storage arrays. Copies of the data and directories for the logical volumes are kept on the hard drives or other mass storage devices. Whenever data is needed that is not currently in the RAM, the data is copied to the RAM to be used. Periodically, the data and directories in the RAM are stored to the hard drives. Whenever a problem, such as a power failure, causes a loss of the data in the RAM, the data is copied from the hard drives to the RAM and operations resume at the point at which the data was last stored to the hard drives.
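The pointer-following access described above can be illustrated with a minimal sketch. The class and function names are hypothetical, and the block numbers simply reuse the reference numerals of the initial state described later (root directory in block 130, level 2 directory in block 132, level 3 directory in block 134, data record in block 136); this is an illustration under those assumptions, not the actual storage management software.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class DataBlock:
    payload: bytes


@dataclass
class DirectoryBlock:
    # Each pointer is the block number of a child directory or data block.
    pointers: List[int] = field(default_factory=list)


Volume = Dict[int, Union[DirectoryBlock, DataBlock]]


def find_data_blocks(volume: Volume, block_no: int) -> List[int]:
    """Follow directory pointers from block_no down to the data blocks."""
    block = volume[block_no]
    if isinstance(block, DataBlock):
        return [block_no]
    found: List[int] = []
    for child in block.pointers:
        found.extend(find_data_blocks(volume, child))
    return found


# Hypothetical volume: root -> level 2 -> level 3 -> one data record.
volume: Volume = {
    130: DirectoryBlock([132]),
    132: DirectoryBlock([134]),
    134: DirectoryBlock([136]),
    136: DataBlock(b"data record 114"),
}
reachable = find_data_blocks(volume, 130)
```

Starting from the root directory block, the only way to reach the data storage block is through the chain of directory storage blocks, which is why the directories must stay coherent with the data.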
When data is added to or updated in a data storage block in the logical volume, one or more of the directories in the logical volume must be updated and/or new directories must be created to include pointers to the new data storage block. Subsequently, the updated and/or new directories and the new data storage block are stored to the hard drives. There is the potential of losing some data, or data coherency, in the logical volume if the data and/or directories are being updated or stored to the hard drives at the moment that a problem (e.g. a power failure) occurs. Therefore, updates and changes are typically not made directly to the existing directory and data storage blocks on the hard drives, so the existing information will not be lost or corrupted.
One technique to prevent loss of information involves allocating new directory storage blocks for the affected directories and storing the currently active directories from the RAM to the new directory storage blocks ("shadow directories"), instead of to the existing directory storage blocks. During the storing of the directories from the RAM to the shadow directories on the hard drives, the previously existing directories on the hard drives are still considered the most recently stored directories for the purpose of restoring the data and directories in the RAM in the case of loss of data in the RAM. Therefore, if a problem results in loss of data in the RAM while the shadow directories are being stored, the previously existing most recently stored directories are used to restore the data in the RAM without loss of data coherency.
The highest level directory, or "root" directory, in the directory hierarchy is typically stored last, after the data and lower level directories have been stored to the hard drives. The new root directory includes the time at which it was stored, so the software can determine which root directory is the most recently stored root directory for the purpose of restoring the data and directories, if needed. Therefore, the act of storing the new root directory effectively "activates" the new root directory and all of the lower level directories linked thereto and the new data in a transition that takes such a short time that the likelihood of the occurrence of a problem is very low.
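The root-last ordering and the timestamp-based selection described above can be sketched as follows. All names, the dictionary standing in for the hard drives, the small block numbers, and the integer store times are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Root:
    stored_at: int  # hypothetical store time; larger means more recent
    child: int      # block number of the level 2 directory it points to


def store(disk: Dict[int, object], lower: Dict[int, object],
          root_block: int, root: Root, fail_before_root: bool = False) -> None:
    """Store data and lower level directories first, the root directory last."""
    disk.update(lower)
    if fail_before_root:
        return  # simulated problem (e.g. power failure) before the root is stored
    disk[root_block] = root


def recover_root(disk: Dict[int, object]) -> Optional[Root]:
    """Pick the most recently stored root directory found on the disk."""
    roots = [b for b in disk.values() if isinstance(b, Root)]
    return max(roots, key=lambda r: r.stored_at, default=None)


disk: Dict[int, object] = {}
store(disk, {1: "level 2, version 1"}, 0, Root(stored_at=1, child=1))

# The shadow level 2 directory reaches the disk, but the new root does not.
store(disk, {3: "level 2, version 2"}, 2, Root(stored_at=2, child=3),
      fail_before_root=True)
after_failure = recover_root(disk)

# The same update completes; storing the root activates the shadow directory.
store(disk, {3: "level 2, version 2"}, 2, Root(stored_at=2, child=3))
after_success = recover_root(disk)
```

In this sketch the interrupted update leaves the old root in charge, so the half-written shadow directory is simply never reached; only the single final write of the new root switches the hierarchy over.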
An exemplary directory hierarchy 100 for a logical volume 102 that is updated with a shadow directory technique is shown in FIGS. 1 and 2. In FIG. 1, the state of the directory hierarchy 100 on the hard drives (not shown) is shown in a progression through five different states 104, 106, 108, 110 and 112 as the logical volume 102 (FIG. 2) is stored from the RAM (not shown) to the hard drives each time that a data record 114, 116, 118, 120 and 122 is added to the logical volume 102. The directory hierarchy 100 is shown as having three directory levels 124, 126 and 128. The logical volume 102 is shown as having 18 storage blocks 130-164 for directories or data.
The data record 114 is the first data record to be written to the logical volume 102 (FIG. 2), resulting in the creation of initial root and level 2 and 3 directories 166, 168 and 170 in the directory hierarchy 100 (see state 104). The directories 166, 168 and 170 and the data record 114 are stored from the RAM (not shown) to the hard drives (not shown) without shadow directories, since these directories 166, 168 and 170 are the initial directories. For states 106-110, the data records 116, 118 and 120 are added to the logical volume 102 at different levels in the directory hierarchy 100. The state 112 results from replacing one of the previously added data records (data record 118) with data record 122 and "wrapping around" data storage from the last storage block 164 to the first available storage block 130.
For each of the states 106-112 that follow the initial state 104, one or more of the current directories in the RAM (not shown), including the current root directory, are stored from the RAM to shadow directories on the hard drives (not shown), so the updates due to each added data record 116-122 can occur to the shadow directories, while the previously existing directories are still considered the most recently stored directories on the hard drives. Additionally, in some cases, new directories are added to the logical volume 102 (FIG. 2).
For example, for state 106, the data record 116 is added to the logical volume 102 (FIG. 2) at the level 3 directory 170 in the RAM (not shown), so the initial root and level 2 and 3 directories 166, 168 and 170 are updated in the RAM. When it is time to store the initial root and level 2 and 3 directories 166, 168 and 170 to the hard drive (not shown), they are stored to shadow root and level 2 and 3 directories 172, 174 and 176, respectively, on the hard drive. Additionally, the data record 116 is added to the logical volume 102 on the hard drive. The shadow level 3 directory 176 includes a pointer to the data record 116. The shadow level 2 directory 174 includes a pointer to the shadow level 3 directory 176, and the shadow root directory 172 includes a pointer to the shadow level 2 directory 174.
After the updated root directory is stored from the RAM (not shown) to the shadow root directory 172 on the hard drive (not shown), the shadow root directory 172 becomes the current most recently stored root directory, effectively "activating" the level 2 and 3 directories 174 and 176 and "removing" the initial root and level 2 and 3 directories 166, 168 and 170 from the directory hierarchy 100.
For state 108, the data record 118 is added to the logical volume 102 (FIG. 2) at the level 2 directory 174 in the RAM (not shown). Thus, the root and level 2 directories 172 and 174 are updated in the RAM and a new level 3 directory 182 is added to the directory hierarchy 100 in the RAM. Upon storing the updated root and level 2 directories 172 and 174 and the new level 3 directory 182 from the RAM to the hard drives (not shown), the root and level 2 directories 172 and 174 are stored to shadow root and level 2 directories 178 and 180, respectively. The data record 118 is also added to the logical volume 102 on the hard drives. After the updated root directory 172 is stored from the RAM to the shadow root directory 178 on the hard drives, the shadow root directory 178 becomes the currently active root directory, effectively activating the level 2 and 3 directories 180 and 182 and removing the previous root and level 2 directories 172 and 174 from the directory hierarchy 100.
For state 110, the data record 120 is added to the logical volume 102 (FIG. 2) at the root directory 178 in the RAM (not shown). Thus, the root directory 178 is updated in the RAM and new level 2 and 3 directories 186 and 188 are added to the directory hierarchy 100 in the RAM. Upon storing the updated root directory 178 and new level 2 and 3 directories 186 and 188 from the RAM to the hard drives (not shown), the root directory 178 is stored to shadow root directory 184. After the shadow root directory 184 has been stored to the hard drives, the shadow root directory 184 becomes the currently active root directory, effectively activating the new level 2 and 3 directories 186 and 188 and removing the previous root directory 178 from the directory hierarchy 100.
For state 112, the data record 122 replaces the data record 118, so the root directory 184 and level 2 and 3 directories 180 and 182 are updated in the RAM (not shown). When it is time to store the updated root directory 184 and level 2 and 3 directories 180 and 182 from the RAM to the hard drives (not shown), the root directory 184 and level 2 and 3 directories 180 and 182 are stored to shadow root and level 2 and 3 directories 190, 192 and 194, respectively, on the hard drives. After the shadow root directory 190 has been stored, the shadow root directory 190 becomes the currently active root directory, activating the level 2 and 3 directories 192 and 194 and removing the previous root directory 184, level 2 and 3 directories 180 and 182 and the data record 118.
Referring to FIG. 2, at state 104, the first four storage blocks 130-136 of the logical volume 102 have been filled with the initial root and level 2 and 3 directories 166-170 and the data record 114, respectively. At each succeeding state 106-112 the added data and the shadow and new directories are typically created in the next available storage blocks 130-164, and the storage blocks 130-164 previously occupied by directories or data that have been replaced or removed are freed up as available storage blocks in the logical volume 102 on the hard drives (not shown). Additionally, when the last storage block (e.g. 164) has been filled, data storage and directory creation typically "wrap around" to the first available freed-up storage block 130 in the logical volume 102.
The net result of the additions and deletions of the directories and data records is that the logical volume 102 at state 112 on the hard drives (not shown) has several available storage blocks 134, 138, 140 and 146-154 that are not all contiguous. To handle such noncontiguous storage blocks, storage management techniques must keep track of the storage blocks on a block-by-block basis, which can be complicated for large storage systems and which requires considerable storage space to contain the tracking information and considerable processing time to perform the techniques.
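The fragmentation at state 112 can be made concrete with a short sketch that groups the available storage blocks listed above into contiguous runs. It assumes, purely for illustration, that the even reference numerals 130-164 stand for physically adjacent storage blocks; the function name is hypothetical.

```python
from typing import List, Tuple

BLOCK_STEP = 2  # assumption: reference numerals 130, 132, ..., 164 are adjacent blocks


def free_runs(free_blocks: List[int], step: int = BLOCK_STEP) -> List[Tuple[int, int]]:
    """Group free block numbers into (first, last) runs of adjacent blocks."""
    runs: List[Tuple[int, int]] = []
    for block in sorted(free_blocks):
        if runs and block == runs[-1][1] + step:
            runs[-1] = (runs[-1][0], block)  # extend the current run
        else:
            runs.append((block, block))      # start a new run
    return runs


# Available storage blocks at state 112, as listed in the text.
free_at_state_112 = [134, 138, 140, 146, 148, 150, 152, 154]
runs = free_runs(free_at_state_112)
```

Under that assumption the eight available blocks fall into three separate runs, so the storage manager cannot describe the free space with a single position and has to keep per-block (or per-run) bookkeeping.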
It is with respect to these and other background considerations that the present invention has evolved.
An improvement of the present invention is that storage management is simplified in a storage system for which data accesses to a logical volume are primarily for adding data to the logical volume. The aforementioned patent application includes an example of such a storage system for which data accesses to the logical volume (e.g. the snapshot and checkpoint volumes and the point-in-time images described therein) are primarily for adding data to the logical volume. Since the data accesses primarily add data, the data storage blocks in the logical volume rarely, if ever, have to be changed, so the storage management issues regarding noncontiguous storage blocks, resulting from the "freeing up" of storage blocks as described in the background, do not occur for the data storage blocks. Additionally, the present invention includes an improved shadow directory technique that does not result in freeing up noncontiguous directory storage blocks on the storage devices (e.g. hard drives, etc.). Therefore, the storage management complexities required for handling noncontiguous storage blocks (data or directory storage blocks), which resulted from the prior art shadow directory techniques, are eliminated in the present invention. In other words, the improved shadow directory technique enables simplified, more efficient and faster storage management, which uses less storage space and processing time in a "primarily data-add environment."
The improved shadow directory technique involves alternating shadow directories between two established directory storage blocks for each directory in the logical volume. Directory storage blocks for every directory are established, or allocated, in pairs, preferably contiguous, and neither directory storage block is ever freed up. Instead, the shadow directory for a current most recently stored directory contained in one of the paired directory storage blocks is preferably always formed in the other one of the paired directory storage blocks. After the data is written and the shadow directories are updated and stored to the storage devices, the shadow directories become the current most recently stored directories. In each pair of directory storage blocks, the directory storage block for the previous, or outdated, directory is maintained as allocated for the directory and reused for the next shadow directory, so there are no resulting freed-up noncontiguous storage spaces.
Since the directory storage blocks and the data storage blocks are never freed up, the storage blocks in the storage device are preferably always allocated sequentially from the first storage block of the logical volume to the last. In other words, the next contiguous storage block after the most recently allocated storage block is always the next storage block to be allocated for a directory or for data in the logical volume on the storage device. Therefore, in order to locate an available storage block to be allocated, the storage management software has to keep track of only the next available storage block instead of all available storage blocks, since all previous storage blocks are known to be allocated and all subsequent storage blocks are known to be available.
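The alternation between the two paired directory storage blocks can be sketched as below. The class, its fields, and the block numbers are hypothetical, and the sketch simplifies matters by switching the current copy as soon as one directory is stored, whereas in the technique described above the shadow directories become current only once the whole update has been stored.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class DirectoryPair:
    blocks: Tuple[int, int]  # the two permanently allocated directory storage blocks
    current: int = 1         # index of the current copy; starts at 1 so the
                             # first store lands in blocks[0]

    def current_block(self) -> int:
        return self.blocks[self.current]

    def shadow_block(self) -> int:
        # The shadow is always the other block of the pair.
        return self.blocks[1 - self.current]

    def commit(self) -> None:
        # The shadow becomes current; the outdated block becomes the next shadow.
        self.current = 1 - self.current


def store_directory(disk: Dict[int, str], pair: DirectoryPair, contents: str) -> None:
    """Write the directory to the shadow block, then make it the current copy."""
    disk[pair.shadow_block()] = contents
    pair.commit()


disk: Dict[int, str] = {}
pair = DirectoryPair(blocks=(10, 11))  # hypothetical contiguous pair
for version in ("v1", "v2", "v3"):
    store_directory(disk, pair, version)
```

After three stores the directory has only ever occupied its own two blocks: the newest version sits in one, the previous version in the other, and no block has been freed or newly allocated.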
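Because nothing is ever freed, the allocator described above needs only a single piece of state. The following is a minimal sketch of such a sequential allocator; the class name, the block numbering from 0, and the use of an exception when the volume is full are all assumptions made for illustration.

```python
class SequentialAllocator:
    """Tracks only the next available storage block of a logical volume."""

    def __init__(self, first_block: int, last_block: int) -> None:
        self.next_block = first_block
        self.last_block = last_block

    def allocate(self, count: int = 1) -> int:
        """Allocate `count` contiguous blocks and return the first block number."""
        if self.next_block + count - 1 > self.last_block:
            raise MemoryError("logical volume is full")
        start = self.next_block
        self.next_block += count  # every earlier block is now known to be allocated
        return start


allocator = SequentialAllocator(first_block=0, last_block=17)
root_pair_start = allocator.allocate(2)  # a contiguous pair for one directory
data_block = allocator.allocate()        # a single data storage block
```

Allocating a directory pair and then a data block simply advances one counter; there is no free list or bitmap to search or maintain.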