The present invention is directed to a method and apparatus for identifying accesses to a repository of logical objects (e.g., a file system or database) stored on a data storage system by examining information relating to accesses to physical storage locations in the data storage system.
Computer systems typically include one or more processing devices, as well as one or more data storage devices. FIG. 1 is a block diagram of a typical computer system 100, which includes a host computer 110, having a processor 120 and a memory 130, and a storage system 140. The storage system 140 can include any of a number of different types of storage devices (e.g., tape storage devices, floppy diskette storage devices, disk drive storage devices, etc.), or a combination of a number of different types of storage devices.
Application programs for the host computer 110 typically execute on the processor 120 and operate on logical objects (e.g., files, etc.) that are visible to the application programs, and that each includes one or more logically related blocks of data. The logically related blocks of data forming each logical object are physically stored in the storage system 140. Thus, as shown in FIG. 2, a typical computer system 100 can be viewed as having a number of hierarchical spaces or layers, including an application space 310, a physical space 330, and a mapping layer 320 disposed therebetween. As mentioned above, application programs executing on the host computer 110 operate on logical objects (e.g., files) in application space 310. The data forming the logical objects is stored on one or more storage devices 341-343 that are included in the storage system 140 and define the physical space 330. The data stored in the storage system 140 typically is organized in units of storage termed “physical blocks” that each includes a number of bytes of data (e.g., 512 bytes). Conversely, the logical objects operated upon in application space 310 are made up of “logical blocks”. The mapping layer 320 typically is a data structure that maps the logical objects in application space 310 into physical space 330. Although the size of a logical block of data may correspond one-to-one to that of a physical block stored in physical space 330, this is not necessarily the case. Rather, one logical block of data can map to two or more physical blocks of data, or alternatively, multiple logical blocks of data can map to a single physical block of data in physical space 330.
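The relationship between logical block size and physical block size can be illustrated with a little arithmetic. The following is a minimal sketch only, assuming (purely for illustration) that a logical object is laid out contiguously from a known byte offset; the function name physical_blocks_for and its parameters are hypothetical and are not taken from the description above.

```python
def physical_blocks_for(logical_index, logical_size, physical_size=512, start_offset=0):
    """Return the physical block numbers covered by one logical block.

    Assumption (illustrative only): logical blocks are stored contiguously,
    starting at byte `start_offset` of the physical space.
    """
    first_byte = start_offset + logical_index * logical_size
    last_byte = first_byte + logical_size - 1
    first_block = first_byte // physical_size
    last_block = last_byte // physical_size
    return list(range(first_block, last_block + 1))


# One 1024-byte logical block spans two 512-byte physical blocks.
print(physical_blocks_for(3, 1024))  # [6, 7]

# Four 128-byte logical blocks share a single 512-byte physical block.
print([physical_blocks_for(i, 128) for i in range(4)])  # [[0], [0], [0], [0]]
```

In a real system the layout generally is not contiguous, which is why the mapping layer 320 is described as a data structure rather than a formula.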
The storage system 140 presents logical volumes of storage to the host computer 110. These logical volumes of storage can each correspond to one of the physical storage devices 341-343 included within the storage system 140. However, when the storage system 140 is an intelligent storage system, it may include a layer of mapping, within the physical space 330, between the logical volumes presented to the host computer 110 and the actual physical storage devices 341-343. Thus, there need not be a one-to-one correspondence between the logical volumes presented to the host computer 110 and the physical storage devices, as a single logical volume can be spread across multiple physical storage devices, or alternatively, a number of logical volumes can be stored on a single physical storage device.
The mapping layer 320 maps each logical object specified in application space 310 to one or more unique locations (e.g., physical blocks) in physical space 330 where the data forming the logical object is stored. The mapping layer 320 can include a single layer of mapping, such as a file system 322 or a Logical Volume Manager (LVM) 324, or as shown in FIG. 2, can include multiple mapping layers 322 and 324. LVMs typically are used in larger computer systems having a number of storage devices, and enable volumes of data storage to be managed at a logical (rather than physical) level. The presence or absence of the LVM 324 is transparent to both the application space 310 and the file system 322. In this respect, the file system simply maps from the application space 310 to what the file system perceives to be the physical space 330. If another layer of mapping, such as an LVM, is included in the mapping layer 320, it simply means that the result of the mapping done in the file system does not indicate the final mapping to the physical space 330.
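The layering described above can be sketched as two lookup tables applied in sequence. This is a hedged illustration, not the mapping format of any particular file system or LVM: the table contents, the volume names "lv0" and "lv1", the device names "dev341" through "dev343", and the function resolve are all hypothetical.

```python
# Hypothetical file system table: file name -> list of (volume, block) locations.
# From the file system's point of view these locations are "physical".
FILE_SYSTEM_MAP = {
    "report.txt": [("lv0", 0), ("lv0", 1)],
    "data.db": [("lv0", 2), ("lv1", 0)],
}

# Hypothetical LVM table: (logical volume, block) -> (storage device, block).
LVM_MAP = {
    ("lv0", 0): ("dev341", 100),
    ("lv0", 1): ("dev341", 101),
    ("lv0", 2): ("dev342", 7),
    ("lv1", 0): ("dev343", 42),
}


def resolve(name, fs_map=FILE_SYSTEM_MAP, lvm_map=None):
    """Map a logical object to storage locations through one or two layers."""
    # First layer: the file system maps the object to what it perceives
    # to be the physical space.
    locations = list(fs_map[name])
    if lvm_map is None:
        # No LVM present: the file system result is the final mapping.
        return locations
    # Second layer: the LVM translates each location once more.
    return [lvm_map[location] for location in locations]


print(resolve("data.db", lvm_map=LVM_MAP))  # [('dev342', 7), ('dev343', 42)]
```

The file system table is identical whether or not lvm_map is supplied, which mirrors the statement that the presence or absence of the LVM is transparent to the file system.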
In a typical computer system, the storage system 140 has no understanding of the logical relationship between the blocks of data that it stores in physical space 330. This is true because the logical grouping of data is done in the application space 310 and the mapping layer 320, and is not passed to the storage system 140. Similarly, in a typical computer system, an application program executing in application space 310 has no understanding of where the data that forms a particular logical object is stored in physical space 330.
In many computer systems, sets of logical objects (e.g., files) are organized at a higher logical level, such that the computer system includes one or more repositories of logical objects. Examples of such repositories include a file system and a database, although other repositories of logical objects are also possible. Such repositories are to be distinguished from a single logical object, which may be made up of multiple logical blocks of storage, but which comprises only a single logical object that is visible to application programs executing in application space.
One illustrative embodiment of the invention is directed to a method for use in a computer system, the computer system including a host computer having an application space and defining a repository of logical objects visible to the application space, the computer system further including a storage system that defines a physical space wherein data representing the repository of logical objects is stored, the repository of logical objects including a plurality of logical objects. The method comprises acts of: (A) mapping the repository of logical objects from the application space to the physical space to create mapping information identifying which units of storage in the physical space store the repository of logical objects; and (B) making the mapping information visible to the application space.
A further illustrative embodiment of the invention is directed to a computer readable medium encoded with a program for execution on a computer system including a host computer and a storage system, the host computer having an application space and defining a repository of logical objects visible to the application space, the storage system defining a physical space wherein data representing the repository of logical objects is stored, the repository of logical objects including a plurality of logical objects. The program, when executed on the computer system, performs a method comprising acts of: (A) mapping the repository of logical objects from the application space to the physical space to create mapping information identifying which units of storage in the physical space store the repository of logical objects; and (B) making the mapping information visible to the application space.
Another illustrative embodiment of the invention is directed to a method for use in a computer system, the computer system including a host computer and a storage system, the host computer having an application space and defining a repository of logical objects visible to the application space, the repository of logical objects including a plurality of logical objects. The method comprises acts of: (A) executing an operation on the repository of logical objects; and (B) subsequent to the act (A), executing an incremental operation on the repository of logical objects, such that the incremental operation is performed only on those of the plurality of logical objects that have changed subsequent to the execution of the operation in the act (A).
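The two acts recited above can be sketched as follows. This is a minimal illustration under stated assumptions: the passage does not say how changed objects are detected, so the sketch assumes a simple per-object change stamp drawn from a counter. The class Repository and its methods are hypothetical names introduced only for this example.

```python
class Repository:
    """Toy in-memory repository of logical objects (illustrative only)."""

    def __init__(self):
        self._clock = 0      # assumed change counter, bumped on every write
        self._objects = {}   # name -> (data, change stamp)

    def write(self, name, data):
        self._clock += 1
        self._objects[name] = (data, self._clock)

    def full_operation(self, operate):
        """Act (A): apply `operate` to every object; return a baseline mark."""
        for name, (data, _stamp) in self._objects.items():
            operate(name, data)
        return self._clock

    def incremental_operation(self, operate, since):
        """Act (B): apply `operate` only to objects changed after `since`."""
        changed = [name for name, (_data, stamp) in self._objects.items()
                   if stamp > since]
        for name in changed:
            operate(name, self._objects[name][0])
        return changed


repo = Repository()
repo.write("a", b"1")
repo.write("b", b"2")

backup = {}
mark = repo.full_operation(backup.__setitem__)   # e.g., a full backup

repo.write("b", b"3")                            # only "b" changes afterwards
print(repo.incremental_operation(backup.__setitem__, mark))  # ['b']
```

Here the operation is a copy into a dictionary standing in for a backup; any other operation that accepts an object name and its data could be substituted.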
A further illustrative embodiment of the invention is directed to a computer readable medium encoded with a program for execution on a computer system including a host computer and a storage system, the host computer having an application space and defining a repository of logical objects visible to the application space, the repository of logical objects including a plurality of logical objects. The program, when executed on the computer system, performs a method comprising acts of: (A) executing an operation on the repository of logical objects; and (B) subsequent to the act (A), executing an incremental operation on the repository of logical objects, such that the incremental operation is performed only on those of the plurality of logical objects that have changed subsequent to the execution of the operation in the act (A).
Another illustrative embodiment of the invention is directed to an apparatus for use in a computer system including a host computer and a storage system, the host computer having an application space and defining a repository of logical objects visible to the application space, the storage system defining a physical space wherein data representing the repository of logical objects is stored, the repository of logical objects including a plurality of logical objects. The apparatus comprises at least one controller, coupled to the host computer and the storage system, that maps the repository of logical objects from the application space to the physical space to create mapping information identifying which units of storage in the physical space store the repository of logical objects, the at least one controller making the mapping information visible to the application space on the host computer.
A further illustrative embodiment of the invention is directed to a storage system for use in a computer system including the storage system and a host computer, the host computer having an application space and defining a repository of logical objects visible to the application space, the repository of logical objects including a plurality of logical objects. The storage system comprises at least one storage device that defines a physical space wherein data representing the repository of logical objects is stored, the at least one storage device further storing access information identifying accesses to units of storage in the physical space. The storage system further comprises at least one controller that identifies to the host computer accesses to the repository of logical objects based upon the access information identifying accesses to the corresponding units of storage in physical space that store the repository of logical objects.
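The identification of accesses described above amounts to intersecting two pieces of information: which units of physical storage hold each logical object, and which units of physical storage were accessed. The following sketch assumes, for illustration only, that the mapping information is a dictionary of block numbers per object and that the access information is a collection of accessed block numbers; the function name accessed_objects and the sample data are hypothetical.

```python
def accessed_objects(mapping_info, accessed_blocks):
    """Return the logical objects that store data in any accessed block.

    mapping_info:    object name -> physical block numbers holding its data
    accessed_blocks: physical block numbers flagged by the access information
    """
    accessed = set(accessed_blocks)
    return sorted(name for name, blocks in mapping_info.items()
                  if accessed.intersection(blocks))


# Hypothetical mapping information for a three-object repository.
mapping_info = {
    "a.txt": [10, 11],
    "b.txt": [12],
    "c.txt": [20, 21, 22],
}

# Hypothetical access information: physical blocks 11 and 22 were accessed.
print(accessed_objects(mapping_info, [11, 22]))  # ['a.txt', 'c.txt']
```

A block-level record identifies an object as accessed if any of its blocks was touched, so the result is conservative at the granularity of a physical block.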
Another illustrative embodiment of the invention is directed to a method for use in a computer system, the computer system including a host computer having an application space and defining a repository of logical objects visible to the application space, the repository of logical objects including a plurality of logical objects, the computer system further including a first storage system and a second storage system each coupled to the host computer via a network, the repository of logical objects being stored on the first storage system. The method comprises acts of: (A) determining that a subset of the plurality of logical objects in the repository satisfies a particular selection criterion; and (B) transferring the subset of the plurality of logical objects, but not the entire repository of logical objects, over the network from the first storage system to the second storage system.
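The selective transfer recited above can be sketched as a filter followed by a copy. This is an illustration only: two dictionaries stand in for the first and second storage systems, no actual network transfer is performed, and the function transfer_subset and the file-name criterion are hypothetical choices, since the passage does not specify a particular selection criterion.

```python
def transfer_subset(first_storage, second_storage, criterion):
    """Copy only the objects satisfying `criterion` to the second storage."""
    # Act (A): determine which logical objects satisfy the criterion.
    selected = [name for name, data in first_storage.items()
                if criterion(name, data)]
    # Act (B): transfer the selected objects, not the whole repository.
    for name in selected:
        second_storage[name] = first_storage[name]
    return selected


first = {"a.log": b"x", "b.txt": b"y", "c.log": b"z"}
second = {}

# Hypothetical criterion: select objects whose names end in ".log".
moved = transfer_subset(first, second, lambda name, data: name.endswith(".log"))
print(moved)  # ['a.log', 'c.log']
```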