Since its advent, the model of a standalone personal computer with removable storage media has had a great effect on the computer industry and has influenced the design of computer system architectures and infrastructures. However, advances in storage solutions and complex computer systems have been occurring rapidly since the time of the first standalone computers. The following are all examples of the increased functionality that networked computer environments have evolved to possess: continued development of ever smaller integrated circuits and semiconductor chips capable of storing ever increasing quantities of data; increased bandwidth and data transfer rates possible with today's computer networks; and increased utilization of server computers in a network in connection with other computers, databases, applications and storage components of all types.
As a consequence, traditional computing and storage techniques and models have been challenged. The widespread use of removable storage media, for example, has been challenged by the ability to remotely store files efficiently and inexpensively. Furthermore, as computer systems have evolved, so have the availability and configuration of data storage devices, such as magnetic or optical disks. For example, these storage devices can be connected to the computer system via a bus, or they can be connected to the computer system via a wired or wireless network. In addition, the storage devices can be separate or co-located in a single cabinet.
As background, a storage volume is a software abstraction of the underlying storage devices and is the smallest self-contained unit of storage mounted by an operating system and administered by the file system. Storage volumes abstract the physical topology of their associated storage devices and may be a fraction of a disk, a whole disk or even multiple disks that are bound into a virtually contiguous range of logical blocks. This binding may increase the fault tolerance, performance, or capacity characteristics of the underlying devices. In today's complex computer system environments, storage volumes can be a diverse set of elements for which efficient and effective management is desirable. A file server for a computer system capable of diverse storage operations maintains and keeps track of data relationships and locations for stored objects, so that common techniques for data storage and transfer may be employed.
Volumes are constructed from one or more extents that are contiguous storage address spaces presented by the underlying storage devices. An extent is typically characterized by the size of the address space and a starting offset for the address space from a base of the media. Volume mapping is the process of mapping contiguous address space presented by the volume onto the non-contiguous storage address spaces of the underlying extents. Volume mappings are either implemented on a specialized hardware controller, referred to as a hardware volume provider, or in software by a software volume provider. By way of further background, a technique for common administration and management of volume providers is provided in commonly assigned copending application Ser. No. 09/449,577, entitled “Administration of RAID Storage Volumes.”
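By way of illustration only, the volume mapping described above can be sketched as a lookup from a logical block number of the volume to a device and physical block of an underlying extent. The sketch assumes simple concatenation of extents; the names `Extent`, `Volume` and `map_block`, the device names, and the use of blocks as the unit are hypothetical and are not taken from any particular volume provider.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    device: str  # hypothetical device name
    offset: int  # starting offset from the base of the media, in blocks
    size: int    # size of the extent's address space, in blocks


class Volume:
    """Concatenates extents into one contiguous logical block range."""

    def __init__(self, extents):
        self.extents = list(extents)
        # starts[i] is the first logical block served by extents[i].
        self.starts = []
        total = 0
        for ext in self.extents:
            self.starts.append(total)
            total += ext.size
        self.size = total

    def map_block(self, logical_block):
        """Return (device, physical_block) for a logical block number."""
        if not 0 <= logical_block < self.size:
            raise IndexError("logical block outside the volume")
        # Find the last extent whose first logical block is <= logical_block.
        i = bisect_right(self.starts, logical_block) - 1
        ext = self.extents[i]
        return ext.device, ext.offset + (logical_block - self.starts[i])


# Two non-contiguous extents presented as one 80-block volume.
volume = Volume([Extent("disk0", 100, 50), Extent("disk1", 0, 30)])
```

In this sketch, logical blocks 0 through 49 resolve to `disk0` beginning at physical block 100, and logical blocks 50 through 79 resolve to `disk1` beginning at physical block 0. A striped or mirrored mapping would use a different lookup rule.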
Advances in storage techniques are thus changing the ways in which data can be stored or transferred, thereby placing a strain on the traditional management of files within and between volumes. For instance, files with arbitrary growth criteria, volumes with memory allocation limits, distributed storage and data transferring, and the like challenge the notion of a standalone computer's fixed on-disk memory allocations and management. The shifting of data from fast, volatile memory to remote, robust storage is quite advantageous for certain objects. Thus, advances in networks and computer system models have greater ramifications than simply a resulting change in the types of storage components being utilized and in the connections being used between the storage components.
Previously implemented fixed or inflexible memory allocations for volumes do not begin to tap into the efficiencies that may be gained from a robust mechanism for transferring and storing data among a plurality of volumes in a networked computer environment. Techniques traditionally used to manage file transfers were not originally designed to support all of the increased functionality of today's complex network environments. Operating systems, system infrastructure and core file management functions with which many computers operate have thus been affected. As a consequence, current file systems have lingering inefficiencies and are not equipped to handle all different types of storage and data transfer operations with maximum efficiency.
One such inefficiency exists in connection with storing portion(s) of an object or file away from the root location of the object, for example, to remote storage. With the proliferation of various storage elements and techniques as described above, sometimes it becomes desirable to store portion(s) of a file in remote storage while retaining portion(s) in local storage. This may be desirable, for example, to free up more valuable local storage when portions of a file are known to be static, or to stow away certain data that is infrequently utilized. For another example, an append-only file has the characteristic that data writes occur only at the end of the file. Consequently, an efficient use of local storage may dictate that the immutable portions of the file, to which new writes are appended, be migrated to remote storage. For yet another example, migration of data to remote storage might be effected to preserve pre-set on-line disk/memory allocation limits. Thus, there are a variety of reasons why a file may have some data that should be migrated to remote storage. Current file serving techniques, however, do not adequately address either specifying when portion(s) of a file should be migrated or the subsequent migration of data to remote locations while maintaining the file's data relationships.
As a general rule, partial migration techniques have not thus far been used; nonetheless, it should be noted that there are presently some hierarchical storage management (HSM) systems that can perform limited partial file operations, such as a partial recall. In a traditional HSM system, for example, an entire on-disk volume may be updated without having to recall any data from remote storage using partial recall operations. Other conventional techniques have addressed the limited case wherein the first few kilobytes of a file, e.g., 4 KB, are left on-line or ‘unmigrated’, and also the case wherein the last few kilobytes of a file are left on-line or ‘unmigrated.’
However, the current state of the art in hierarchical storage management for files does not cover partial migration of files in most contexts, nor does it address the desire to migrate predetermined part(s) of files from one location to another while retaining other part(s) of files. Further unaddressed by the art is the desirability of a mechanism that specifies those regions of a data stream suited to writes and updates and those regions of a data stream suited to off-line or remote storage. In short, sometimes it is desirable to migrate predetermined part(s) of files to remote storage and to retain other part(s) in local storage, yet current file servers do not specify which data to keep and which data to export elsewhere.
Additionally, the current state of the art in file management does not address the specific case wherein it is desirable to apply a limit to on-line disk/memory allocations for certain data streams while allowing the entire stream to grow arbitrarily, e.g., as might be the case for an append-only data structure. To illustrate, it might be desirable to maintain up to one megabyte of a stream in an on-line volume while allowing the total stream to have a size that is greater than one megabyte. This case is not addressed by today's hierarchical storage management systems.
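The desired behavior can be made concrete with a minimal sketch, offered only as an illustration and not as a description of any existing system. It assumes that the oldest bytes of an append-only stream are the ones moved out once the on-line limit is exceeded, and it uses in-memory byte arrays as stand-ins for the on-line volume and remote storage; the class and method names are hypothetical.

```python
class AppendOnlyStream:
    """Append-only stream whose on-line portion is capped at online_limit bytes."""

    def __init__(self, online_limit):
        if online_limit < 0:
            raise ValueError("online_limit must be non-negative")
        self.online_limit = online_limit
        self.remote = bytearray()  # stand-in for remote storage (assumption)
        self.online = bytearray()  # stand-in for the on-line volume (assumption)

    def append(self, data):
        """Append data, then migrate the oldest on-line bytes past the limit."""
        self.online += data
        excess = len(self.online) - self.online_limit
        if excess > 0:
            self.remote += self.online[:excess]
            del self.online[:excess]

    def __len__(self):
        # Total stream size is the migrated prefix plus the on-line tail.
        return len(self.remote) + len(self.online)

    def read(self, offset, length):
        """Read across the remote/on-line boundary as one logical stream."""
        if offset < 0 or length < 0:
            raise ValueError("offset and length must be non-negative")
        end = min(offset + length, len(self))
        split = len(self.remote)  # logical offset where on-line data begins
        out = b""
        if offset < split:
            out += bytes(self.remote[offset:min(end, split)])
        if end > split:
            out += bytes(self.online[max(offset, split) - split:end - split])
        return out


# A one-megabyte on-line limit with a stream that grows past it.
stream = AppendOnlyStream(online_limit=1024 * 1024)
stream.append(b"x" * (1024 * 1024))
stream.append(b"y" * 512)
```

After the second append in this sketch, the total stream is one megabyte plus 512 bytes, the on-line portion holds exactly one megabyte, and the oldest 512 bytes reside in the remote stand-in, while `read` still presents the stream as a single contiguous range.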
In consideration of the above insufficiencies associated with current file server/HSM systems, it would be desirable to provide a flexible architecture in a computer system for partially migrating some portion(s) of a file or object to another memory location and retaining other portion(s) of the file. It would be advantageous to be able to specify according to pre-set criteria which portion(s) of an object are suited to migration and which are suited to their present storage location. It would be advantageous to allow for partial migration of files or objects from a first storage location to a second storage location, e.g., from on-line storage to remote storage. It would be still further advantageous to achieve efficient partial migration for files whose structure and properties are known or can be specified as with, for example, append-only files. At present, however, no common approach exists to move portion(s) of files or objects from a root volume to another or remote volume while maintaining the various data relationships of the file or object. The present invention has been developed in consideration of these needs in the art.