Enterprises face major storage challenges due to the fast growth of their storage needs, the increased complexity of managing the storage, and the requirement for high availability of storage. Another issue is the amount of physical space, i.e. real estate, required to house the storage and associated processing capability of the data center. This may not be such a problem in locations where real estate is not at a premium. However, where enterprises are located in city centers, the cost of the real estate is a real issue. In a city center hospital, for example, the data center is responsible for storing many different types of data, including patient data. As diagnostic technology advances, there is more and more diagnostic data in digital form that needs to be stored and managed. The hospital thus has to manage a conflict between the ever-increasing real estate requirements of the data center and those of the patient treatment facilities.
Data currency is another storage management issue. It is generally the case that new data will be accessed regularly within the first few days or weeks of its creation, and that over time the data will be accessed less and less. In a hospital, for example, patient data, such as x-ray data, typically needs to be readily accessible in the short term while the patient is undergoing treatment. Once the treatment is complete, this data may not be required for many years, but it is generally necessary to keep such data for legal reasons, for research purposes, or in case the patient has further medical problems. As another example, consider a bank that stores data on share transactions. It is likely that analysis will be run on the share transactions that have happened within the last few days to spot trends. After a week this data is less important, as the market will have moved on. After several weeks this data will be irrelevant. The data is stored so that it can be accessed by the servers performing the analysis, which generally means high-powered servers and fast, reliable storage, and it may be stored as records in a database. Once the data has become less useful, there is no need to store it on the fast (expensive) storage, but it may still need to be accessed occasionally.
The need to provide ready access to much-used data while providing archive storage for little-used data is a problem which has been addressed in a number of ways.
In one scheme, little-used data is moved onto tape and stored at a remote site. This has the advantage that it reduces the amount of physical storage required at the local site but the disadvantage that the access time for the archived data is unacceptably long. An alternative scheme involves the migration of old data to remote disk storage and the use of data management tools at the local and remote sites to handle the migration of the data and its retrieval when required. In this scheme, the amount of physical storage required at the local site is reduced and the remote data is more accessible than with the tape storage scheme. However, the management of the remote data is not a trivial task and is complicated by the use of different data management tools at the local and remote sites.
These problems are exacerbated when the local and remote sites use different types of servers, operating systems, etc. A wide variety of techniques based on these schemes have been proposed in the prior art. U.S. Pat. No. 6,032,224, for example, describes a method of hierarchical storage of data in a computer. The computer includes an interpreter and a hierarchical performance driver, which monitors the rates of access of blocks of data stored on the computer's storage devices and transfers infrequently accessed blocks of data from a faster data storage device to a slower data storage device.
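By way of illustration only, the general idea of demoting infrequently accessed blocks from a faster device to a slower one might be sketched as follows. This is a minimal sketch under stated assumptions, not the method of the cited patent: the class `TieredStore`, its methods, the fixed monitoring interval, and the `threshold` parameter are all hypothetical names introduced here, and the two tiers are modeled as in-memory dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class TieredStore:
    """Toy two-tier block store; every name here is hypothetical."""
    threshold: int = 2  # assumed minimum accesses per interval to stay on the fast tier
    fast: dict = field(default_factory=dict)           # block id -> data on the faster device
    slow: dict = field(default_factory=dict)           # block id -> data on the slower device
    access_counts: dict = field(default_factory=dict)  # accesses seen in the current interval

    def write(self, block_id, data):
        # Assumption: newly written data always lands on the fast tier.
        self.slow.pop(block_id, None)
        self.fast[block_id] = data

    def read(self, block_id):
        # Record the access so the access rate can be estimated per interval.
        self.access_counts[block_id] = self.access_counts.get(block_id, 0) + 1
        if block_id in self.fast:
            return self.fast[block_id]
        return self.slow[block_id]

    def end_interval(self):
        """Demote fast-tier blocks accessed fewer than `threshold` times, then reset counts."""
        demoted = [b for b in self.fast if self.access_counts.get(b, 0) < self.threshold]
        for block_id in demoted:
            self.slow[block_id] = self.fast.pop(block_id)
        self.access_counts.clear()
        return demoted


# Usage: two blocks are written, but only one is read during the interval.
store = TieredStore(threshold=2)
store.write("xray-001", b"image bytes")
store.write("xray-002", b"image bytes")
store.read("xray-001")
store.read("xray-001")
demoted = store.end_interval()  # "xray-002" moves to the slow tier
```

In this sketch a demoted block remains readable through the same `read` call, so the caller does not need to know which tier holds the data; promotion back to the fast tier is deliberately left out.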
Other possibilities include the storage of all the data at either the local site or the remote site. The former may not be practicable in areas where real estate costs are high and, furthermore, it may be difficult to increase or decrease the amount of available storage rapidly as requirements change. In the latter case, the overall costs of storage may be reduced, but the host applications at the local site would require some adaptation to cope with extended I/O access times to the remote storage, which could potentially be located on a different continent. This would affect operation at the local site and could lead to application errors.
The present invention seeks to address one or more of these problems.