As businesses or organizations upgrade their computer systems, they are often left with critical data on legacy systems. The business or organization may be upgrading only its data storage system, yet the process of transferring or migrating the data from the old system to the new system may require many hours of downtime for the computer system. In many applications, this downtime is unacceptable because the data and the applications using the data are critical to the business or organization. In other applications, the downtime is costly, representing, for example, lost revenue to the company.
Deploying a new storage technology entails many challenges; the new storage system should be easy to set up and should seamlessly integrate with the existing file system data. The new storage technology should allow data to migrate into the new system in an incremental fashion or leave the data on-line in the old system. Incremental migration should not disrupt applications that are operating off the local file system data. Migration of data from one system to another or one storage device to another should be an automated task with minimum system downtime.
Present global-scale storage systems provide data sharing, replication, and migration between sites, but none of these present systems is focused on integrating heterogeneous systems. Exemplary present systems comprise AFS, Echo, DFS, Coda, and JetFile. Peer-to-peer systems have also been developed to migrate data from one system to another; these present systems comprise, for example, Oceanstore, CFS, Past, and FarSite.
Previous work that integrates heterogeneous storage sources focuses on a single client accessing multiple storage servers through a uniform protocol, for example the network file system (NFS). A current version of the network file system provides data replication and migration support. Another attribute of the network file system allows the user to access a file resource in a different location. However, the data must be migrated or replicated by the storage servers themselves. In addition, the virtual file system interface supports the merging of heterogeneous sources at a single client. These access-oriented or client-based examples provide access to heterogeneous sources, but do nothing to integrate data.
Another example of a system that is currently used to integrate heterogeneous storage sources is the SDSC Storage Resource Broker. The Storage Resource Broker is middleware that provides applications a uniform API to access heterogeneous distributed storage systems. However, the Storage Resource Broker does not provide system-to-system interaction and consequently may not ensure consistency of data in case applications access data directly by bypassing the Storage Resource Broker servers.
Another difficulty arises in introducing a new file system, for example a distributed file system, to a system with an existing file system. A distributed file system is one that runs on more than one computer. For example, company XYZ has computers for each of its employees, and all the file systems of the computers are interconnected in such a way that all the employees can access everyone else's files in the same place, under the same name, and with the same content at any given moment. The computer network of company XYZ is referenced as computer XYZ. All the employees see the same file system, but the file system itself runs on each of the employees' computers. In this example, the data is stored in such a fashion that all the employees can access it. Company XYZ acquires a new company, company ABC. Company ABC has its own computers and data pertaining to the new company, such as payroll, etc. In addition, company ABC is web-based, selling products over the Internet. The computer system of company ABC is referenced as computer ABC.
Company XYZ wishes to move the data from the file system of computer ABC, or at the least make the file system of computer ABC accessible to computer XYZ. The typical method for transferring the data from the computer of company ABC to the computer of company XYZ would be to back up the data on computer ABC, shut down computer ABC, and copy all the data to computer XYZ. This may take many hours, causing the web-based business of company ABC to be off-line for those hours. This is a very costly and time-consuming procedure.
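By way of illustration only, the typical back-up, shut-down, and copy procedure described above may be sketched as follows. The sketch is a simplified in-memory model, not an actual migration tool: the Host class, the stop_and_copy function, and the file paths are hypothetical names introduced for this example, and files are modeled as dictionary entries. It shows that every copy step takes place while the source system is off-line, so the downtime grows with the amount of data to be moved.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical stand-in for a computer system and its file system."""
    name: str
    files: dict = field(default_factory=dict)  # path -> file contents
    online: bool = True


def stop_and_copy(source: Host, target: Host) -> int:
    """Back up the source, shut it down, then copy everything to the target.

    Returns the number of files copied while the source was off-line.
    Path collisions on the target are not handled in this sketch.
    """
    backup = dict(source.files)   # step 1: back up the data
    source.online = False         # step 2: shut down the source
    copied = 0
    for path, data in backup.items():
        # step 3: every copy happens while the source is off-line
        assert not source.online
        target.files[path] = data
        copied += 1
    return copied


# Illustrative data only; the paths and contents are assumptions.
abc = Host("ABC", {"/payroll/records.csv": b"payroll data",
                   "/web/catalog.html": b"<html></html>"})
xyz = Host("XYZ", {"/home/alice/notes.txt": b"notes"})

files_copied_offline = stop_and_copy(abc, xyz)
```

In this model, computer ABC remains off-line for the whole of the copy loop, which corresponds to the hours during which the web-based business of company ABC would be unavailable.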
What is therefore needed is a system, a service, a computer program product, and an associated method for federating an old system into a new system, and optionally migrating data from an old system to a new system. This method should operate seamlessly and efficiently with minimum disruption to existing applications running on the system. Further, this method should ensure data consistency for existing applications while making the data available for migration in a federated system. The need for such a solution has heretofore remained unsatisfied.