Server-based data management systems have become standard office equipment and the need for data management is growing rapidly. Today, many employees in large corporations have a personal computer (PC) or a workstation that is connected to other computers via a Local Area Network (LAN).
A LAN generally includes a plurality of computer systems, such as computer workstations, that are connected together to share data and resources, such as a main memory and/or a printer. The LAN often includes file servers providing the network services. A file server is generally a node, e.g. a computer, on a computer network that provides service to the computer terminals on the network through managing a shared resource. For example, a file server can manage a set of storage disks and provide storage and archival services to computer terminals on the network that do not have their own disks, or that have data that needs to be stored externally.
Storage requirements of LANs are growing at a staggering rate. Many of today's servers handle gigabytes of data. In addition, the ability to store and protect data has become a critical issue for many network users. The most common way of protecting data is to keep it in more than one location. Server-based data management systems, such as the ARCserve® data management system, provide back-up and protection of data stored on a LAN file server and/or computer systems connected to the LAN.
Merely providing back-up and storage of data from a computer network, however, is not sufficient. In particular, the external storage of data needs to be automatic, optimal, and transparent to the network user. One technique for providing efficient external storage of data from a computer network is hierarchical storage management (HSM).
HSM includes storing computer network data external to the file server in a hierarchy of secondary, and possibly tertiary, storage devices. The external storage devices are generally high capacity storage devices such as Write Once Optical, Rewriteable Optical and Magnetic Tape. For instance, an optical storage device and a magnetic tape drive can be coupled to the file server as secondary and tertiary storage devices, respectively. Based on criteria established by the HSM application, data stored in the file server can be migrated to the optical storage device and, based on selectable criteria, further migrated to the tape drive.
For example, the frequency of use of the data can be used as a criterion for migrating the data from the file server to the secondary and tertiary storage devices. By migrating data which is infrequently used or accessed, space can be freed on the file server while users continue to scan files as if they still resided on the file server. Migration refers to the movement of data from a file server into a storage hierarchy (e.g. the external storage devices). Demigration refers to the retrieval of data from the storage hierarchy to the file server.
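As an illustration only, the migration and demigration described above can be sketched as a toy in-memory model. The class name ToyHsm, its methods, and the use of day numbers as timestamps are hypothetical and are not taken from any actual HSM product; the sketch assumes that the only migration criterion is the number of days since a file was last accessed.

```python
class ToyHsm:
    """Toy model: a file server plus one external storage hierarchy."""

    def __init__(self):
        self.server = {}       # file name -> contents resident on the file server
        self.hierarchy = {}    # file name -> contents held in external storage
        self.last_access = {}  # file name -> day number of the last access

    def write(self, name, data, today):
        """Store a file on the file server and record the access day."""
        self.server[name] = data
        self.last_access[name] = today

    def migrate_idle(self, today, max_idle_days):
        """Migrate every server-resident file idle for at least max_idle_days."""
        idle = sorted(
            name for name in self.server
            if today - self.last_access[name] >= max_idle_days
        )
        for name in idle:
            self.hierarchy[name] = self.server.pop(name)
        return idle

    def read(self, name, today):
        """Return file contents, demigrating the file first if necessary."""
        if name not in self.server and name in self.hierarchy:
            # Demigration: retrieve the data from the hierarchy to the server.
            self.server[name] = self.hierarchy.pop(name)
        self.last_access[name] = today
        return self.server[name]
```

In this sketch a read of a migrated file triggers demigration automatically, so the caller does not need to know where the data currently resides.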
To obtain the optimal benefit of an HSM application, the secondary and tertiary storage devices are arranged in a hierarchical arrangement for storing the data. Thus, a data file that has resided on the network file server for a predetermined period of time can be migrated initially to an optical storage device, which provides for a relatively fast response time when the file is requested by the network file server. If the data file remains on the optical storage device for a predetermined period of time without being requested by the file server, then the data file can be further migrated, in accordance with a storage hierarchy, to a magnetic tape storage device, which has a relatively slow response time compared to the optical storage device. Thus, a hierarchical storage management system provides for a more efficient method of storing the data files of a networked computer system based on the cost, speed and capacity of the hierarchy of storage devices.
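The two-stage policy described above can be sketched, under stated assumptions, as a single decision function. The threshold values, the FileRecord type, and the tier names are hypothetical; the text specifies only "a predetermined period of time" for each stage.

```python
from dataclasses import dataclass

# Hypothetical thresholds standing in for the "predetermined period of time".
SERVER_IDLE_LIMIT_DAYS = 30
OPTICAL_IDLE_LIMIT_DAYS = 180


@dataclass
class FileRecord:
    name: str
    tier: str       # "server", "optical", or "tape"
    idle_days: int  # days since the file was last requested on its current tier


def next_tier(record: FileRecord) -> str:
    """Return the tier the file should occupy after one policy pass."""
    if record.tier == "server" and record.idle_days >= SERVER_IDLE_LIMIT_DAYS:
        return "optical"  # secondary storage: relatively fast response
    if record.tier == "optical" and record.idle_days >= OPTICAL_IDLE_LIMIT_DAYS:
        return "tape"     # tertiary storage: relatively slow response
    return record.tier    # not idle long enough, or already on tape
```

A file moves at most one level per pass in this sketch, which mirrors the text: a file is first migrated to the optical device and only later, if it remains unrequested, to tape.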
When a file is migrated from a file server, the original file is represented on the file server as a stub file, also referred to as a phantom file or a tombstone. The stub file represents the original file while using a minimal physical space allocation, thereby freeing as much space as possible on the file server. The stub file should also represent, however, the properties of the original file as closely as possible, e.g., the file size, the date created, the date last accessed or certain attributes, such as a read only file. Depending on the particular HSM implementation which performs the migration, however, the file size is not accurately represented. Rather, the stub file remaining at the file server has a size of 0, 422 or 1000 bytes, regardless of the actual size of the original file. For example, a 100 megabyte file can be migrated from the network file server to an external storage device and the stub file left on the file server generally will appear with a size of 0, 422 or 1000 bytes.
Thus, known migration implementations may reduce the physical space allocation of the file server through the use of stub files to represent the migrated file, but the known migration methods do not accurately represent the actual properties of the original file. The accuracy of the representation, particularly the size of the original file, is important information for any software application where file size is utilized. For example, some LAN software applications attempt to provide statistical analysis of the amount of data owned by the file server, or perform some custom function based on particular file sizes reaching a predetermined value. If migrated files are not accurately represented, then the analysis or custom functions may not be properly performed. In addition, a DOS® operating system DIR command, for example, would provide the wrong file size to the user and lead to user confusion over the actual size of the file. Similarly, a DOS® operating system COPY command might show a 1000 byte size for a migrated file that is actually 2 megabytes, thus causing the user to attempt to copy the file onto a floppy disk that is too small.
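The effect of inaccurate stub sizes on a size-based analysis can be illustrated with a short sketch. The DirectoryEntry type and its original_size field are hypothetical and exist only to make the discrepancy visible; a real directory entry in the known implementations would expose only the reported size.

```python
from dataclasses import dataclass
from typing import List, Optional

STUB_SIZE_BYTES = 422  # one of the stub sizes mentioned in the text


@dataclass
class DirectoryEntry:
    name: str
    reported_size: int                   # size shown by a directory listing
    original_size: Optional[int] = None  # hypothetical: set only for migrated files


def reported_total(entries: List[DirectoryEntry]) -> int:
    """Total that a size-based analysis would compute from the listing."""
    return sum(e.reported_size for e in entries)


def actual_total(entries: List[DirectoryEntry]) -> int:
    """Total amount of data actually represented by the entries."""
    return sum(
        e.original_size if e.original_size is not None else e.reported_size
        for e in entries
    )


entries = [
    DirectoryEntry("report.doc", 50_000),                         # resident file
    DirectoryEntry("archive.dat", STUB_SIZE_BYTES, 100_000_000),  # migrated file
]
```

For these two entries, reported_total(entries) is 50,422 bytes while actual_total(entries) is 100,050,000 bytes, so an analysis relying on the directory listing would understate the data by nearly the full size of the migrated file.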
An HSM implementation is generally tailored for particular LAN operating systems. For example, the NOVELL® NetWare® operating system is used in many LAN systems. Several versions of the NetWare® operating system exist, including versions 3.x and 4.x.
For example, in the NetWare® operating system versions 4.x, a Real Time Data Migrator (RTDM) feature is included. Using this feature, the contents of a file in a NetWare® file server (e.g. a file server running the NetWare® operating system) can be migrated to a secondary storage device with a file directory entry representing the migrated file being left in the file server. The file directory entry is empty and thus will not occupy physical space in the NetWare® file server. In addition, the file directory entry will indicate the correct properties of the migrated file, including the actual size of the migrated file. When the migrated file is requested by the file server, the file will be automatically retrieved into the file server.
Thus, the NetWare® operating system version 4.x RTDM provides a tool for automatically and transparently migrating files from a NetWare® volume to secondary storage while keeping accurate directory entries in the original NetWare® volume for migrated files. On the other hand, the NetWare® operating system versions 3.x, for example, do not provide a migration functionality. Accordingly, software vendors must create a data migration function for NetWare® operating system version 3.x file servers. Known migration applications, however, do not provide a directory entry on the file server which is an accurate representation of the migrated file; depending on the application, the remaining directory entry will be a stub file having a size of 0, 422 or 1000 bytes rather than the actual size of the migrated file.
An object of the present invention is to provide for migration of data from, for example, a NetWare® version 3.x file server that eliminates the use of a stub file that does not accurately represent the size of the migrated file. Another object of the present invention is to provide file migration and demigration that is absolutely transparent to the user.