The present invention relates generally to performing storage operations on electronic data in a computer network, and more particularly, to data storage systems that employ primary and secondary storage devices wherein certain electronic data from the primary storage device is relocated to the secondary storage device pursuant to a storage policy and electronic data from the secondary storage device may be retrieved directly or through the primary storage device.
The storage of electronic data has evolved over time. During the early development of the computer, storage of electronic data was limited to individual computers. Electronic data was stored in the Random Access Memory (RAM) or some other storage medium such as a magnetic tape or hard drive that was a part of the computer itself.
Later, with the advent of network computing, the storage of electronic data gradually migrated from the individual computer to stand-alone storage devices accessible via a network. These individual network storage devices soon evolved into networked tape drives, optical libraries, Redundant Arrays of Inexpensive Disks (RAID), CD-ROM jukeboxes, and other devices. Common architectures also include network attached storage devices (NAS devices) that are coupled to a particular network (or networks) and used to provide dedicated storage for various storage operations that may be required by that network (e.g., backup operations, archiving, and other storage operations including the management and retrieval of such information).
A NAS device may include a specialized file server or network attached storage system that connects to the network. A NAS device often contains a reduced capacity or minimized operating and file management system (e.g., a microkernel) and normally processes only input/output (I/O) requests by supporting common file sharing protocols such as the Unix network file system (NFS), DOS/Windows, and server message block/common Internet file system (SMB/CIFS). Using traditional local area network protocols such as Ethernet and transmission control protocol/internet protocol (TCP/IP), a NAS device typically enables additional storage to be quickly added by connecting to a network hub or switch.
Hierarchical storage management (HSM) provides for the automatic movement of files from hard disk to slower, less-expensive storage media, or secondary storage. As shown in FIG. 1, the typical migration hierarchy is from magnetic disk 10 to optical disk 20 to tape 30. Conventional HSM software usually monitors hard disk capacity and moves data from one storage level to the next (e.g., from production level to primary storage and/or from primary storage to secondary storage, etc.) based on storage criteria associated with that data such as a storage policy, age, category or other criteria as specified by the network or system administrator. For example, an email system such as Microsoft Outlook™ may have attachments “aged off” (i.e., migrated once an age requirement is met) from production level storage to a network attached storage device by HSM systems. When data is moved off the hard disk, it is typically replaced with a smaller “placeholder” or “stub” file that indicates, among other things, where the original file is located on the secondary storage device.
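By way of illustration only, the age-based migration described above might be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of any particular HSM product: the function name `migrate_aged_files`, the `.stub` suffix, the JSON stub layout, and the use of modification time as the age criterion are all hypothetical choices made for the example.

```python
import json
import os
import shutil
import time

STUB_SUFFIX = ".stub"  # hypothetical naming convention for placeholder files


def migrate_aged_files(primary_dir, secondary_dir, max_age_seconds, now=None):
    """Move files older than max_age_seconds to secondary_dir, leaving a stub.

    Returns the names of the files that were migrated.
    """
    now = time.time() if now is None else now
    os.makedirs(secondary_dir, exist_ok=True)
    migrated = []
    for name in sorted(os.listdir(primary_dir)):
        path = os.path.join(primary_dir, name)
        # Skip directories and stubs left behind by earlier migrations.
        if not os.path.isfile(path) or name.endswith(STUB_SUFFIX):
            continue
        # Assumed storage criterion: age measured from last modification.
        if now - os.path.getmtime(path) < max_age_seconds:
            continue
        target = os.path.join(secondary_dir, name)
        size = os.path.getsize(path)
        shutil.move(path, target)
        # The stub records basic identity plus the secondary location.
        with open(path + STUB_SUFFIX, "w") as stub:
            json.dump({"name": name, "size": size, "location": target}, stub)
        migrated.append(name)
    return migrated
```

In this sketch the stub is a separate small file placed beside the original path; a real system could instead reuse the original file name or rely on file system metadata, which the passage above does not specify.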
A stub file may contain some basic information to identify the file itself and also include information indicating the location of the data on the secondary storage device. When the stub file is accessed with the intention of performing a certain storage operation, such as a read or write operation, the file system call (or a read/write request) is trapped by software and a data retrieval process (sometimes referred to as de-migration or restore) is completed prior to satisfying the request. De-migration is often accomplished by inserting specialized software into the I/O stack to intercept read/write requests. The data is usually copied back to the original primary storage location from secondary storage, and then the read/write request is processed as if the file had not been moved. The effect is that the user sees and manipulates the file as the user normally would, except for a small initial latency while the de-migration occurs.
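The de-migration sequence described above can be illustrated with a short sketch. It is illustrative only and rests on several assumptions: the stub is taken to be a JSON file with a hypothetical `.stub` suffix and a `location` field, and the "trap" is modeled as an ordinary wrapper function (`hsm_read`, a hypothetical name) rather than software inserted into an I/O stack.

```python
import json
import os
import shutil

STUB_SUFFIX = ".stub"  # hypothetical naming convention for placeholder files


def recall_if_stubbed(path):
    """Copy the data back from secondary storage if only a stub is present.

    Returns True when a de-migration was performed, False otherwise.
    """
    stub_path = path + STUB_SUFFIX
    if os.path.exists(path) or not os.path.exists(stub_path):
        return False
    with open(stub_path) as stub_file:
        stub = json.load(stub_file)
    # Restore the data to its original primary storage location.
    shutil.copy2(stub["location"], path)
    os.remove(stub_path)
    return True


def hsm_read(path):
    """Stand-in for a trapped read request: recall first, then read."""
    recall_if_stubbed(path)
    with open(path, "rb") as data_file:
        return data_file.read()
```

A caller would use `hsm_read` exactly as it would use a plain read; only the first access after migration pays the cost of the copy, which corresponds to the initial latency noted above.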
Currently, however, HSM is not commonly practiced in NAS devices. One reason is that it is very difficult, if not impossible, to intercept file system calls in NAS devices. Moreover, there are many different types of NAS file systems, such as WAFL by Network Appliance of Sunnyvale, Calif., the EMC Celerra file system by the EMC Corporation of Hopkinton, Mass., the NetWare file system by Novell of Provo, Utah, and those of other vendors. Most of these systems export their file systems to host computers through protocols such as the common Internet file system (CIFS) or the network file system (NFS), but provide no mechanism to run software on their operating systems or to reside on the file system stack to intercept read/write or other data requests. Further, many NAS devices are proprietary, which may require a significant reverse-engineering effort to determine how to insert software into the I/O stack to perform HSM operations, reducing portability of an HSM implementation.
Accordingly, what is needed are systems and methods that overcome these and other deficiencies.