Field of the Invention
The invention relates generally to performing storage operations on electronic data in a computer network, and more particularly, to facilitating storage operations including data stored on a network attached storage device.
The storage of electronic data has evolved over time. During the early development of the computer, storage of electronic data was limited to individual computers. Electronic data was stored in Random Access Memory (RAM) or some other storage medium such as a magnetic tape or a hard drive that was a part of the computer itself.
With the advent of network computing, the storage of electronic data gradually moved from the individual computer to dedicated storage devices accessible via a network. Some of these network storage devices evolved over time into networked tape drives, optical libraries, Redundant Arrays of Inexpensive Disks (RAID), CD-ROM jukeboxes, and other devices. Common architectures also include network attached storage devices (NAS devices) that are coupled to a particular network (or networks) and are used to provide storage capability for various storage operations that may be required by a particular network (e.g., backup operations, archiving, and other storage operations including the management and retrieval of such information).
A NAS device typically utilizes a specialized file server or network attached storage system that connects to the network. A NAS device often contains a reduced capacity or minimized operating and file management system (e.g., a microkernel) and normally processes input/output (I/O) requests by supporting common file sharing protocols such as the Unix network file system (NFS), DOS/Windows, and server message block/common Internet file system (SMB/CIFS). Using traditional local area network protocols such as Ethernet and transmission control protocol/internet protocol (TCP/IP), a NAS device typically enables additional storage to be quickly added by connecting to a network hub or switch.
Certain storage management procedures, such as hierarchical storage management (HSM) procedures, provide for movement of files over time from hard disk to slower, less-expensive storage media, or secondary storage. As shown in FIG. 1, one migration scheme may include data transfer from a magnetic disk 10 on a computing device to an optical disk 20 and later to a tape 30. Conventional data management software usually monitors hard disk capacity and moves data from one storage level to the next (e.g., from production level to primary storage and/or from primary storage to secondary storage, etc.) based on storage criteria associated with that data such as a storage policy, age, category or other criteria as specified by the network or system administrator. For example, an email system such as MICROSOFT OUTLOOK™ may have attachments “aged off” (i.e., migrated when an age requirement is met) from production level storage to a network attached storage device.
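By way of illustration only, the age-based migration described above may be sketched as follows. The tier names, the StoredFile record, and the per-tier age thresholds are hypothetical and are assumed solely for this example; conventional data management software may apply other criteria, such as category or a storage policy.

```python
from dataclasses import dataclass

# Hypothetical tier names, ordered from fastest to slowest (cf. FIG. 1).
TIERS = ["magnetic_disk", "optical_disk", "tape"]


@dataclass
class StoredFile:
    name: str
    age_days: int
    tier: str = "magnetic_disk"


def next_tier(tier):
    """Return the next slower tier, or None if the file is already on the last one."""
    i = TIERS.index(tier)
    return TIERS[i + 1] if i + 1 < len(TIERS) else None


def migrate(files, max_age_days):
    """Move each file that is older than its current tier's threshold one level down.

    max_age_days maps a tier name to an assumed age threshold in days.
    Returns the names of the files that were moved.
    """
    moved = []
    for f in files:
        limit = max_age_days.get(f.tier)
        target = next_tier(f.tier)
        if limit is not None and target is not None and f.age_days > limit:
            f.tier = target
            moved.append(f.name)
    return moved
```

In this sketch a file moves at most one level per pass, so a file on magnetic disk reaches tape only after a later pass finds it on optical disk and past that tier's threshold.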
Referring to FIG. 2, there is shown a network architecture of a system 200 for performing storage operations on electronic data in a computer network in accordance with the prior art. As shown, system 200 includes a storage manager 201 and one or more of the following: a data store computer 285, a data store 290, a data agent 295, a jobs agent 240, a plurality of media management components 205, which may be referred to as media agents, a plurality of storage devices 215, a plurality of media management component index caches 210 and a storage manager index cache 230.
Data agent 295 is generally a software module that may be responsible for archiving, migrating, and recovering data of data store computer 285 stored in a data store 290 or other memory location. Each data store computer 285 may have a data agent 295 and system 200 can support many data store computers 285.
Each media management component 205 may maintain an index cache 210 which stores index data that system 200 generates during storage operations. The system may maintain two copies of the index data regarding particular stored data. A first copy may be stored with the data copied to a storage device 215. Thus, a tape may contain the stored data as well as index information related to the stored data. In the event of a system restore, the index data stored with the stored data can be used to rebuild a media management component index cache 210 or other index useful in performing storage operations.
In addition, the media management component 205 that controls the storage operation also may write an additional copy of the index data to its index cache 210. The data in the media management component index cache 210 may be stored on faster media, such as magnetic media, and is thus readily available to the system for use in connection with storage operations and other activities without having to be first retrieved from a slower storage device 215.
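The two-copy arrangement described above may be sketched, by way of example only, as follows. The class names, the record layout, and the rebuild_index_cache method are hypothetical and are assumed for this illustration; they do not represent the actual implementation of system 200.

```python
class StorageDevice:
    """Stands in for slower media such as a tape (cf. storage device 215)."""

    def __init__(self):
        self.records = []  # each record is (index_entry, data)

    def write(self, index_entry, data):
        self.records.append((index_entry, data))


class MediaManagementComponent:
    """Keeps a fast local index copy in addition to the copy stored with the data."""

    def __init__(self, device):
        self.device = device
        self.index_cache = {}  # cf. index cache 210, assumed to be on faster media

    def store(self, name, data):
        entry = {"name": name, "offset": len(self.device.records), "size": len(data)}
        self.device.write(entry, data)   # first copy: kept with the stored data
        self.index_cache[name] = entry   # second copy: kept in the index cache
        return entry

    def rebuild_index_cache(self):
        """Recreate the cache from the index entries stored alongside the data."""
        self.index_cache = {e["name"]: e for e, _ in self.device.records}
```

In this sketch, lookups are served from the in-memory cache without reading the slower device, while the copy written with the data allows the cache to be rebuilt if it is lost.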
Storage manager 201 may also maintain an index cache 230. Storage manager index cache 230 may be used to indicate, track, and associate logical relationships and associations between components of system 200, user preferences, management tasks, and other useful data. For example, storage manager 201 may use its index cache 230 to track logical associations between media management components 205 and storage devices 215. Index caches 230 and 210 may reside on their corresponding storage component's hard disk or other fixed storage device. For example, the media management component 205 may retrieve data from storage manager index cache 230 regarding a storage policy and storage operation to be performed or scheduled for a particular client 285. The media management component 205, either directly or via an interface module, may communicate with the data agent 295 at data store computer 285 regarding the details of an upcoming storage operation.
Jobs agent 240 may also retrieve from index cache 230 information relating to a storage policy 260 associated with data store computer 285. This information may be used in coordinating or establishing actions performed by one or more data agents 295 and one or more media management components 205 associated with performing storage operations for that particular data store computer 285. Such information may also include other information regarding the storage operation to be performed such as retention criteria, encryption criteria, streaming criteria, path information, etc.
Data agent 295 may package or otherwise manipulate client data stored in client data store 290 in accordance with storage policy 260 and/or according to a user preference, and communicate client data to the appropriate media management component(s) 205 for processing. The media management component(s) 205 may store the data according to storage preferences associated with storage policy 260 including storing the generated index data with the stored data, as well as storing a copy of the generated index data in the media management component index cache 210.
As shown in FIG. 2, a network attached storage device 250 and corresponding file server 254 are also connected to storage manager 201. NAS 250 and file server 254 are dedicated applications without a general purpose operating system and generally do not by themselves support software applications, such as a backup application.
NAS devices typically interface with other components, such as those of storage management system 200, on a relatively limited basis. One reason for this is that NAS devices tend to be proprietary. Accordingly, other storage system designers have limited knowledge of the implementation particulars needed to design fully compatible and integrated interfaces for their products.
Moreover, there are many different types of NAS devices and associated file systems, such as WAFL by NETWORK APPLIANCE of Sunnyvale, Calif., the EMC CELERA file system by the EMC Corporation of Hopkinton, Mass., the NETWARE file system by NOVELL of Provo, Utah, and those of other vendors. Most of these systems export their file systems to host computers using protocols such as the common Internet file system (CIFS) or the network file system (NFS), but provide no mechanism to run software on their operating systems or to reside on the file system stack to intercept read/write or other data requests.
One solution to this problem is the use of a proxy media management component 252 connected to file server 254. Proxy media management component 252 runs the applicable software used to move data to NAS 250. Proxy media management component 252 may, for example, issue commands using the Network Data Management Protocol (“NDMP”).
Referring now to FIG. 3, a representation of a data structure 310 is shown that may be used by system 200 in moving data to NAS 250. As shown, data structure 310 includes the actual data being moved in a payload 314 as well as an NDMP header 312 preceding payload 314 and an NDMP trailer 316 following the payload.
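The general header, payload, and trailer arrangement of data structure 310 may be sketched as follows. This example is illustrative only and does not reproduce the NDMP wire format: the 4-byte tag, the length field, and the CRC-32 trailer are hypothetical fields assumed for the purpose of showing how a payload may be framed and later recovered.

```python
import struct
import zlib

# Hypothetical layouts (NOT the actual NDMP header or trailer):
HEADER = struct.Struct(">4sI")  # 4-byte tag followed by the payload length
TRAILER = struct.Struct(">I")   # CRC-32 of the payload
TAG = b"HDR0"


def pack(payload: bytes) -> bytes:
    """Frame a payload as header + payload + trailer."""
    return HEADER.pack(TAG, len(payload)) + payload + TRAILER.pack(zlib.crc32(payload))


def unpack(frame: bytes) -> bytes:
    """Validate the header and trailer and return the payload."""
    tag, length = HEADER.unpack_from(frame, 0)
    if tag != TAG:
        raise ValueError("unexpected header tag")
    start = HEADER.size
    end = start + length
    if len(frame) != end + TRAILER.size:
        raise ValueError("frame length mismatch")
    payload = frame[start:end]
    (crc,) = TRAILER.unpack_from(frame, end)
    if crc != zlib.crc32(payload):
        raise ValueError("payload checksum mismatch")
    return payload
```

Placing the length in the header lets a reader locate the trailer without scanning the payload, which is one common reason for framing data in this manner.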
As discussed above, index cache 230 in storage manager 201 may keep track of certain information including the status of storage operations. If a storage operation copying data from data store 290 to NAS 250 is interrupted, for example, index cache 230 may be used to restart the operation and may keep track of the data path, data transferred, data remaining, etc. If data from NAS 250 needs to be restored, data in index cache 230 may also be used to facilitate such a restore operation.
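The restart behavior described above may be sketched as follows, under the assumption that the tracked information is held in a simple job record. The record fields ("path", "transferred", "status"), the chunk size, and the fail_after parameter used to simulate an interruption are all hypothetical.

```python
def copy_with_restart(source: bytes, dest: bytearray, job: dict,
                      chunk: int = 4, fail_after=None) -> dict:
    """Copy source into dest chunk by chunk, recording progress in job.

    job stands in for the tracked state (data path, data transferred, status).
    fail_after, if given, simulates an interruption after that many chunks
    have been sent during this call.
    """
    sent = 0
    while job["transferred"] < len(source):
        if fail_after is not None and sent == fail_after:
            job["status"] = "interrupted"
            return job
        start = job["transferred"]          # resume from the recorded offset
        piece = source[start:start + chunk]
        dest.extend(piece)
        job["transferred"] += len(piece)
        sent += 1
    job["status"] = "complete"
    return job
```

Because the offset is recorded after each chunk, a second call with the same job record continues from where the first call stopped rather than resending data already transferred; the data remaining is the source length less the recorded offset.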
One shortcoming of the NAS architecture described above is the vulnerability associated with the dedicated data transfer path, which includes proxy media management component 252. For example, if proxy media management component 252 becomes inoperative or otherwise unavailable, there is generally no way to send data to NAS 250. Similarly, if other media management components in the system are handling less of a load than proxy media management component 252, they are unable to assist proxy media management component 252, as it is the sole media management component designated for NAS 250.
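The shortcoming described above may be illustrated with the following sketch. The association table and component names are hypothetical; the table merely stands in for the kind of logical associations between media management components and storage devices that a storage manager index cache may track.

```python
# Hypothetical associations between storage targets and the media management
# components designated to serve them. NAS 250 has a single designated
# component, the proxy.
ASSOCIATIONS = {
    "storage_device_215": ["media_agent_a", "media_agent_b"],
    "nas_250": ["proxy_252"],
}


def pick_component(target, available):
    """Return the first designated component that is available, or None."""
    for name in ASSOCIATIONS.get(target, []):
        if name in available:
            return name
    return None
```

In this sketch, a request for "nas_250" returns None whenever "proxy_252" is absent from the available set, even if the other media agents are idle, whereas a request for "storage_device_215" can fall back to a second designated component.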
Moreover, should storage manager 201 become inoperative or otherwise unavailable, or its data or associated indexes be corrupted, incomplete, or otherwise unavailable, there is generally no way to rebuild index cache 230 with data from NAS 250.
Furthermore, with conventional systems, it is difficult to verify the contents of NAS 250 after data is stored thereon. As discussed above, in general, NAS systems are proprietary and a simple request to verify the data stored on a NAS cannot be performed nor can information regarding the data, such as helpful metadata, be made available.
Therefore, it would be desirable to provide a more robust storage operation system that can more effectively interoperate with NAS devices.