The ever-increasing reliance on information and the computing systems that produce, process, distribute, and maintain such information in its myriad forms continues to put great demands on techniques for data protection. Simple systems providing periodic backups of a computer system's data have given way to more complex and sophisticated data protection schemes that take into consideration a variety of factors including: the wide variety of computing devices and platforms encountered, the numerous types of data that must be protected, the speed with which data protection operations must be executed, and the flexibility demanded by today's users.
FIG. 1 illustrates an example of a data protection system for use in a variety of computing environments, e.g., small business, enterprise, educational, and government computing environments. Such data protection systems typically provide functionality for one or more of: data backup, data recovery, data duplication, and data archiving. Moreover, the data manipulated by such systems can include all manner of computer-readable information including computer software, image files, text files, database data, and the like. Computing system 100 includes a number of computer systems such as servers 105, 110, 115, and 120, and workstations 125 and 130 interconnected by network 135 and/or directly connected with each other. Network 135 can implement any of a wide variety of well-known computer networking schemes but is typically a local area network (LAN), an enterprise-wide intranet, or a wide area network (WAN) such as the Internet. Each of the computer systems typically includes information such as system software, application software, application data, etc., that has some value to users of the computer systems and thus requires some level of data protection.
Information protection within computing system 100 is controlled and coordinated by software operating on master server 105. The software operating on master server 105 is the “brains” for all data protection activities and provides, for example, scheduling and tracking of client computer system backups and restorations, management of data storage media, and convenient centralized management of all backup and restoration activities. Master server 105 can also have one or more storage devices, e.g., tape drives and optical storage devices, attached directly to the server or through network 135 for backing up and restoring data from multiple clients. In support of such a data protection system, each of the data protection system clients (e.g., servers 120 and 115 and workstations 125 and 130) of master server 105 typically includes backup and restore client software or agents. Such agents typically receive instructions from master server 105 and handle the extraction and placement of data for the particular client computer system. Together, master server 105 and the data protection agent operating on a client computer system can back up and restore files, directories, raw partitions, and databases on client systems. Such data protection software can also be used to archive and restore logical database data.
FIG. 1 also illustrates other possible components of computing system 100. In general, media server 110 operates under control of master server 105. Data protection administrative functions are performed centrally from master server 105, and master server 105 also controls backup scheduling for media server 110. Media server 110 performs actual data movement operations (e.g., backup operations and restore operations) under direction from the master server, and the data remains local to the media server and its respective storage devices. Variations on this basic scheme are well known in the art. For example, a master server and its associated media servers can be referred to collectively as a storage domain, and large networks may have more than one storage domain. A media server can share a storage device such as a robotic tape library with other devices such as its master server or another media server. Master servers, media servers, and storage devices can be directly connected to one another, connected to each other using a conventional network, or connected to each other using a specialized network such as a storage area network (SAN). Still other devices, such as SAN switches, SAN routers, storage appliances, and/or data movers (e.g., third-party copying devices) can be used in computing system 100. Finally, it should be noted that a master server can also operate as a media server and that master and media servers can treat themselves as data protection clients, e.g., they can back up and restore data from or to themselves.
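The division of labor between a master server and a media server can be illustrated with a minimal sketch. It is not drawn from any particular product: the class names, method names, and the in-memory catalog are hypothetical, and the sketch assumes only what is stated above, namely that the master server directs operations while the backed-up data remains local to the media server.

```python
from dataclasses import dataclass, field


@dataclass
class MediaServer:
    """Hypothetical media server: performs the actual data movement."""
    name: str
    # Backup images stay local to the media server and its storage devices.
    images: dict = field(default_factory=dict)

    def backup(self, client: str, data: bytes) -> None:
        self.images[client] = data

    def restore(self, client: str) -> bytes:
        return self.images[client]


@dataclass
class MasterServer:
    """Hypothetical master server: directs operations, holds no backup data."""
    name: str
    media_servers: list = field(default_factory=list)
    # Catalog (an assumption of this sketch): client name -> media server name.
    catalog: dict = field(default_factory=dict)

    def run_backup(self, client: str, data: bytes, media: MediaServer) -> None:
        media.backup(client, data)          # data movement is delegated
        self.catalog[client] = media.name   # master only tracks the result

    def run_restore(self, client: str) -> bytes:
        holder = next(m for m in self.media_servers
                      if m.name == self.catalog[client])
        return holder.restore(client)


media = MediaServer("media-server-110")
master = MasterServer("master-server-105", [media])
master.run_backup("workstation-125", b"application data", media)
restored = master.run_restore("workstation-125")
```

In this sketch the master server and its media servers together play the role of a storage domain; a master server acting as its own media server would simply appear in its own `media_servers` list.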
In the example of FIG. 1, computing system 100 includes SAN 140. SAN 140 can be implemented using a variety of different technologies including SCSI, fibre channel arbitrated loop (FCAL), fibre channel switched fabric, IP networks (e.g., iSCSI), Infiniband, etc. SAN 140 can also include one or more SAN specific devices such as SAN switches, SAN routers, SAN hubs, or some type of storage appliance. Devices such as tape library 150, data mover 155, a group of disk drives 170 (i.e., “just a bunch of disks” or “JBOD”), and intelligent storage array 175 are attached to the SAN. Moreover, SAN specific devices such as SAN bridge 145 can also be part of the storage network. In this example, SAN bridge 145 might be a SCSI to fibre channel bridge providing access to additional devices such as tape drive 160 and optical storage 165.
Protecting the integrity of data as it is moved from one part of a computing system to another is an important aspect of any computer system. Data movement can result from a variety of operations including normal application software operation, data backup operations, data restore operations, and data relocation resulting from system design changes or hardware failures. In many computing systems, data movement is handled by programs executing on servers such as servers 115 and 120. In the case of data movement operations such as data backup and data restore, the use of server resources to handle the data movement means that fewer server resources are available for more typical operations such as application software and operating system overhead. Accordingly, efforts have been made to move some I/O processing off system servers to an off-host agent. Such agents are often referred to as third-party copy (3PC or TPC) devices or data movers.
Third-party copy operations transfer data directly between storage devices in a SAN or other environment using a third-party copy device, copy manager, or data mover. Computing system 100 includes at least one such device (data mover 155), and can include third-party copy functionality in other forms such as SAN bridge 145 or media server 110. Thus, a third-party copy device can be a separate device as shown; part of a SAN switch, router, bridge, or other SAN network component; part of a server attached to the SAN; or within a storage element such as storage array 175. Third-party copy devices operate on behalf of some other piece of software, e.g., a backup or restore application, to accomplish the third-party copy operation.
As sophisticated distributed data protection software is used more frequently in SAN environments, new challenges are faced. In many cases, SAN complexity is compounded by the existence of various zoning and device access restrictions. Moreover, the more storage devices and data movement devices present in the SAN, the more difficult it is to select the most efficient set of devices for a particular operation. Accordingly, it is desirable to have efficient and convenient mechanisms whereby third-party copy devices and their features can be discovered in an automated fashion. Moreover, it is desirable that these efficient and convenient mechanisms provide users with useful information about which third-party copy device among several should be used for a given data movement task.
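The selection problem described above can be made concrete with a short sketch. The text does not specify a selection mechanism, so everything here is an assumption made for illustration: the `DataMover` record, the idea that zoning is represented as a set of devices each mover can see, and the `hops` value used as a stand-in cost metric are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataMover:
    """Hypothetical record describing one third-party copy device."""
    name: str
    # Devices this mover can reach after zoning and access restrictions.
    visible_devices: frozenset
    # Stand-in cost metric (assumed); lower is treated as more efficient.
    hops: int


def candidate_movers(movers, source, destination):
    """Return movers that can see both endpoints, cheapest first."""
    usable = [m for m in movers
              if source in m.visible_devices
              and destination in m.visible_devices]
    return sorted(usable, key=lambda m: (m.hops, m.name))


movers = [
    DataMover("data-mover-155",
              frozenset({"array-175", "tape-library-150", "jbod-170"}), 1),
    DataMover("san-bridge-145",
              frozenset({"array-175", "tape-drive-160", "optical-165"}), 2),
    DataMover("media-server-110",
              frozenset({"array-175", "tape-library-150"}), 3),
]

ranked = candidate_movers(movers, "array-175", "tape-library-150")
unreachable = candidate_movers(movers, "jbod-170", "tape-drive-160")
```

Here `ranked` lists only the movers that can see both the source and the destination, ordered by the assumed cost, while `unreachable` is empty because no single mover in this made-up zoning layout can see both endpoints. A real discovery mechanism would have to populate such records automatically rather than by hand.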