The ever-increasing reliance on information and the computing systems that produce, process, distribute, and maintain such information in its myriad forms continues to put great demands on techniques for data protection. Simple systems providing periodic backups of a computer system's data have given way to more complex and sophisticated data protection schemes that take into consideration a variety of factors including: the wide variety of computing devices and platforms encountered, the numerous different types of data that must be protected, the speed with which data protection operations must be executed, and the flexibility demanded by today's users.
FIG. 1 illustrates an example of a data backup and recovery system for use in a variety of computing environments, e.g., small business, enterprise, educational, and government computing environments. Computing system 100 includes a number of computer systems (server 120, workstations 130 and 140, backup and restore master server 150, and media server 170) interconnected by network 110. Network 110 can implement any of a wide variety of well-known computer networking schemes but is typically a local area network (LAN), e.g., an enterprise-wide intranet, or a wide area network (WAN) such as the Internet. Each of server 120, workstation 130, and workstation 140 includes information such as system software, application software, application data, etc., that has some value to users of the computer systems and thus requires some level of data protection.
Information protection within computing system 100 is controlled and coordinated by software operating on backup and restore master server 150. The software operating on the backup and restore master server is the “brains” for all data protection activities and provides, for example, scheduling and tracking of client computer system backups and restorations, management of data storage media, and convenient centralized management of all backup and restoration activities. In the example illustrated in FIG. 1, backup and restore master server 150 can also have one or more storage devices, e.g., tape drive 160, attached directly to the server or through network 110 for backing up and restoring data from multiple clients. In support of such a data protection system, each of the clients of backup and restore master server 150, e.g., server 120, workstation 130, and workstation 140, typically includes backup and restore client software or agents. Such agents typically receive instructions from backup and restore master server 150 and handle the extraction and placement of data for the particular client computer system. Together, backup and restore master server 150 and the backup and restore agent operating on a client computer system can back up and restore files, directories, raw partitions, and databases on client systems. Such data protection software can also be used to archive and restore logical database data.
FIG. 1 also illustrates another possible component of computing system 100. Media server 170 can be used in conjunction with data intensive applications, such as data warehouses, to locally back up large applications while backing up other client systems over the network. Media server 170 can share a storage device such as storage array 180 (e.g., a tape library) with backup and restore master server 150 or another media server (not shown).
While the system described in FIG. 1 performs data protection duties well, protecting and archiving large amounts of data from specific applications, e.g., database management systems, can pose additional challenges. Typical database backup and database extraction operations include physical backups, logical exports, and ASCII file dumps. These operations are not particularly well suited for long-term archiving because they either use proprietary file formats or are not self-describing. Online and offline physical database backups are generated using proprietary tools associated with the database management system. For example, physical database backups of databases created by Oracle Corporation's database management systems can utilize Oracle's Recovery Manager (RMAN) tool to provide quick restore and point-in-time recovery to protect from media failures. However, such techniques use the proprietary vendor file format, and the database must be recovered using the database management system after the data has been restored. Logical exports of relational data using database vendor proprietary tools supplement physical database backups and provide protection from user failure by allowing for the restoration of individual tables. However, logical exports also produce proprietary formats that can only be read by vendor tools. ASCII dump files, e.g., comma-delimited tabular data, are in a non-proprietary format but are not self-describing using a standard method. Vendor tools are available to load data from an ASCII dump file by creating a control file that instructs the specific tool how to load the data, but the format of the control file is not standard across vendor tools.
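The limitation of an ASCII dump noted above can be illustrated with a minimal sketch. The table contents, the column names, and the `control` dictionary below are hypothetical and stand in for a vendor-specific control file; they are not taken from any particular tool. The sketch only shows that a comma-delimited dump, once written, carries no column names or types, so a separate description is needed to interpret it.

```python
import csv
import io

# Hypothetical table contents; the values are illustrative only.
rows = [
    (1, "Smith", "2001-03-15", None),
    (2, "Jones", "1999-11-02", 1250.50),
]

# Write a bare comma-delimited dump: no header and no type information.
buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerows(rows)
dump = buf.getvalue()

# Reading the dump back yields only strings. Column names, data types, and
# the difference between a NULL and an empty string are no longer present.
recovered = list(csv.reader(io.StringIO(dump)))

# A separate description is needed to interpret the dump. This dictionary is
# a hypothetical stand-in for a tool-specific control file.
control = {
    "columns": ["id", "name", "hired", "bonus"],
    "types": [int, str, str, float],
}


def load(record, control):
    """Apply the external description to one dumped record."""
    out = {}
    for name, typ, raw in zip(control["columns"], control["types"], record):
        # Assumption: an empty field is treated as NULL for every column.
        out[name] = None if raw == "" else typ(raw)
    return out


loaded = [load(r, control) for r in recovered]
```

Without `control`, a reader of the dump cannot tell that the first field is an integer key or that the last field is a nullable number, which is the sense in which such a dump is not self-describing.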
In addition, the time to produce each type of backup or logical extraction varies considerably with the method. Although physical database backup is one of the most efficient methods for backup and restore, followed by proprietary export and corresponding import, such techniques suffer from the above-mentioned deficiencies. The time to produce an ASCII dump file (and any associated control information) is generally longer than the time to produce a physical backup.
Finally, the amount of space required for each type of backup or logical extraction also varies with the method. The relative size of the backup or logical extraction depends on the percentage of unused blocks in the database and, in the case of a proprietary export, on how much metadata is included and whether or not indexes are included.
Accordingly, it is desirable to provide systems and methods for data protection that provide the added flexibility and archival advantages lacking in current systems.