1. Technical Field
The present invention relates to data storage and retrieval generally and more particularly to a method and system of generating a proxy for a database.
2. Description of the Related Art
Information drives business. Companies today rely to an unprecedented extent on online, frequently accessed, constantly changing data to run their businesses. Unplanned events that inhibit the availability of this data can seriously damage business operations. Additionally, any permanent data loss, from natural disaster or any other source, will likely have serious negative consequences for the continued viability of a business. Therefore, companies must be prepared to eliminate or minimize data loss, and recover quickly with useable data. One technique used to prevent data loss and enterprise downtime is to maintain a number of additional copies of enterprise data (e.g., via data replication, mirroring, or the like). Such additional data copies may be used in a number of ways.
For example, an additional copy of data may be used to restore data lost when a storage device is corrupted or fails, to verify the consistency or accuracy of data, and/or in a clustering environment for parallel access or failover. Similarly, additional data copies may be used to test or evaluate the impact, if any, of data processing system changes (e.g., software or hardware), and/or operations on the data itself (e.g., changes to data format, updates to the data, or the like). A copy of data may be logical or physical and may reflect the state of data at a particular point-in-time (PIT) or may be updated, synchronously or asynchronously, to correspond with the original data over a period of time. A physical copy of data is an exact duplicate, having identical data stored or arranged in an identical physical storage structure on a storage medium. A logical copy, by contrast, may be accessed in the same way as the original data or a physical copy, but need not contain identical data or have the same physical storage structure as the original data. Consequently, a logical copy may subsume a physical copy in some instances.
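By way of illustration only, the distinction between a physical copy and a logical copy may be sketched as follows. The sketch assumes a toy "volume" represented as an ordered list of records; the names lookup, original, physical_copy, and logical_copy are hypothetical and do not correspond to any element of the figures.

```python
def lookup(volume, key):
    """Access path shared by the original and its copies: scan for a key."""
    for stored_key, value in volume:
        if stored_key == key:
            return value
    raise KeyError(key)


# Hypothetical storage layout: the list order stands in for physical arrangement.
original = [("b", 2), ("a", 1), ("c", 3)]

# Physical copy: identical data in an identical arrangement.
physical_copy = list(original)

# Logical copy: accessed the same way, but arranged differently in storage.
logical_copy = sorted(original)

for key in ("a", "b", "c"):
    # Both copies answer the same access request with the same result.
    assert lookup(physical_copy, key) == lookup(original, key)
    assert lookup(logical_copy, key) == lookup(original, key)
```

In this sketch the physical copy compares equal to the original layout, whereas the logical copy does not, although both may be accessed through the same lookup path.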
Copies or “proxies” of data may also be created in a number of ways. For example, FIG. 1 illustrates a block diagram of a data processing system including a backup/restore utility for generating a copy of a database according to the prior art. Data processing system 100 of FIG. 1 includes a first node 102 including a primary volume 104 used to store a database 106, a secondary volume 108 communicatively coupled to first node 102 and used to store a backup 110 of database 106, and a secondary volume 112 similarly coupled to first node 102 and used to store restored database 114 as shown.
First node 102 further includes application software 116 coupled to a database management system (DBMS) 118 which is in turn coupled to primary volume 104. DBMS 118 may be coupled to primary volume 104 directly, using a file system 120 and/or volume manager 122, or using a backup/restore utility 124 as described further herein. A database management system (DBMS), or “database manager”, is a program that lets one or more computer users or applications such as application software 116 create and access data in a database. A DBMS manages requests so that users and other application software programs are free from having to understand where the data is physically stored and, in a multi-user, multi-processing, or parallel processing system, what other entities may also be accessing the data.
In handling requests, a DBMS ensures the integrity of data (that is, making sure it continues to be accessible and is consistently organized as intended) and security (making sure only those with access privileges can access the data). Some examples of personal computer platform relational DBMSs are Access® and SQL Server® provided by Microsoft Corporation of Redmond, Wash., DB2® provided by International Business Machines (IBM) Corporation of Armonk, N.Y., ORACLE DBMS® provided by Oracle Corporation of Redwood Shores, Calif., and Sybase Adaptive Server® Enterprise provided by Sybase Corporation of Dublin, Calif.
Backup/restore utility 124 may be implemented as an independent element as illustrated in FIG. 1, or may be incorporated into one or more other elements (e.g., DBMS 118) of data processing system 100. Using backup/restore utility 124, a logical copy of database 106 can be created by first generating backup 110 on secondary volume 108 and then processing or “restoring” backup 110 to create restored database 114 on secondary volume 112. Because backup 110 must be restored before it can be accessed, it is not itself considered a logical copy of database 106. Restored database 114, by contrast, is a logical copy, as it may be accessed in the same manner as the original data it replicates. While backup 110 and restored database 114 have been illustrated as being generated on separate secondary volumes, this need not necessarily be the case. In other data processing systems according to the prior art, backup 110 and/or restored database 114 may be generated on a single secondary volume or within primary volume 104.
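By way of illustration only, the two-step backup and restore sequence described above may be sketched as follows. The sketch assumes a database represented as an in-memory dictionary and a backup image represented as serialized bytes; the functions backup and restore and the sample data are hypothetical and are not the interface of any particular backup/restore utility.

```python
import json


def backup(database):
    """Serialize the database into an opaque backup image (bytes).

    The image cannot be queried like the database; it must be restored first.
    """
    return json.dumps(database, sort_keys=True).encode("utf-8")


def restore(backup_image):
    """Process a backup image into a database that can be accessed normally."""
    return json.loads(backup_image.decode("utf-8"))


# Hypothetical original database, standing in for database 106.
database = {"customers": [{"id": 1, "name": "Ada"}]}

image = backup(database)    # stands in for backup 110
restored = restore(image)   # stands in for restored database 114

# The restored database is accessed in the same manner as the original.
assert restored["customers"][0]["name"] == database["customers"][0]["name"]
```

Both steps in this sketch touch the entire database, which is consistent with the cost concern raised for large databases.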
Where the original database is large, backup and/or restoration can take considerable amounts of time and resources to complete, making the illustrated technique for creating a data or database copy undesirable in a number of situations. Similarly, as logical names (e.g., a database name, table space name, or partition number) are specified in a conventional request to create a backup such as backup 110, a user implementing backup/restore utility 124 may not be required to have any knowledge of the physical components or structure of database 106. For the same reason, however, backup/restore utility 124 may not be integrated with any other utilities or data management resources which require knowledge of such components and, consequently, may not take advantage of any newly-developed data management applications or utilities.
Another technique used to create one or more copies of data relies on data volume mirroring. FIG. 2 illustrates a block diagram of a data processing system for generating a copy of a database using a split mirror according to the prior art. Data processing system 200 includes a first node 202 including a primary volume 204 used to store a database 206 and a secondary mirror volume 212 coupled to the first node 202. First node 202 of the embodiment of FIG. 2 further includes application software 216 coupled to a DBMS 218, which is in turn communicatively coupled to primary volume 204. DBMS 218 of the illustrated prior art embodiment may be coupled to primary volume 204 using a file system 220 and/or volume manager 222 or directly (not shown) as described further herein.
A “split mirror” is a point-in-time copy of one or more disk volumes that can be attached to the same node as, or a different node from, the disk volume(s) being mirrored. A split mirror is generated by first creating a mirror by “mirroring” or copying write operations or “updates” performed on one volume to another secondary volume. This copying or duplication can be done as writes to the “mirrored” volume are received, or alternatively following a suspension of activity (e.g., suspension of database 206) on the mirrored volume. Where mirroring is performed following a suspension of activity on the mirrored volume, writes or updates received following suspension may be stored in system memory without being written to the mirrored volume's persistent store. Consequently, write operations or updates received when activity is suspended on the mirrored volume will not be copied to the volume's mirror. The created mirror is then “split off” by ceasing to copy write operations or updates, thus creating a PIT image of the mirrored data volume at the point in time when write operation or update mirroring ceases. While split mirror creation in the illustrated embodiment of FIG. 2 is performed using volume manager 222, mirroring and/or split mirror creation may be performed in hardware or software by any of a variety of data processing system elements (e.g., application software 216, DBMS 218, file system 220, or the like).
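By way of illustration only, the creation of a split mirror may be sketched as follows. The sketch assumes a toy in-memory volume in which writes are mirrored as they are received; the class MirroredVolume and its members are hypothetical and do not represent the interface of volume manager 222 or of any actual volume manager.

```python
class MirroredVolume:
    """Toy volume whose writes are copied to a mirror until the mirror is split."""

    def __init__(self, num_blocks):
        self.primary = [b""] * num_blocks
        self.mirror = [b""] * num_blocks
        self.mirroring = True  # True while writes are copied to the mirror

    def write(self, block, data):
        """Apply a write to the primary and, while mirroring, to the mirror."""
        self.primary[block] = data
        if self.mirroring:
            self.mirror[block] = data

    def split(self):
        """Cease mirroring; the mirror becomes a point-in-time (PIT) image."""
        self.mirroring = False
        return self.mirror


vol = MirroredVolume(4)
vol.write(0, b"alpha")
vol.write(1, b"beta")

pit_image = vol.split()   # PIT image as of the moment mirroring ceases

vol.write(0, b"gamma")    # later write reaches only the primary volume
```

In this sketch the mirror spans every block of the volume regardless of how many blocks hold database data, which reflects the volume-level granularity of the technique.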
While database copying operations which utilize split mirrors may not require time- and resource-intensive backup and restore operations, the use of split mirroring to duplicate data suffers from a number of shortcomings. For example, mirroring is performed at a volume level, requiring that a copy of an entire volume be made even if a copy of a comparatively small amount of data (e.g., one or more small components of a database, or a database containing little data) is to be made. Consequently, available storage space may be wasted and the number of databases for which copies may be made may be limited. Additionally, integration of mirroring and split mirror copies with other data storage and management applications and utilities in existing systems is limited.