To meet business objectives for database availability and to comply with legal and regulatory requirements for data retention, it is becoming increasingly important to maintain a highly reliable backup and recovery infrastructure. For scalability in modern enterprise environments, where large data sets from multiple databases must be serviced concurrently, it is also crucial to minimize performance overhead and to optimize hardware resource utilization.
For cost effectiveness, it is prudent to implement a multi-tiered storage strategy in which data is archived on a lower-cost storage tier, such as magnetic tape. In this manner, the total cost of storage media may be reduced while still meeting data retention requirements. On the other hand, supporting archival storage tiers incurs additional costs in the form of hardware infrastructure. In the case of magnetic tape, a significant investment in tape libraries is required, which may include tape drives, tape cartridge slots, and robotic loaders. Since it may not be cost effective to provide an individual tape library for each database, a typical configuration may provide shared access to a tape library over a network.
Because tape management software is generally installed separately on each database in a shared tape library configuration, administration is fragmented, and backup coordination between multiple databases is difficult or impossible. Accordingly, database administrators must carefully and manually plan, manage, and maintain tape backup schedules for each individual database to avoid contention at the tape library. To avoid performance impacts and downtime at production database systems, it is desirable to complete tape backup operations as quickly as possible, during the periods when each database is most idle and processing and I/O resources are available. However, since these idle periods typically fall at around the same time for all of the databases, the tape library must be over-provisioned to avoid contention and to complete backups within their scheduled backup time windows.
Accordingly, the hardware utilization of the shared tape library is suboptimal, as the tape library may be idle for a majority of the time outside of the scheduled backup time windows. Furthermore, since the tape backup schedules are manually set for each database, coordination between the databases is non-existent or left to the manual adjustment of a database administrator, who may be unable to accurately weigh the competing backup workload demands of each database. Since the tape management software is often general-purpose software that operates at the file system level, actual changes to the database may only be captured at the file or file allocation unit level, leading to extraneous backups of unchanged data and suboptimal database restore procedures with lengthy mean times to recovery. Adding support for new databases and accommodating evolving patterns of data access may also require corresponding changes to the tape backup schedules, presenting a heavy and continuous administrative burden.
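The over-provisioning and low utilization described above can be illustrated with a small calculation. The following is a minimal sketch only: the database names, backup windows, and drive counts are hypothetical values chosen for illustration, and the model assumes that each backup occupies a fixed number of tape drives for whole hours within a single day.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupWindow:
    """A manually scheduled backup window (all values are hypothetical)."""
    database: str
    start_hour: int  # inclusive, 0-23
    end_hour: int    # exclusive, 1-24, same day
    drives: int      # tape drives occupied during the window


def hourly_drive_demand(windows):
    """Return the number of tape drives requested in each hour of the day."""
    demand = [0] * 24
    for w in windows:
        for hour in range(w.start_hour, w.end_hour):
            demand[hour] += w.drives
    return demand


def peak_and_utilization(windows):
    """Return (drives needed to avoid contention, daily drive utilization)."""
    demand = hourly_drive_demand(windows)
    peak = max(demand)
    if peak == 0:
        return 0, 0.0
    # Drive-hours actually used, divided by drive-hours provisioned per day.
    return peak, sum(demand) / (peak * 24)


# Three databases whose idle periods coincide in the early morning.
windows = [
    BackupWindow("db_a", start_hour=1, end_hour=4, drives=2),
    BackupWindow("db_b", start_hour=2, end_hour=5, drives=2),
    BackupWindow("db_c", start_hour=1, end_hour=3, drives=1),
]

peak, utilization = peak_and_utilization(windows)
print(f"drives required: {peak}, daily utilization: {utilization:.1%}")
```

Under these assumed values, the library must hold five drives to cover the overlapping windows, even though no single database uses more than two, and those drives sit idle for most of the day.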
Based on the foregoing, there is a need for a method to provide a highly optimized and cost-efficient tape backup infrastructure suited to the needs of databases while imposing minimal administrative burden and minimal performance overhead on production databases.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.