The present invention relates to computer primary disk storage systems and to the ongoing protection of the data on those systems from various forms of data loss or corruption. These forms of data loss include accidental file deletion by a user or application, virus attacks, hardware failure and/or the loss of a data center facility.
The data on primary disk storage systems must be protected periodically. Numerous data protection solutions are available, but they suffer from the following problems:

Many data protection schemes must be employed in combination to fully protect data. The multiplicity of these schemes (e.g., RAID, snapshots, backup, replication) causes over-replication of primary data and increases the complexity of data recovery administration.

The deployment of data storage and data protection systems today rarely extends beyond a single data center. The result is isolated islands of data storage, data protection and data management, in which some data centers have a surplus of storage capacity that they cannot effectively share with other data centers that need additional storage capacity.

Traditional data protection systems rely almost exclusively on magnetic tape because of its low cost. There are significant reliability and long-term integrity issues associated with reading data that has been recorded on magnetic tape. Tape media degrades each time it is used in a tape drive because of friction between the medium surface and the tape drive head. In addition, tape media that is stored in an archive facility must be maintained within tight environmental limits of temperature and humidity. These limits are likely to be exceeded as tapes are transported from a company's air-conditioned data center into archive storage trucks and then into the air-conditioned offsite storage facility. Note the relatively narrow temperature and humidity ranges for magnetic tapes that are stored in an archive environment:

| | Magnetic Tape | Magnetic Disk |
|---|---|---|
| Archive temperature (°C) | 18 to 28 | −40 to 65 |
| Archive humidity (%) | 40 to 60 | 5 to 95 |

Generations of tape media and tape drive technology regularly become obsolete, making long-term archiving using magnetic tape a significant challenge. A company with hundreds or thousands of magnetic tapes that were written on older generation tape drives must maintain one or more of these older tape drives to be able to access data on these older tapes.

New computer applications and types of digital data are causing a 60% year-over-year increase in demand for primary disk storage capacity. While magnetic disk technology has kept pace with the demand for providing cost-effective, high capacity primary storage systems, magnetic tape has not. In 1986, magnetic tape was approximately 35 times less expensive than magnetic disk, but that cost advantage has eroded from 35× to approximately 2× at the present time. It is expected that this cost erosion will continue into the future, eventually making magnetic tape a more costly alternative to magnetic disk storage.

Currently, each data storage system is made up of a collection of a dozen separate data storage, data protection and data management system and software components. Such systems experience interoperability problems among components. Each of the many components typically has its own management user interface that needs to be mastered by a storage administrator.

With the multiplicity of data protection systems and components, such as RAID, snapshots, tape backup and file and block replication, it is difficult for a storage administrator to know how best to respond in the event of actual data loss.

With today's data storage and data protection systems, one megabyte of primary storage data can generate from 10 megabytes to 50 megabytes of protected data.
This over-replication of data comes from RAID redundancy drives, snapshot histories, multiple sets of weekly full backup tapes, daily incremental backup tape sets and block and file replication systems. For example, a company that retains just 3 months of weekly full backups will have replicated the data from the primary storage system about 13 times, since the data on successive weekly full backups is almost completely identical.

Most disaster recovery systems in place today employ replication between just two specific storage subsystems. They do not provide a logical abstraction of virtual storage capacity that would enable any primary storage resource to be protected by any other local or remote protection resource.

Many data replication products are available today. Replication products, as they have been designed, replicate all changes between two systems. However, an accidental deletion of a file from one of the systems in the replication set will cause the deletion to occur at the other replicated system(s) as well. When this occurs, the data that was deleted must be recovered from backup tapes. Therefore, today's replication systems continue to rely on magnetic tape based backups for complete protection.

Snapshot-based data protection has become popular because it gives end-users the ability to recover files that they have recently deleted. But snapshot systems cannot function as a replacement for traditional tape backup. Snapshots depend on the current version of a filesystem being operational in order to recover earlier snapshot versions of files. Therefore, today's snapshot-based systems continue to rely on magnetic tape based backups for complete protection.

Standard weekly-full/daily-incremental tape backup schedules today are designed around the long search times of traditional magnetic tape.
During a data restore operation, a full tape is first loaded and then a number of incremental tapes must be loaded thereafter. It takes tens of minutes to search for and recover the desired data item on each tape, so the standard weekly full backup model limits the number of tapes used in recovery to one full tape and at most 5 incremental tapes. If tape media latency could be eliminated from the data recovery process by leveraging the much shorter seek and rotational delays of magnetic disk technology, full backups could be performed less frequently. For instance, a full backup may be taken once a month or once a quarter, with incremental backups occurring daily between these full backups. When magnetic disk is used as a backup medium, accessing and recovering multiple weeks or months of incremental backup data from disk is thousands of times faster than from traditional tape. Weekend full backup runs also strain networks and administrators, who must get all of the primary storage data committed to magnetic tape before the weekend ends. As the amount of primary storage data grows, the time it takes to back up all of this additional primary storage grows proportionally.

The value of certain collections of data to the survivability of a company changes over time. For example, a database may start out as a non-critical application yet grow to become mission critical as more of the business depends on it for daily operation. Conversely, a database that was once critical to daily operation of the business becomes less important as it is replaced by newer systems. With current tape-based data protection schemes, it is difficult to increase or decrease the degree of protection that is applied to specific sets of primary storage data as their value to the corporation changes over time, particularly when that data has already been protected on hundreds or thousands of backup tapes.

While magnetic tape provides good sequential access performance for today's backup software products, its random access time is approximately a thousand times slower than that of magnetic disk. This limits the use of tape to data streaming applications such as backup and archiving.
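Several of the figures cited in the problems above follow from simple arithmetic, which the following Python sketch makes explicit. It is only a back-of-the-envelope illustration: the function names are hypothetical, three months is assumed to be 91 days, and each incremental backup since the last full backup is assumed to occupy its own tape.

```python
def weekly_full_copies(retention_days: int) -> int:
    """Count the weekly full backups retained over a retention window."""
    return retention_days // 7


def restore_tape_loads(incrementals_since_full: int) -> int:
    """Tapes loaded for a restore: one full tape plus each later incremental.

    Assumes one tape per incremental backup (an illustrative simplification).
    """
    return 1 + incrementals_since_full


def capacity_after(initial_capacity: float, years: int,
                   annual_growth: float = 0.60) -> float:
    """Compound storage demand at a fixed year-over-year growth rate."""
    return initial_capacity * (1.0 + annual_growth) ** years


if __name__ == "__main__":
    # Three months (assumed 91 days) of weekly fulls: about 13 copies.
    print(weekly_full_copies(91))
    # Weekly-full model: one full tape plus at most 5 incremental tapes.
    print(restore_tape_loads(5))
    # Relative demand after 5 years at 60% year-over-year growth.
    print(capacity_after(1.0, 5))
```

Under these assumptions, moving to a monthly full backup on tape would raise the worst-case number of tape loads per restore from 6 to roughly 30, which is why the less frequent full backups described above are practical only when the backup medium is disk.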