Systems and methods have been used for some time for storing computer data files, for example, in a digital format. Computer data files have been stored in random access memory (RAM), on punch cards, tapes, diskettes, compact disks (CDs), and flash memories, and on many other media. Today, large quantities of information are stored in the form of computer data files. Storing information in computer data files generally results in the information being easier to retrieve and easier to search and manipulate using computers, and requires less storage space than other systems and methods for storing information.
Unfortunately, even the best systems and methods of storing computer data files are not completely reliable. Data can be lost, for example, due to failure of the computer storage medium, operator error, software problems or viruses, or loss or destruction of the storage media. As a result, in addition to storing computer data files in a primary storage location, computer data files have been stored in a secondary storage location to prevent loss of the files in the event the primary storage location is damaged or lost. This is known as a backup system. Computer data files may be stored or archived in a backup system periodically, for example. Backup systems have been used that store computer data files at a remote location accessed via a network to protect the computer data files even if the entire facility where the primary storage is located is destroyed. Since computer data files, or parts thereof, may be changed or deleted by users, information may be lost by being deleted or overwritten. To preserve such information, different versions of the same data files have been archived.
Computer data files may be very large, and it may take a lot of memory to store many large computer data files, especially if multiple versions of each file are preserved. In addition, if computer data files are stored off site, it may take a lot of network bandwidth to transmit computer data files for backup archival. Various systems and methods have been used to reduce the memory and network bandwidth required to store backup computer data files. For instance, a checksum may be used to determine whether changes have been made to particular files or blocks of information, and after a file or block has been saved once, a new version may not be created if the previous version has the same checksum. A checksum may be, for example, the sum of the digits in the digital data file or the result of other mathematical computations on the numerical values of characters in the digital data file.
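By way of illustration only, the checksum comparison described above might be sketched as follows. The names BackupStore, checksum, and simple_sum are hypothetical and do not appear in any referenced system, and the use of SHA-256 as the "other mathematical computation" is an assumption made for the example.

```python
import hashlib


def simple_sum(data: bytes) -> int:
    """Checksum formed by summing the numerical values of the bytes."""
    return sum(data) % 2**32


def checksum(data: bytes) -> str:
    """Checksum formed by a different computation (SHA-256, assumed here)."""
    return hashlib.sha256(data).hexdigest()


class BackupStore:
    """Hypothetical in-memory archive keeping versions of each named file."""

    def __init__(self):
        # Maps a file name to a list of (checksum, contents) versions.
        self.versions = {}

    def backup(self, name: str, data: bytes) -> bool:
        """Archive data; return False if it matches the previous version."""
        digest = checksum(data)
        history = self.versions.setdefault(name, [])
        if history and history[-1][0] == digest:
            return False  # same checksum as the last version: nothing stored
        history.append((digest, data))
        return True


store = BackupStore()
store.backup("report.txt", b"first draft")   # stored as a new version
store.backup("report.txt", b"first draft")   # skipped, checksum unchanged
store.backup("report.txt", b"second draft")  # stored as a new version
```

A simple sum such as simple_sum is inexpensive to compute, but it does not change when bytes are merely reordered, which is one reason other computations may be preferred for detecting changes.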
Another method that has been used to reduce the amount of data that must be stored or transmitted is to store data representing changes that have been made to a file rather than storing multiple complete versions of the same file. Thus, when a backup file is retrieved, the first version is retrieved, and then the changes for the different versions are made until the desired version of the file is obtained. Such a system is called an incremental or differential backup system. Examples of such systems and related technology are described in U.S. Pat. No. 6,629,110 (Cane et al.), U.S. Pat. No. 6,513,050 (Williams et al.), and U.S. Pat. No. 6,542,906 (Korn), which are all incorporated herein by reference. Further, various methods of file compression have been used to reduce the size of files that are stored or transmitted. Although such systems and methods reduce the amount of data that is transmitted and stored, it may take more computer processing time and capacity to restore files that are stored as a number of changes to a base file.
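As a minimal sketch of storing changes rather than complete versions, and not a description of the methods of the patents cited above, a base file and a chain of changes might be handled as follows. The function names make_delta, apply_delta, and restore are hypothetical, and the use of Python's difflib module to find the changes is an assumption made for the example.

```python
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes) -> list:
    """Describe new as ranges copied from old plus newly supplied bytes."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))      # reuse bytes old[i1:i2]
        elif tag in ("replace", "insert"):
            ops.append(("data", new[j1:j2]))  # only the changed bytes are stored
        # "delete" needs no entry: the deleted range is simply not copied
    return ops


def apply_delta(old: bytes, delta: list) -> bytes:
    """Rebuild the next version from the previous version and one delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


def restore(base: bytes, deltas: list, version: int) -> bytes:
    """Start from the first version and apply changes up to the desired one."""
    data = base
    for delta in deltas[:version]:
        data = apply_delta(data, delta)
    return data


v0 = b"alpha beta gamma"
v1 = b"alpha BETA gamma"
v2 = b"alpha BETA gamma delta"
deltas = [make_delta(v0, v1), make_delta(v1, v2)]
latest = restore(v0, deltas, 2)  # rebuilds v2 from v0 and two sets of changes
```

In this sketch, restore performs one pass per intervening version, which illustrates why restoring a file stored as a number of changes to a base file may take more processing than retrieving a complete copy.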
Further, it is desirable to keep confidential at least some information contained in computer data files. When data is transmitted over a widely used network, such as the Internet, the confidentiality of the computer data files may be jeopardized. To protect the confidentiality of such information, various forms of encryption have been employed. Encryption may utilize a key to encrypt and decrypt computer data files. Encryption has been used in conjunction with backup systems.
Backup systems exist for many computer data files today, requiring a large amount of storage space, network bandwidth, and computational time. Thus, needs or benefits exist for storage and backup systems and methods that are more efficient. Benefits of improved systems and methods may include requiring less storage space, requiring less information to be transmitted, reducing disk or computer activity, or a combination thereof.