In an information processing system, periodic archival of static, unused objects is desirable both to optimize access to more active items and to guard against failures, such as disk head crashes, and human errors, such as accidental deletions. Consequently, periodic backup to magnetic tape, with corresponding purging of selected files from online disks, is a common practice.
Data archival mechanisms need to assure the integrity of the data stored thereby. Users of the data need to know that the data is persistent, and also that there is a reasonable turnaround time for retrieval. Often this entails copying such data entities, hereinafter files, to an inexpensive, high volume, but not necessarily fast access, form of physical storage such as magnetic tape. Corresponding index information regarding the magnetic tape location of a particular file can be retained online. Since index information referencing a file consumes much less storage than the file itself, such information is not as unwieldy as its actual data file counterpart. In order to retrieve a file, the index is consulted to determine the physical volume on which the file resides. That physical magnetic tape volume is then searched for the desired file. Although sequential, this part of the search can be performed within a reasonable time, since the index has already narrowed the field to a single volume. Such indexing schemes are numerous and are well known to those skilled in the art.
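The two-stage retrieval described above (consult the online index for the volume, then search that one volume sequentially) can be illustrated by the following minimal sketch. It is not taken from any particular archival system: the names TapeRecord, archive and retrieve, the volume labels, and the in-memory dictionaries standing in for the online index and the tape library are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TapeRecord:
    """One archived file as written to a tape volume (hypothetical layout)."""
    name: str
    data: bytes


# Online index: file name -> label of the volume holding it (small, kept on disk).
VolumeIndex = Dict[str, str]
# Offline storage: volume label -> records in the order they were written.
TapeLibrary = Dict[str, List[TapeRecord]]


def archive(index: VolumeIndex, library: TapeLibrary,
            volume: str, name: str, data: bytes) -> None:
    """Append a file to the end of a volume and record its location online."""
    library.setdefault(volume, []).append(TapeRecord(name, data))
    index[name] = volume


def retrieve(index: VolumeIndex, library: TapeLibrary,
             name: str) -> Optional[bytes]:
    """Consult the index for the volume, then scan only that volume in order."""
    volume = index.get(name)
    if volume is None:
        return None  # no index entry: the file was never archived
    found = None
    for record in library.get(volume, []):  # sequential pass over a single tape
        if record.name == name:
            # Assumption: a later record with the same name is a newer revision.
            found = record.data
    return found


index: VolumeIndex = {}
library: TapeLibrary = {}
archive(index, library, "VOL001", "report.txt", b"q1 figures")
archive(index, library, "VOL002", "notes.txt", b"meeting notes")
print(retrieve(index, library, "notes.txt"))
```

In this sketch the index stores only a volume label per file, so it remains small relative to the archived data, and the sequential scan never touches more than one volume.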
Images written to magnetic tape, however, remain fixed and readable unless physically overwritten. Successive revisions of backups tend to render the previous versions obsolete, although the earlier versions still exist on the tape. Such a tape might well be discarded, thereby placing it in the public domain, or partially reused for another purpose, leaving the status of the remaining information uncertain, since it may persist at arbitrary locations on the tape, unprotected. Further attenuation of control over the data occurs when another party performs the archive. Since the archiving operation usually bears little relation to the generation of the data, it is often desirable to delegate this operation. The archive operation may be undertaken by a co-located group, a group at a remote location of the same organization, or an external contractor, and could involve either electronic or physical media of data transmission. Delegation of the backup operation to an archive server, however, raises issues of security and privacy, since the corporation or individual generating the data (hereinafter source organization) has little control over access to the data at a remote facility. With regard to file deletion, moreover, magnetic tape does not lend itself well to selective rewrite. Due to the sequential nature of magnetic tape, intra-tape modifications can compromise subsequent files. It is therefore difficult for an archive service to ensure the integrity of data upon retrieval requests, to provide effective deletion of obsolete data, and to maintain the secrecy of data while it is under the control of the archive mechanism.