The present invention relates to data archive operations for information processing systems, and more particularly to security features for such operations.
In an information processing system, periodic archival of static, unused objects is desirable to optimize access to more active items and to guard against failures such as disk head crashes and human errors such as accidental deletions. Consequently, periodic backup to magnetic tape and corresponding purging of selected files from online disks are common practice.
Data archival mechanisms need to assure the integrity of the data they store. Users of the data need to know that the data is persistent and that retrieval has a reasonable turnaround time. Often this entails copying such data entities, hereinafter files, to an inexpensive, high-volume, but not necessarily fast-access, form of physical storage such as magnetic tape. Corresponding index information regarding the magnetic tape location of a particular file can be retained online. Since index information referencing a file consumes much less storage than the file itself, such information is not as unwieldy as the data file it references. To retrieve a file, the index is consulted to determine the physical volume that holds the file. The physical magnetic tape volume is then searched for the desired entity. Although sequential, this part of the search can be performed within a reasonable time, since the indexing system has narrowed the field to a single volume. Such indexing schemes are numerous and are well known to those skilled in the art.
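The two-step retrieval described above, an online index lookup followed by a sequential search of a single volume, can be sketched as follows. This is a minimal illustration only: the file names, volume labels, and the in-memory lists standing in for tape volumes are all hypothetical and do not come from the original text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    volume: str  # label of the tape volume holding the file (hypothetical naming)


# Online index: small and fast to consult, maps file name -> tape volume.
index = {
    "report_q1.txt": IndexEntry(volume="TAPE-0007"),
    "ledger.db": IndexEntry(volume="TAPE-0012"),
}

# Offline volumes, modeled as ordered lists of (name, data) records.
volumes = {
    "TAPE-0007": [("memo.txt", b"memo"), ("report_q1.txt", b"Q1 figures")],
    "TAPE-0012": [("ledger.db", b"ledger bytes")],
}


def retrieve(name: str) -> bytes:
    """Consult the index, then scan the one indicated volume sequentially."""
    entry = index[name]  # step 1: the index narrows the search to one volume
    for record_name, data in volumes[entry.volume]:  # step 2: sequential scan
        if record_name == name:
            return data
    raise KeyError(name)


result = retrieve("report_q1.txt")
```

The index holds only a volume label per file, which is why it can stay online while the bulk data does not.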
Images written to magnetic tape, however, remain fixed and readable unless physically overwritten. Successive revisions of backups tend to render previous versions obsolete, although the earlier versions still exist on the tape. Such a tape might well be discarded, thereby placing it in the public domain, or partially reused for another purpose, leaving the remaining information scattered across the tape, unprotected and of uncertain status. Control over the data is further attenuated when another party performs the archive. Since the archiving operation usually bears little relation to the generation of the data, it is often desirable to delegate this operation. The archive operation may be undertaken by a colocated group, a group at a remote location of the same organization, or an external contractor, and could involve either electronic or physical media of data transmission. Delegation of the backup operation to an archive server, however, raises issues of security and privacy, since the corporation or individual generating the data (hereinafter the source organization) has little control over access to the data at a remote facility. File deletion poses a further problem, because magnetic tape does not lend itself well to selective rewrite: due to the sequential nature of magnetic tape, intra-tape modifications can compromise subsequent files. It is therefore difficult for an archive service to ensure integrity of data upon retrieval requests, provide effective deletion of obsolete data, and maintain secrecy of data while under the control of the archive mechanism.
The present invention addresses the problem of privacy for archived data by providing the source organization with control over the data without burdening the reliability of retrieval with the problems caused by sequential overwrite. An encryption function applied to the archived data renders it unintelligible to unauthorized observers. Encryption transforms the data through arithmetic manipulations that depend on a specific value called a key. The encrypted form bears a specific mathematical relationship to the original data, the key, and the encryption algorithm being used. Returning the data to its original form involves applying the corresponding inverse function to the encrypted form. Without the proper key, however, it is very difficult to determine the inverse, or decryption, function. The security provided by encryption rests on the premise that, with a sufficiently large key, substantial computational resources are required to determine the original data without the key. Encrypting a file with a particular key, and then encrypting that key itself using a master key, therefore allows another party to physically maintain and store the data while the originator, or source, of the data retains access control. Additional security and authentication measures can also be taken, such as further encrypting the key or the data at the server with a server key, and using cipher block chaining to impose dependencies among a sequence of file blocks.
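The two-level scheme described above, in which a file is encrypted with its own key and that key is in turn encrypted with a master key, can be illustrated with a short sketch. The cipher here is an assumption made purely for illustration: a toy keystream built from SHA-256 in a counter arrangement and combined with the data by XOR. The original text does not name an algorithm, and a real system would use a vetted cipher. The names `keystream_xor`, `master_key`, and `file_key` are hypothetical.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher for illustration only (not for production use).

    XORs the data with a SHA-256 counter keystream. Because XOR is its own
    inverse, the same function both encrypts and decrypts.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


master_key = secrets.token_bytes(32)  # held only by the source organization
file_key = secrets.token_bytes(32)    # generated per archived file

plaintext = b"quarterly results"
encrypted_file = keystream_xor(file_key, plaintext)  # file under its own key
encrypted_key = keystream_xor(master_key, file_key)  # key under the master key

# Another party may now store encrypted_file and encrypted_key.
# Only the holder of master_key can reverse both steps:
recovered_key = keystream_xor(master_key, encrypted_key)
recovered = keystream_xor(recovered_key, encrypted_file)
```

Because the storing party never sees `master_key`, holding both ciphertexts gives it no practical way to read the file, which is the access-control property the paragraph describes.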
In accordance with the present invention, an archive server utilizes encryption techniques to maintain both security and integrity of stored data by maintaining a series of keys for each archived file, and encrypting both the archived file and the key to which it corresponds. The archive server manages the encrypted files and the corresponding encrypted keys, while the source organization maintains only the master key required to recover the individual encrypted keys. Through this arrangement, the source organization retains control over, and assurance of, access to the archived data, while the archive server manages the physical storage medium and performs individual encrypted file manipulation requests at the behest of the client. The archive server has access only to the encrypted data files and encrypted keys, effectively managing these files and keys as abstract black-box entities, without the ability to examine and interpret their contents.
Three common transactions involving archived encrypted files are effected by the present invention. To archive files, a source organization periodically transfers them from its online repository, usually a fast-access storage medium such as a disk, to the archive server. To retrieve archived information, the source organization issues a retrieval transaction identifying a particular file. Finally, when an item is to be deleted, a deletion instruction identifying a particular file is issued to the archive server.
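The three transactions can be sketched as a small client and server pair. Several points here are assumptions rather than statements from the text: encryption is performed on the source side before transfer, the tape is modeled as an append-only list, the encrypted keys live in rewritable storage, and deletion is modeled as discarding the encrypted key so that the ciphertext left on tape can no longer be decrypted. The cipher is a toy SHA-256 keystream used only to keep the example self-contained, and all class and method names are hypothetical.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy self-inverse cipher (illustration only, not for production use)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + counter).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


class ArchiveServer:
    """Stores opaque blobs; never holds the master key or any plaintext."""

    def __init__(self) -> None:
        self.tape = []  # append-only list of (name, encrypted_file) records
        self.keys = {}  # name -> encrypted_key, kept in rewritable storage

    def archive(self, name: str, encrypted_file: bytes, encrypted_key: bytes) -> None:
        self.tape.append((name, encrypted_file))
        self.keys[name] = encrypted_key

    def retrieve(self, name: str):
        encrypted_key = self.keys[name]  # raises KeyError once deleted
        for record_name, blob in self.tape:  # sequential scan of the tape
            if record_name == name:
                return blob, encrypted_key
        raise KeyError(name)

    def delete(self, name: str) -> None:
        # Assumed mechanism: drop only the encrypted key; the tape is untouched.
        del self.keys[name]


class SourceClient:
    """Source organization: holds the master key and nothing else."""

    def __init__(self, server: ArchiveServer) -> None:
        self.server = server
        self.master_key = secrets.token_bytes(32)

    def archive(self, name: str, data: bytes) -> None:
        file_key = secrets.token_bytes(32)  # fresh key per file
        self.server.archive(
            name,
            keystream_xor(file_key, data),
            keystream_xor(self.master_key, file_key),
        )

    def retrieve(self, name: str) -> bytes:
        blob, encrypted_key = self.server.retrieve(name)
        file_key = keystream_xor(self.master_key, encrypted_key)
        return keystream_xor(file_key, blob)

    def delete(self, name: str) -> None:
        self.server.delete(name)


server = ArchiveServer()
client = SourceClient(server)
client.archive("plan.txt", b"five-year plan")
restored = client.retrieve("plan.txt")
client.delete("plan.txt")
```

After the deletion call, the encrypted record is still physically present in `server.tape`, but no party retains the key material needed to decrypt it, so later records on the same volume are never disturbed.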
One benefit provided by this arrangement is that the archive server cannot read the data it stores, thereby providing the source organization with assurances of access control and privacy, while relieving the source organization of archive cataloging and physical storage duties. Furthermore, effective deletion of information stored on archive tapes is achieved without physical modification of the magnetic tape, thereby avoiding compromise of subsequent data on the same volume.