1. Field of the Invention
The present invention relates to an electronic library which is a file management apparatus, and more specifically to a system for sweeping and backing up a file registered in the electronic library for electronically guaranteeing the originality of a document.
2. Description of the Related Art
With the recent development of information processing devices, documents conventionally managed on paper have come to be managed as electronic data. However, since an electronic document can easily be copied or falsified on an information processing device, there is a strong possibility that the original document cannot be identified among a plurality of documents stored in the device. To solve this problem, an electronic library has been developed as an information processing device for managing electronic original documents.
An electronic library, that is, a device for discriminating an original document from a copy, guarantees originality by providing the following functions.
(1) An original document can be discriminated from a copy. The original document remains unique even when it is moved to another electronic library.
(2) When an original document is falsified, it can be detected.
(3) All operations performed on an original document are recorded, and any falsification of the operation records is detected. Operations include registering, referencing, updating, moving, copying, generating a backup, sweeping data to another medium, restoring data, etc. In an updating operation, the entity of each generation is stored.
(4) An access right to an original document is controlled.
With these functions, files (electronic documents, etc.) are basically stored in a RAID (redundant array of independent disks) device. Because the total volume of stored files becomes large, it is necessary to add disks or sweep a part of the files to an external medium before the RAID device becomes full.
When a backup has been made and a storage medium (a disk, etc.) containing the data of an original file is destroyed, the data is restored from the backup medium to the state that existed at the time of the backup. Obviously, information added after the previous backup cannot be restored. However, in an electronic library that discriminates an original document from a copy, the originality (or the uniqueness of an original document) must also be maintained when data is restored from a backup.
FIG. 1 shows the sequence of documents to be managed as a unit.
When a document 1 is first generated, it is stored in an electronic library as the first version of the document 1. The first version of the document 1 and the management information about it required by the electronic library to guarantee the originality of the document are managed as a set of information. When the second version of the document 1 is generated by updating the original document of the document 1, the second version of the document 1 and the management information about the second version of the document 1 are managed as a set of information. Similarly, the third version of the document 1 and the management information about the third version of the document 1 are managed as a set of information. Thus, when the document 1 is repeatedly updated sequentially into the first version through the n-th version, the first through the n-th versions of the same document 1 are referred to as one sequence. That is, when a document is changed by sequentially being updated, the history is referred to as a sequence. Therefore, if two documents are changed by being updated, etc., there exist two sequences of documents.
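The notion of a sequence can be pictured as a simple data structure. The following is only an illustrative sketch: the class names Version and Sequence, the field names, and the contents of the management information are assumptions made for the example and are not taken from the description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Version:
    """One version of a document together with its management information."""
    number: int
    content: bytes
    management_info: dict  # hypothetical stand-in for the originality data


@dataclass
class Sequence:
    """All versions of one document, kept in update order."""
    document_id: str
    versions: List[Version] = field(default_factory=list)

    def add_version(self, content: bytes, management_info: dict) -> Version:
        # Each update appends the next version; earlier versions are kept.
        version = Version(len(self.versions) + 1, content, management_info)
        self.versions.append(version)
        return version


# Two documents that are each updated form two independent sequences.
doc1 = Sequence("document 1")
doc1.add_version(b"first draft", {"operation": "register"})
doc1.add_version(b"second draft", {"operation": "update"})
doc2 = Sequence("document 2")
doc2.add_version(b"other text", {"operation": "register"})
```

In this sketch each version carries its own management information, mirroring the statement that a version and its management information are managed as a set.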
To guarantee the originality of a document, not only the functions of a file server but also the above-mentioned specific functions are required. In addition, when an original document is electronically stored, it is normally stored in a RAID device as with a common file server. When the RAID device is full, it is necessary to add a disk device or sweep older files to a tape or an MO. Therefore, the following considerations, which do not arise with a normal file server, apply to controlling the sweeping process.
(1) A file is not to be unconditionally swept. If it can be unconditionally swept, different versions of an original document in the same sequence are swept into different media, thereby complicating the management and maintenance.
(2) Swept information must not be restorable on other devices. If it were, not only would duplicate original documents arise, but serious security problems would also occur.
(3) It is necessary to detect falsification performed on an external medium into which a file is swept. Unless falsification of the external medium can be detected, the source medium could be intentionally damaged and an illegal original document could be generated from a falsified backup medium.
(4) When a swept file is used again, it is necessary to clearly manage which medium the file was swept into.
(5) There can be a case in which an external medium becomes full during the sweeping process, and there is no standby medium at hand.
(6) A sweep history file is to be copied in case of a failure of a RAID disk. As in the case described below, a disk can become full in the process of restoring files after the destruction of a RAID device.
(7) A file is not always swept into an external medium. Since the process of sweeping a file into an external medium is performed through an operator, it may be inconvenient to sweep a file online. A destination medium for a swept file can be another electronic library.
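Items (1), (3), and (4) above can be illustrated together. The sketch below moves every version of a sequence to one medium, records which medium received it, and keeps a digest so that later falsification of the medium can be noticed. It is a minimal sketch under stated assumptions: the function names, the dictionary layout, and the use of SHA-256 are choices made for the example, since the description does not specify a detection mechanism here.

```python
import hashlib
from typing import Dict, List


def sequence_digest(versions: List[bytes]) -> str:
    """Digest over all versions; length prefixes keep version boundaries distinct."""
    h = hashlib.sha256()
    for content in versions:
        h.update(len(content).to_bytes(8, "big"))
        h.update(content)
    return h.hexdigest()


def sweep_sequence(library: Dict[str, List[bytes]], history: List[dict],
                   medium: Dict[str, List[bytes]], medium_id: str,
                   doc_id: str) -> None:
    """Move every version of one sequence to a single external medium."""
    versions = library.pop(doc_id)       # all versions leave together
    medium[doc_id] = list(versions)
    history.append({"doc_id": doc_id,    # which medium holds the sequence
                    "medium_id": medium_id,
                    "digest": sequence_digest(versions)})


def verify_swept(history: List[dict], medium: Dict[str, List[bytes]],
                 doc_id: str) -> bool:
    """Return True if the medium still matches the digest in the sweep history."""
    record = next(r for r in history if r["doc_id"] == doc_id)
    return sequence_digest(medium[doc_id]) == record["digest"]


library = {"A": [b"v1", b"v2"], "B": [b"v1"]}
history: List[dict] = []
tape: Dict[str, List[bytes]] = {}

sweep_sequence(library, history, tape, "tape-01", "A")
ok_before = verify_swept(history, tape, "A")
tape["A"][1] = b"forged"                 # simulate tampering with the medium
ok_after = verify_swept(history, tape, "A")
```

The digest is held in the sweep history inside the library rather than on the external medium, so altering the medium alone does not alter the value it is checked against.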
FIGS. 2A, 2B, 3A, 3B, and 3C show the cases in which a swept document cannot be restored on a storage medium in a sweeping process.
First, as shown in FIG. 2A, documents A through F stored on the first storage medium such as a disk are copied in full to the second storage medium. Then, as shown in FIG. 2B, when a new document G is added, there still remains a space on the first storage medium. Therefore, the document G can be stored.
Then, since the first storage medium has become full as shown in FIG. 3A, the documents A through C are to be swept. At this time, the documents A through C are moved to the third storage medium, and simultaneously the history that the documents A through C have been swept is generated. Next, as shown in FIG. 3B, documents H through J are added in the empty area obtained as a result of sweeping the documents A through C. When the documents stored on the first storage medium are to be copied, the difference from the previous backup contents is stored. That is, the documents G through J added after the previous backup are copied, and the history that the documents A through C have been swept is also copied.
Assume that the first storage medium, which is a RAID device, has been destroyed. Then, the documents A through J are to be restored on a new storage medium from the backup contents. However, since the new storage medium only has the capacity of the first storage medium which has been destroyed, the documents A through G can be restored, but the documents H through J cannot be restored on the new storage medium (see FIG. 3C).
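The capacity problem in this example can be checked with simple arithmetic. The sketch below assumes, purely for illustration, that all documents have the same size and that the first storage medium holds seven of them, which is consistent with the medium becoming full once the document G is added.

```python
CAPACITY = 7  # assumption: equal-sized documents, seven fit on the medium


def restore(documents, capacity):
    """Restore documents in backup order until the medium is full."""
    return documents[:capacity], documents[capacity:]


full_backup = list("ABCDEF")          # first backup (FIG. 2A)
differential_backup = list("GHIJ")    # documents added since then (FIG. 3B)
sweep_history = {"A", "B", "C"}       # recorded when A through C were swept

restored, not_restored = restore(full_backup + differential_backup, CAPACITY)
print(restored)      # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
print(not_restored)  # ['H', 'I', 'J']

# For contrast only: the documents left if swept documents were skipped.
remaining = [d for d in full_backup + differential_backup
             if d not in sweep_history]
print(remaining)     # ['D', 'E', 'F', 'G', 'H', 'I', 'J']
```

Ten documents are presented for restoration against a capacity of seven, so H through J do not fit. The final lines show that the seven documents actually present before the failure would fit if the swept documents were excluded; that step is included for contrast and is not a procedure stated in this passage.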
Also when a file is copied with the originality guaranteed, not only the functions of a file server but also the above-mentioned specific functions are required. When the original document is electronically stored, it is normally stored on a RAID device as with a common file server. When a backup copy is made, it is stored on a tape device, an MO device, etc. Files updated after the previous backup cannot be restored when the RAID device is destroyed. To avoid this, it is necessary to make a backup for each transaction, which incurs considerable deterioration in system performance. In addition, to restore a file after backing it up, the following problems must be solved in addition to the problems of a common file server.
(1) When a file is to be restored, it is necessary to completely restore the backup information. Partial restoration cannot guarantee the desired originality. For example, when an original document is updated with only a part of an updated history restored, an operation to be performed on the original document cannot be guaranteed.
(2) When a file is restored using another work medium prepared with a restoration procedure as a countermeasure against the problem described in (1) above, the system must manage which work medium is to be used when there are a plurality of media.
(3) Copied information must not be restorable on another device. Otherwise, duplicate original documents arise, and a serious security problem occurs.
(4) It is necessary to detect the falsification on an external backup medium. Unless the falsification is detected, the source medium can be intentionally damaged, and an illegal original document can be generated from the falsified backup medium.
(5) An original document may be moved to another electronic library after a backup is made. If the RAID device is then destroyed and the document is restored from the backup, duplicate original documents exist in the source device and the destination device.
(6) It is necessary to set the backup start timing flexibly and to reduce the deterioration of performance caused by the backup procedure.
(7) In a transaction relating to a plurality of original documents, the backup must be controlled such that backup copies are made either for all of the documents or for none of them. This is required to avoid a plurality of inconsistent files after the restoration of files.
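The all-or-nothing requirement in item (7) can be sketched as a staged copy that is committed only after every document has been copied. This is a minimal sketch under assumptions: the function names, the use of dictionaries for the backup medium, and the use of OSError to signal a failed copy are all hypothetical choices for the example.

```python
from typing import Callable, Dict


def backup_transaction(documents: Dict[str, bytes], backup: Dict[str, bytes],
                       write: Callable[[str, bytes], bytes]) -> bool:
    """Back up every document of a transaction, or leave the backup unchanged."""
    staged: Dict[str, bytes] = {}
    try:
        for doc_id, content in documents.items():
            staged[doc_id] = write(doc_id, content)
    except OSError:
        return False           # a copy failed, so nothing is committed
    backup.update(staged)      # commit only after every copy succeeded
    return True


def plain_write(doc_id: str, content: bytes) -> bytes:
    """Hypothetical copy routine that always succeeds."""
    return bytes(content)


def failing_write(doc_id: str, content: bytes) -> bytes:
    """Hypothetical copy routine that fails for document B."""
    if doc_id == "B":
        raise OSError("medium full")
    return bytes(content)


backup = {"A": b"old A"}
transaction = {"A": b"new A", "B": b"new B"}

first = backup_transaction(transaction, backup, failing_write)
after_failure = dict(backup)   # still only the old copy of A
second = backup_transaction(transaction, backup, plain_write)
```

After the failed attempt the backup still holds only the earlier copy of A, so a later restoration would not mix the new A with a missing B.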
The present invention aims at providing an apparatus and a method for appropriately sweeping a file such as a document, etc. or making a backup file using a file management apparatus such as an electronic library for guaranteeing the originality.
The file management apparatus according to the present invention includes a storage unit for storing an electronic file, a process unit for processing the electronic file, a history generation unit for generating a history file storing the history of the process, and a storage unit for storing the history file.
The method for managing a file according to the present invention includes: (a) a step of storing an electronic file; (b) a step of processing the file; (c) a step of generating a history file storing the history of the process; and (d) a step of storing the history file.
According to the present invention, when an electronic file is processed, for example, when an electronic file is swept and a backup electronic file is made, these histories are stored, and it can be clearly determined which electronic file is an original document. Therefore, only one original document is stored, and can be discriminated from a file stored in a storage unit or a storage medium different from that storing the original document, thereby successfully managing the electronic file without damaging the originality of the original document.
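Steps (a) through (d) of the method can be pictured with a small sketch. It is illustrative only and assumes a hypothetical FileManager class, in-memory storage, and a history kept as a list of records; none of these names or layouts come from the description.

```python
from datetime import datetime, timezone


class FileManager:
    """Stores files, processes them, and records a history of each process."""

    def __init__(self):
        self.files = {}      # (a) storage for electronic files
        self.history = []    # (d) storage for the history file

    def store(self, name, content):
        self.files[name] = content
        self._record("register", name)

    def process(self, operation, name):
        """(b) Process a stored file; only sweep and backup are sketched here."""
        if name not in self.files:
            raise KeyError(name)
        if operation == "sweep":
            content = self.files.pop(name)   # the file leaves the library
        elif operation == "backup":
            content = self.files[name]       # the file stays in the library
        else:
            raise ValueError(f"unsupported operation: {operation}")
        self._record(operation, name)
        return content

    def _record(self, operation, name):
        # (c) Generate one history record per processed operation.
        self.history.append({
            "operation": operation,
            "file": name,
            "time": datetime.now(timezone.utc).isoformat(),
        })


manager = FileManager()
manager.store("contract", b"original text")
manager.process("backup", "contract")
manager.process("sweep", "contract")
operations = [entry["operation"] for entry in manager.history]
```

Because every sweep and backup leaves a record, the history shows where the original went and which copies are backups, which is the property the preceding paragraph relies on.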