Servers are generally known that provide a file storage function, which stores various files at predetermined locations, and a file retrieval function, which retrieves a given file as needed from keyword data alone.
As security concerns have grown in recent years, files stored in such servers are increasingly encrypted. Encryption reduces the risk that file contents leak when a hard disk is replaced because of a failure and is taken out of the installation.
For this reason, servers having a file encryption function that automatically encrypts the contents of a file when the file is written have begun to appear in recent years (see, for example, Japanese Patent Laid-Open No. 10-260903). The file encryption function of such a server, the keyword data generated by the file encryption function, and the file retrieval function will be briefly explained below.
<Functional Block Arrangement of File Encryption Function>
FIG. 2 shows an example of the functional block arrangement of the file encryption function in the server. Referring to FIG. 2, a file encryption unit 200 encrypts a file. A volume disk 220 stores the encrypted file. A keyword disk 230 stores keywords included in the file to be stored. Note that the example of FIG. 2 uses a NAS (Network Attached Storage) as each of the volume disk 220 and the keyword disk 230.
A file write request 210 to the file encryption unit 200 is input to a file name/data separation unit 201, which separates it into a file name part 212 (file header field) and a file content part 211 (file data field). The file content part 211 is input to an encryption unit 203, encrypted, and then input to a file name/data combination unit 204. The file name/data combination unit 204 combines the file name part 212 with the encrypted file content part and writes the result to the volume disk 220. The file name part 212 is not encrypted because, if it were, the file to be backed up could not be identified at backup time.
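The write path of FIG. 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the server described above: the function names, the request layout, and the dictionary standing in for the volume disk 220 are hypothetical, and the SHA-256-based XOR keystream is only a placeholder for the encryption unit 203 so that the sketch runs with the standard library. It is not a secure cipher.

```python
import hashlib

def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """Placeholder for encryption unit 203 (assumption, NOT a secure cipher).

    XORs the data with a keystream derived from SHA-256 over key + offset.
    Applying it twice with the same key restores the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)

def write_file(request: dict, key: bytes, volume_disk: dict) -> None:
    # Separation unit 201: split the request into the file name part 212
    # and the file content part 211.
    name, content = request["name"], request["content"]
    # Encryption unit 203 encrypts only the content part.
    encrypted = toy_encrypt(content, key)
    # Combination unit 204: store the encrypted content under the plaintext
    # name, so that the file can still be identified at backup time.
    volume_disk[name] = encrypted

volume_disk = {}  # hypothetical stand-in for volume disk 220
write_file({"name": "file1.txt", "content": b"Tokyo office report"},
           b"demo-key", volume_disk)
```

In this sketch the file name remains readable on the volume disk while the stored content differs from the original bytes.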
Furthermore, the file encryption unit 200 shown in FIG. 2 also extracts keywords included in a file. The file content part 211 separated by the file name/data separation unit 201 is also input to a keyword extraction unit 205, which extracts keywords from it. The keyword extraction algorithm may be, for example, word extraction based on parsing or n-gram extraction. A keyword extracted by the keyword extraction unit 205 is input to a file name/keyword combination unit 207, combined with the file name part 212 separated by the file name/data separation unit 201, and stored in the keyword disk 230.
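The two extraction approaches named above can be illustrated with a short sketch. The function names are hypothetical, and the regular-expression tokenizer is a simplification that stands in for word extraction based on parsing. Each function returns the keyword together with the 1-based line number on which it occurs.

```python
import re
from typing import List, Tuple

def extract_words(text: str) -> List[Tuple[str, int]]:
    """Word extraction (simplified): split each line into word tokens."""
    pairs = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for word in re.findall(r"\w+", line):
            pairs.append((word, line_no))
    return pairs

def extract_ngrams(text: str, n: int = 2) -> List[Tuple[str, int]]:
    """Character n-gram extraction: every run of n characters within a line."""
    pairs = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for i in range(len(line) - n + 1):
            pairs.append((line[i:i + n], line_no))
    return pairs

words = extract_words("Head office\nTokyo")
bigrams = extract_ngrams("ab\ncde", n=2)
```

Word extraction suits languages with explicit word boundaries, whereas n-gram extraction needs no tokenizer and is often used where word boundaries are not marked.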
<Configuration of Keyword Data>
The configuration of the keyword data generated using the file encryption function in the server will be described below. FIG. 3 shows an example of keyword data stored in the keyword disk 230. Keyword data 300 stored in the keyword disk 230 have, e.g., a table format in which each extracted keyword (301) is stored together with data (302) indicating the name of the file (information included in the header field) and the line that include the keyword. For example, as can be seen from FIG. 3, the keyword “Tokyo” is included in the 26th line of file1.txt (see 302). Also, as can be seen from FIG. 3, the keyword “Sapporo” is included in two files, i.e., in the fourth line of file3.doc (see 302) and in the 408th line of ccc.txt (see 303).
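One possible in-memory representation of such a table is sketched below. The mapping from keyword to a list of (file name, line) pairs and the helper name are assumptions made for illustration; the entries are the ones given in the FIG. 3 example above.

```python
from collections import defaultdict

# keyword (301) -> list of (file name, line number) entries (302, 303)
keyword_data = defaultdict(list)

def add_keyword(keyword: str, file_name: str, line: int) -> None:
    """Record that `keyword` occurs on `line` of `file_name`."""
    keyword_data[keyword].append((file_name, line))

# Entries taken from the FIG. 3 example in the text.
add_keyword("Tokyo", "file1.txt", 26)
add_keyword("Sapporo", "file3.doc", 4)
add_keyword("Sapporo", "ccc.txt", 408)
```

Note that both the keywords and the file names are held as readable text in this structure.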
<Functional Block Arrangement of File Retrieval Function>
The file retrieval function of retrieving a file encrypted by the file encryption function in the server will be described below. According to the file retrieval function, keyword retrieval processing of a file encrypted by the file encryption unit 200 in FIG. 2 can be implemented using keyword data 300 stored in the keyword disk 230. FIG. 4 shows an example of the arrangement for this purpose. In FIG. 4, since the file encryption unit 200, volume disk 220, and keyword disk 230 have already been explained using FIG. 2, a description of them will be omitted. A file retrieval unit 401 searches the keyword disk 230 using a keyword 410 as an input and returns a result 411 to a client (not shown).
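The lookup performed by the file retrieval unit 401 can be sketched as a simple table search. The function name and the dictionary standing in for the keyword disk 230 are hypothetical; the sample entries are those of the FIG. 3 example.

```python
from typing import Dict, List, Tuple

def retrieve(keyword_disk: Dict[str, List[Tuple[str, int]]],
             keyword: str) -> List[Tuple[str, int]]:
    """Return the (file name, line) entries stored for `keyword` (input 410).

    An empty list is returned when the keyword is not stored.
    """
    return list(keyword_disk.get(keyword, []))

# Hypothetical stand-in for keyword disk 230.
keyword_disk = {
    "Tokyo": [("file1.txt", 26)],
    "Sapporo": [("file3.doc", 4), ("ccc.txt", 408)],
}

result = retrieve(keyword_disk, "Sapporo")  # corresponds to result 411
missing = retrieve(keyword_disk, "Osaka")
```

The query is compared directly with the stored keywords, so a lookup of this kind works only if the keywords are stored in a form that the retrieval unit can match against the input.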
With the above arrangement, the server can automatically encrypt the contents of a file upon storing the file at a predetermined location.
According to the file encryption function of the server, keywords stored in the keyword disk 230 remain in plaintext. This is because the keyword retrieval processing of the file retrieval function cannot be performed if the keywords are encrypted.
However, if keywords stored in the keyword disk 230 remain in plaintext, the administrator of the keyword disk 230 (i.e., the server administrator) may be able to infer the contents of files stored in the volume disk 220.
Alternatively, the keywords could be encrypted using, for example, a prepared “key”, and keyword retrieval processing could be executed after the keywords are decrypted using that key. However, such a method poses various key management problems, such as key synchronization with the file retrieval unit 401 (the same encryption key must be used), how to hide the key, and how to update the key. For this reason, it is desirable to encrypt a file by a simple method and to execute retrieval processing on the encrypted file by a simple method.