1. Field of the Invention
The present invention generally relates to a method, system, program, and device for efficiently generating a list of files whose search indices are to be updated, by efficiently analyzing the layers in a large amount of file data stored in a file server. More particularly, the present invention relates to a method, system, program, and device for efficiently creating a list of added, changed, or deleted file data by comparing the file trees of file groups located in two existing directories created in accordance with a common naming rule.
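For illustration only, the general idea of comparing the file trees of two directories to obtain a list of added, changed, or deleted file data may be sketched as follows. This is a naive sketch and not the claimed method: the function names `snapshot` and `diff_trees` are hypothetical, and the assumption that a file counts as changed when its size or modification time differs is made here purely for the example.

```python
import os


def snapshot(root):
    """Map each file path (relative to root) to (size, mtime in ns)."""
    entries = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            entries[os.path.relpath(full, root)] = (st.st_size, st.st_mtime_ns)
    return entries


def diff_trees(old_root, new_root):
    """Return (added, changed, deleted) relative paths between two trees."""
    old, new = snapshot(old_root), snapshot(new_root)
    added = sorted(new.keys() - old.keys())
    deleted = sorted(old.keys() - new.keys())
    # Assumption for this sketch: a differing size or mtime means "changed".
    changed = sorted(p for p in old.keys() & new.keys() if old[p] != new[p])
    return added, changed, deleted
```

Note that this sketch visits every file in both trees, so its cost grows with the total number of files rather than with the number of differences.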
2. Background Art
As computer performance has improved and the capacities of HDDs have grown in recent years, a huge number of unstructured documents are being created. Therefore, there is an increasing demand for search systems that are capable of accurately retrieving required documents from an enormous number of documents at high speed. To achieve an accurate search result, it is critical that the adding, changing, and deleting operations performed, after the search index creation, on the file data in a file server storing the unstructured documents to be searched be reflected in the search indices in a timely manner. Reflecting such operations in the search indices takes a long time if the search indices for unchanged file data are also updated. Therefore, normally only the search indices for the file data that have been added, changed, or deleted are updated. To do so, it is necessary to create a list of the file data that have been added, changed, or deleted.
To satisfy the demand for such search systems, some file servers include an interface that stores the histories of operations performed on file data and provides a list of added, changed, or deleted file data in response to an external request. Other file servers provide an interface for holding the state of the file data at a certain point in time as a “snapshot” in a separate directory, so that a past file tree can be accessed.
One such conventional technique is disclosed in JP Patent Publication (Kokai) No. 2006-268456A.
When a list of added, changed, or deleted file data is to be created, such an interface can be used if the file server provides one. However, in the case of a file server that does not include such an interface, all the file data in the search index creation target range existing in the file server need to be scanned to determine whether an updating operation is to be performed.
Even if the amount of added, changed, or deleted file data is small, all the file data need to be scanned, and therefore, the operation of creating the list of added, changed, or deleted file data prolongs the index updating operation.
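The cost of such a full scan may be illustrated by the following sketch. It is an assumption-laden example rather than any particular file server's behavior: the function `full_scan` and the `indexed` mapping (relative path to the modification time recorded when the index was created) are hypothetical names introduced here. The returned `visited` count shows that every file is examined regardless of how few have changed.

```python
import os


def full_scan(root, indexed):
    """Scan every file under root and compare it with the indexed state.

    indexed: dict mapping relative path -> mtime (ns) recorded at index time.
    Returns (added, changed, deleted, visited).
    """
    added, changed, seen = [], [], set()
    visited = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            visited += 1  # every file costs one stat call, changed or not
            seen.add(rel)
            mtime = os.stat(full).st_mtime_ns
            if rel not in indexed:
                added.append(rel)
            elif indexed[rel] != mtime:
                changed.append(rel)
    # Files that were indexed but no longer exist have been deleted.
    deleted = sorted(indexed.keys() - seen)
    return sorted(added), sorted(changed), deleted, visited
```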
To counter this problem, a technique has been proposed by which the file tree structure in the file server is divided into sub trees, and the scanning operations for those sub trees are performed in parallel, so as to realize a high-speed scan.
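A divided, parallel scan of this kind might look like the following minimal sketch. The division at the top-level subdirectories, the use of a thread pool, and the names `scan_subtree` and `parallel_scan` are all assumptions made for illustration; the proposed technique itself does not specify them.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def scan_subtree(root, subtree):
    """Map each file under subtree to its mtime (ns), keyed relative to root."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(subtree):
        for name in filenames:
            full = os.path.join(dirpath, name)
            result[os.path.relpath(full, root)] = os.stat(full).st_mtime_ns
    return result


def parallel_scan(root, max_workers=4):
    """Split the tree at its top-level directories and scan them in parallel."""
    merged, subtrees = {}, []
    with os.scandir(root) as it:
        for entry in it:
            if entry.is_dir():
                # Like os.walk by default, do not descend into symlinked dirs.
                if not entry.is_symlink():
                    subtrees.append(entry.path)
            else:
                merged[entry.name] = os.stat(entry.path).st_mtime_ns
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each worker scans one sub tree; partial results are merged here.
        for partial in pool.map(lambda s: scan_subtree(root, s), subtrees):
            merged.update(partial)
    return merged
```

Even when the sub trees are scanned in parallel, the total number of files visited is unchanged; only the elapsed time may be reduced, and the benefit depends on how evenly the files are distributed among the sub trees.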