1. Field of the Invention
The present invention relates to a technology for generating a search index in a system for searching information (files, e-mail, etc.) using a computer.
2. Description of the Related Art
The following two techniques have conventionally been used to generate a search index (hereinafter referred to simply as an “index”) in an information search system.
a. Generating an index for each piece of information
This technique generates an index by extracting keywords and attributes (hereinafter referred to as “meta-data”) from each piece of information to be searched. During a search, each piece of information is compared with search feature information (hereinafter also referred to as a “query”), and the information satisfying the query is returned. Many information search systems, such as Google (registered trademark) and MSN Search (MSN is a registered trademark), generate an index by this method. For example, Patent Document 1 (Japanese Published Patent Application No. H11-39293) discloses a technique of automatically extracting, from the contents of a user's tasks, a document processed in the current task, and recording the task name, the person in charge of the task, and the document name, so that the document can be searched using the recorded meta-data.
b. Generating an index of an information group
This technique classifies plural pieces of information into information groups using predetermined reference numbers and generates an index for each information group, as disclosed in, for example, Patent Document 2 (Japanese Published Patent Application No. H11-143912). The index is generated by extracting keywords, document titles, etc. from each information group. During a search, each information group is compared with a query, and the information groups satisfying the query are returned. Information that does not itself match the query but is included in a matching information group can therefore also be found.
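The difference between techniques a and b can be illustrated with a minimal sketch. It is not taken from either patent document: the function names (`tokenize`, `build_index`, `search`), the whitespace-based keyword extraction, and the sample items are all assumptions made for illustration only.

```python
from collections import defaultdict


def tokenize(text):
    """Very rough keyword extraction: lowercase and split on whitespace."""
    return text.lower().split()


def build_index(units):
    """Build keyword -> set of unit ids, where units is {unit_id: [text, ...]}."""
    index = defaultdict(set)
    for unit_id, texts in units.items():
        for text in texts:
            for word in tokenize(text):
                index[word].add(unit_id)
    return index


def search(index, query):
    """Return the ids of the units that contain every term of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    hits = set(index.get(terms[0], set()))
    for term in terms[1:]:
        hits &= index.get(term, set())
    return hits


# Hypothetical sample data: item id -> (reference number, text).
items = {
    "spec.doc": (1, "index design"),
    "minutes.txt": (1, "meeting minutes"),
    "invoice.xls": (2, "travel invoice"),
}

# Technique a: each piece of information is its own indexed unit.
item_index = build_index({item_id: [text] for item_id, (_, text) in items.items()})

# Technique b: items sharing a reference number form one indexed unit.
groups = defaultdict(dict)
for item_id, (ref, text) in items.items():
    groups[ref][item_id] = text
group_index = build_index(
    {ref: list(members.values()) for ref, members in groups.items()}
)

item_hits = search(item_index, "design")    # ids of matching items
group_hits = search(group_index, "design")  # reference numbers of matching groups
# Every member of a matching group is reachable, including items
# that do not contain the query term themselves.
reachable = sorted(m for ref in group_hits for m in groups[ref])
```

In this sketch the query "design" matches only one item under technique a, whereas under technique b it returns the whole group, so the second member of that group is reachable even though its text does not contain the term.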
FIG. 1 shows the outline of a conventional apparatus to which technique b above is applied. As shown in FIG. 1, the conventional apparatus includes a computer (PC) 101 which is used by a user to perform a task and is provided with an information group detection unit 102, an information group database (hereinafter also referred to as an “information group DB”) 103, an index generation unit 104, and an index record unit 105. The information that can be manipulated by the user is recorded in an information record unit 106 provided inside or outside the PC 101. The information group detection unit 102 classifies this information into information groups based on the reference number predetermined for each piece of information, and records the data relating to the classified information groups in the information group DB 103. The index generation unit 104 generates an index for each information group, based on the data relating to the information group recorded in the information group DB 103, by extracting keywords and document titles from the information group. The generated index is recorded in the index record unit 105 and is used in searching for an information group.
In technique a above, although a user processes plural pieces of information in a task and may wish to search for those pieces collectively, the system prepares no index for an information group, so an information group cannot be searched. In Patent Document 1, only meta-data such as a task name, the name of a person in charge, and a document name is recorded and compared, and the contents of documents cannot be processed. Moreover, Patent Document 1 uses only the sequence of manipulation histories in extracting a task and makes no determination based on the contents, so a task may not be extracted with sufficient accuracy. For example, when a user happens to start a task while performing another task, the information currently being processed may be recorded as information processed in the other task.
In technique b above, a reference number must be set in advance for each piece of information in order to generate an information group. Since information without a reference number is not included in any information group, it cannot be searched. In addition, a reference number is fixed and is not dynamically changed. Therefore, when the way a user uses information (that is, the method of classifying information into information groups) or the viewpoint of the user changes, the reference numbers must be reassigned and the index regenerated. For example, when a user performs a routine task of processing plural pieces of information, the relationship between the pieces of information depends on that routine task. However, since the information groups are fixed in technique b above, an information group corresponding to the task may not exist even though the user intends to search for information on the basis of the task at a specific point in time. Without an information group, no index for it exists, and no information relating to the task can be searched.
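The two drawbacks of technique b described above can be reproduced in a small sketch. It is only an illustration under assumed names and data: `build_group_index`, the whitespace-based keyword extraction, and the sample items are hypothetical and do not come from Patent Document 2.

```python
from collections import defaultdict


def build_group_index(items):
    """Build keyword -> set of reference numbers.

    items is {item_id: (reference_number or None, text)}.
    """
    index = defaultdict(set)
    for item_id, (ref, text) in items.items():
        if ref is None:
            # No reference number: the item belongs to no information
            # group, so none of its keywords are indexed.
            continue
        for word in text.lower().split():
            index[word].add(ref)
    return index


# Hypothetical sample data; "memo.txt" has no reference number.
items = {
    "spec.doc": (1, "index design"),
    "memo.txt": (None, "design memo"),
}

index = build_group_index(items)
# "memo" occurs only in the unnumbered item, so no group is found.
missing = index.get("memo", set())

# Changing the grouping means assigning a reference number by hand
# and regenerating the whole index; the old index is not updated.
items["memo.txt"] = (1, "design memo")
rebuilt = build_group_index(items)
found = rebuilt.get("memo", set())
stale = index.get("memo", set())
```

Here the first lookup finds nothing because the unnumbered item was never indexed, and the item becomes searchable only after a reference number is assigned and the index is rebuilt.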