1. Field of Invention
The present invention relates to a device, a program, and a method for retrieving desired document data from among a plurality of document data having different creation dates or update dates. More particularly, the invention relates to a data management device, a document data retrieval device, a data management program, a document data retrieval program, a data management method, and a document data retrieval method which are well suited to grasping featuring parts from among a huge amount of data, which readily enhance the reliability of extraction, and which can immediately comply with a user's request.
2. Description of Related Art
In an enterprise or the like, the progress situation of business is sometimes controlled by causing employees to submit daily business-records. Reports based on the daily business-records are often checked in such a way that one supervisor looks through the daily business-records submitted by a plurality of subordinates, one by one.
The supervisor, however, cannot always look through all the submitted daily business-records every day because of other duties. Moreover, even if all the daily business-records are looked through, the amount of information that can be grasped within a restricted time is inevitably limited. Accordingly, in a case where the quantity of the daily business-records to be checked becomes huge, it is very difficult to control the progress situation of the business efficiently.
In such a case, in order to control the progress situation of the business efficiently, the supervisor needs to obtain information efficiently from the voluminous daily business-records. The properties of the daily business-record will therefore be considered first. The daily business-record chiefly contains the daily business report of each employee, so that when daily business-records submitted by the identical employee on near creation dates are compared, many parts ought to repeat in contents. It is inefficient to look through the parts repeating in contents every day. Accordingly, the supervisor can obtain information comparatively efficiently by grasping the repeating contents only once and grasping only featuring parts (that is, parts having changed) in the subsequent daily business-records.
As one solution to this problem, it is possible to propose, for example, a structure which can accumulate the daily business-records as document data in a document database (hereinbelow, "database" shall be simply abbreviated to “DB”), and which can retrieve only the featuring parts from within the document DB.
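The idea of grasping only the parts that have changed between two daily business-records of near creation dates can be illustrated with a minimal sketch. It is not the structure proposed here; it assumes that each record is plain text with one item per line, and it uses the line-level comparison of Python's standard `difflib` module as a stand-in for whatever comparison an actual system would employ. The function name `changed_parts` and the sample records are hypothetical.

```python
import difflib


def changed_parts(previous_record: str, current_record: str) -> list[str]:
    """Return the lines of current_record that are new or altered
    relative to previous_record (assumption: one item per line)."""
    prev_lines = previous_record.splitlines()
    curr_lines = current_record.splitlines()
    matcher = difflib.SequenceMatcher(a=prev_lines, b=curr_lines, autojunk=False)
    changed = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "replace" and "insert" spans hold lines found only in the newer record.
        # "equal" spans are the repeating contents; "delete" spans contribute
        # no lines from the newer record.
        if tag in ("replace", "insert"):
            changed.extend(curr_lines[j1:j2])
    return changed


# Hypothetical records submitted by the same employee on consecutive days.
monday = "Visited client A\nPrepared quote\nTeam meeting"
tuesday = "Visited client A\nSent quote to client A\nTeam meeting"
print(changed_parts(monday, tuesday))
```

A line-level comparison of this kind ignores reordering and small rewording within a line, which is one reason a practical system would need a more robust notion of a featuring part.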
Heretofore, as a technique for retrieving desired document data from among a plurality of document data, there has been, for example, a retrieval method utilizing the temporal change of a word specification pattern as disclosed in Japanese Laid-open Patent Publication JP-A-7-325832. Besides, as related techniques, there have been, for example, an inference device disclosed in Japanese Laid-open Patent Publication JP-A-6-324871, and an example-based retrieval system creation support device disclosed in Japanese Laid-open Patent Publication JP-A-5-53814.
In the first example, a feature data extraction unit extracts feature data, which express the temporal changes of word usage patterns, from text information beforehand. When a user gives a retrieval input, an input processing unit translates the user's retrieval input into a representation form which can be interpreted by a retrieval processing unit, and it sends the translated input to the retrieval processing unit. The retrieval processing unit performs retrieval by utilizing the text information and the feature data, and the result of the retrieval is sent to an output processing unit and is displayed to the user. Various statistics, for example, the probability of occurrence of each word in the text information, can be employed as the feature data.
Thus, the utilization of the feature data extracted from the time series text information permits the retrieval of a word, information, or the like having become a topic in a specified field or term, and a high-quality trend analysis can be easily made.
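The kind of feature data mentioned for the first example, namely the probability of occurrence of a word traced over time, can be sketched as follows. This is only an illustration under stated assumptions and not the method of the cited publication: words are obtained by naive whitespace splitting, each text is keyed by a date string that sorts chronologically, and the function names `word_probabilities` and `probability_series` are hypothetical.

```python
from collections import Counter


def word_probabilities(text: str) -> dict[str, float]:
    """Relative frequency of each word in one text (naive whitespace split)."""
    words = text.lower().split()
    total = len(words)
    if total == 0:
        return {}
    return {word: count / total for word, count in Counter(words).items()}


def probability_series(dated_texts: dict[str, str], word: str) -> list[tuple[str, float]]:
    """Occurrence probability of `word` per date, in chronological order.

    Assumption: the date keys are ISO-style strings, so sorting them as
    strings also sorts them by time.
    """
    return [
        (date, word_probabilities(text).get(word, 0.0))
        for date, text in sorted(dated_texts.items())
    ]


# Hypothetical time series text information.
texts = {
    "2024-01-02": "delay delay shipment ok",
    "2024-01-01": "shipment ok ok ok",
}
print(probability_series(texts, "delay"))
```

A rise in the series for a given word is the sort of temporal change that would mark the word as having become a topic in a particular term.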
In the second example, a logical vector transformation unit transforms rules stored in a rule storage unit, examples stored in an example storage unit, and inference conditions inputted from an inference condition input unit, into a rule vector, an example vector, and a condition vector, respectively, each of which is a logical vector. An indefinite element addition unit adds indefinite elements to the rule vector and the example vector, thereby turning them into an indefinite rule vector and an indefinite example vector, respectively. Besides, a result vector calculation unit calculates the logical product of the indefinite rule vector, the indefinite example vector, and the condition vector, and outputs it as a result vector. A logical proposition transformation unit transforms the result vector into an indefinite logical proposition. An indefinite element removal unit removes the indefinite elements from the indefinite logical proposition, thereby producing a definite logical proposition. A logical proposition output unit outputs the definite logical proposition.
Thus, inference can be performed with excellent efficiency and with a low burden of knowledge acquisition.
The third example can retrieve similarities by dividing an example into a plurality of parts. A vector division unit and a sub-vector similarity computation unit are associated. Addition operations attendant upon subvectorized representations are possible. Besides, an alteration monitor function and an alteration comparison function are realized so as to be used when the performance of a system is gradually enhanced.
Thus, the indispensable functions of a creation environment required for constructing the example-based inference system can be offered.
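The division of an example into parts and the computation of a similarity for each part, as described for the third example, can be pictured with the following sketch. The cited publication's actual similarity measure is not stated here, so cosine similarity is used purely as an assumption, as are the fixed part size and the function names `split_vector` and `sub_vector_similarities`.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def split_vector(vector: list[float], part_size: int) -> list[list[float]]:
    """Divide a vector into consecutive sub-vectors of at most part_size elements."""
    return [vector[i:i + part_size] for i in range(0, len(vector), part_size)]


def sub_vector_similarities(example: list[float], query: list[float],
                            part_size: int) -> list[float]:
    """Similarity of each pair of corresponding sub-vectors (assumed measure: cosine)."""
    if len(example) != len(query):
        raise ValueError("example and query must have the same length")
    return [
        cosine(p, q)
        for p, q in zip(split_vector(example, part_size), split_vector(query, part_size))
    ]


# Hypothetical example and query vectors, divided into two parts of two elements each.
print(sub_vector_similarities([1, 0, 2, 2], [1, 0, 0, 3], part_size=2))
```

Keeping one similarity per part, rather than a single overall score, makes it possible to see which part of an example matches a query and which does not.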