1. Field of the Invention
The present invention relates to an information processing technique for extracting a similar operation pattern as a work flow from data operation history.
2. Description of the Related Art
A keyword matching method and the like are conventionally used as methods with which a user searches for a desired item. However, these conventional methods place a large burden on the user. As an alternative, recommendation methods have been proposed that automatically search for necessary items and present them to the user, saving the user the trouble of searching.
One well-known recommendation method is collaborative filtering (sometimes translated as cooperative filtering), which is widely used on e-commerce (EC) sites and the like. This method extracts, from past use history, users whose tendencies of item use are similar to those of a target user, and predicts recommendable items using the use history of those similar users.
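The idea of user-based collaborative filtering can be illustrated with a minimal sketch. It assumes a binary use history (a set of used items per user) and cosine similarity between those sets; the names `cosine`, `recommend`, `history`, and `k` are hypothetical and chosen only for illustration, and actual EC sites may use quite different similarity measures and data.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sets of used items (binary use history)."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend(history, target, k=2):
    """Rank items the target user has not used, based on similar users.

    history: dict mapping a user name to the set of items that user has used
             (an assumed, simplified form of use history).
    """
    mine = history[target]
    # Find the k users whose use history is most similar to the target's.
    neighbors = sorted(
        ((cosine(mine, items), user)
         for user, items in history.items() if user != target),
        reverse=True,
    )[:k]
    scores = {}
    for sim, user in neighbors:
        if sim <= 0:
            continue  # ignore users with no items in common
        for item in history[user] - mine:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))


history = {
    "u1": {"a", "b"},
    "u2": {"a", "b", "c"},
    "u3": {"d"},
}
print(recommend(history, "u1"))
```

In this example, user `u2` shares items with `u1` and therefore contributes item `c`, whereas `u3` has nothing in common with `u1` and is ignored.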
However, the items required for business use are not limited to information that serves as source material for preparing other information, such as internal documents and web documents. Information serving as some kind of know-how, such as processes for achieving certain jobs and methods for carrying out jobs effectively, is also searched for. Searching for such information poses no problem if it has been empirically organized and clearly documented as a work flow, but if it has not been clearly documented, the search takes time and effort.
For this reason, various techniques are proposed for extracting a frequently appearing partial data string as a work flow from chronological use history and using it for recommendation.
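As a rough illustration of extracting a frequently appearing partial data string, the following sketch counts contiguous runs of operations of a fixed length across several chronological histories and keeps those that occur in at least a minimum number of histories. The function name `frequent_subsequences`, the parameters `length` and `min_support`, and the sample operation names are assumptions made for this example; the techniques referred to above may use more elaborate sequential pattern mining.

```python
from collections import Counter


def frequent_subsequences(sessions, length=2, min_support=2):
    """Return contiguous operation runs that appear in enough histories.

    sessions: list of chronological operation lists (assumed input format).
    length: number of consecutive operations in a candidate pattern.
    min_support: minimum number of histories that must contain the pattern.
    """
    support = Counter()
    for ops in sessions:
        # A set ensures each history counts a given pattern at most once.
        seen = {tuple(ops[i:i + length])
                for i in range(len(ops) - length + 1)}
        support.update(seen)
    return {pattern: n for pattern, n in support.items() if n >= min_support}


sessions = [
    ["open", "edit", "print"],
    ["open", "edit", "mail"],
    ["search", "open", "edit"],
]
print(frequent_subsequences(sessions))
```

Here the pair `("open", "edit")` appears in all three histories and would be treated as a candidate work flow, while pairs seen only once are discarded.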
Meanwhile, techniques have been proposed for providing a label, a keyword, a name, or the like to a set of documents or the like classified by clustering or a similar method, such that what the set represents can be easily understood. The labels or the like provided to each cluster are used to narrow a search range and to classify search results so that they can be readily checked.
Conventionally, feature words are extracted using TF*IDF or the like from words that constitute each document included in a cluster, and one or more labels or keywords to represent the cluster are determined, based on their importance. For example, in Japanese Patent Laid-Open No. 2005-63298, a score of a cluster label is calculated using the importance of words in documents included in a cluster and an inclusive relationship between the words, and one or more representative labels or keywords are determined. In Japanese Patent Laid-Open No. 2008-84203, a label is determined using knowledge data regarding the importance of words in documents included in a cluster and a parallel relationship between the words.
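The basic TF*IDF-based labeling described above can be sketched as follows. This is a simplified illustration only: it scores each word by its frequency within a cluster multiplied by the logarithm of the inverse document frequency over all documents, and it does not model the inclusive or parallel relationships between words used in the cited publications. The function name `cluster_labels`, the parameter `top_n`, and the sample data are hypothetical.

```python
import math
from collections import Counter


def cluster_labels(clusters, top_n=1):
    """Pick the top_n highest-scoring words of each cluster as its labels.

    clusters: dict mapping a cluster id to a list of documents, where each
              document is a list of words (assumed, pre-tokenized input).
    """
    all_docs = [doc for docs in clusters.values() for doc in docs]
    n_docs = len(all_docs)
    # Document frequency: number of documents that contain each word.
    df = Counter(word for doc in all_docs for word in set(doc))
    labels = {}
    for cid, docs in clusters.items():
        # Term frequency counted over every document in the cluster.
        tf = Counter(word for doc in docs for word in doc)
        score = {w: tf[w] * math.log(n_docs / df[w]) for w in tf}
        # Highest score first; ties broken alphabetically.
        ranked = sorted(score, key=lambda w: (-score[w], w))
        labels[cid] = ranked[:top_n]
    return labels


clusters = {
    "c1": [["invoice", "report", "tax"], ["invoice"]],
    "c2": [["travel", "report"], ["travel", "hotel"], ["travel"]],
}
print(cluster_labels(clusters))
```

A word such as `report`, which appears in both clusters, receives a lower score than words concentrated in a single cluster, which is the property that makes TF*IDF useful for choosing representative labels.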
Conventionally, however, operations have been recommended based solely on the extracted work flow, and neither the meaning of an operation nor the purpose of recommending it has been recognized. It has therefore been difficult for a user to select an operation from among a plurality of recommended operations.