1. Field of the Invention
The present invention relates to a data processing apparatus, a data processing method, and a computer program product for implementing the same.
2. Description of the Related Art
In recent years, with the growing popularity of color scanners and digital cameras, opportunities to scan color documents and process color image data have increased. Such processing includes storing, printing, and reusing color image data. Sometimes the color image data is also transmitted to another device via a network.
Image data is used in various ways depending on need, and image data itself comes in various types. These facts make it difficult to build a system that can handle image data optimally.
To address this problem, data processing apparatuses that deal with the “diversity” of input devices and applications (user tasks) are disclosed in, for example, Japanese Patent Application Laid-open No. 2006-053690 and Japanese Patent Application Laid-open No. 2006-074331.
The “objectives” of users are another important factor to be considered in terms of “diversity.” That is, even when similar document images are processed, different users' objectives require different processing or parameters. For example, in a document image tone correcting technology, whether skin colors are to be changed to white (skin color correction), or only contaminations and offset stains are to be removed while the original colors are preserved (skin color cleaning), depends on the user's objectives.
In conventional data processing systems, when a large number of images are to be processed according to individual users' objectives in such a manner, a user needs to specify an algorithm or a processing parameter for each sheet of paper one by one. This places a heavy burden on the user and reduces work efficiency.
To construct a system that can accommodate such “diversity,” a mechanism is required that performs the operations described below on site, that is, on a device in operation.
(1) Storing history and event(s), each of which records a set consisting of a multi-dimensional feature vector representing data content and the algorithm or processing parameter applied by the user.
(2) Using the stored history and event(s) to learn a function that predicts, based on each feature vector, how appropriate each algorithm or processing parameter is.
(3) Predicting what should be done (the appropriate algorithm or processing parameter to be applied) for unknown data based on the feature vector of the data.
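Operations (1) to (3) can be illustrated with a minimal sketch. The sketch assumes a simple k-nearest-neighbor vote over the stored history; this is only one possible learning scheme and is not taken from the cited applications. The class name HistoryRecommender, its methods, and the parameter k are hypothetical names introduced for illustration.

```python
import math
from collections import Counter


class HistoryRecommender:
    """Hypothetical sketch: recommend a process from stored history."""

    def __init__(self, k=3):
        self.k = k
        # Operation (1): each entry is (feature vector, applied choice).
        self.history = []

    def record(self, features, choice):
        """Store the feature vector and the algorithm/parameter the user applied."""
        self.history.append((tuple(features), choice))

    def score(self, features):
        """Operation (2): estimate how appropriate each choice is.

        Assumption: appropriateness is the share of the k nearest
        history entries (Euclidean distance) that used the choice.
        """
        if not self.history:
            return {}
        nearest = sorted(
            self.history, key=lambda entry: math.dist(entry[0], features)
        )[: self.k]
        counts = Counter(choice for _, choice in nearest)
        return {choice: n / len(nearest) for choice, n in counts.items()}

    def recommend(self, features):
        """Operation (3): predict the choice for unknown data, or None."""
        scores = self.score(features)
        if not scores:
            return None
        return max(scores, key=scores.get)
```

In such a sketch, every choice the user actually applies, whether an accepted recommendation or a manual override, would be passed to record(), so that later recommendations for similar feature vectors reflect it.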
Therefore, constructing a system that can accommodate “diversity” requires a function that recommends appropriate processing or function(s) (an algorithm or a parameter) based on the history information and the event(s) (a list of sets, each including the feature vector representing data content, the applied process, and the used parameter). For data similar to previously processed data, such a recommending function recommends to the user the same processing as that applied to the previously processed similar data. The user needs to specify an alternative algorithm or parameter only when the user does not accept the recommended one. In this manner, a desirable system learns over time to select initially (by default) an algorithm or a parameter that meets the user's requirements.
To this end, the present applicant has proposed a method for realizing such a mechanism in Japanese Patent Application No. 2007-18300 and Japanese Patent Application No. 2007-242682.
However, even if a mechanism for providing recommendations to a user can be realized, when the predictor has learned insufficiently and its prediction accuracy is therefore low, the user is required to modify the recommendations one by one. Because of these modifications, work efficiency may be worse than in a system without such recommendation.
Therefore, prediction for recommendation should be performed only when the prediction accuracy is high enough for the recommendation to improve work efficiency.
Note that, because work efficiency depends not only on the accuracy of the predictor but also on the working speed specific to each user, it is not sufficient simply to threshold the prediction accuracy calculated based on the history.
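Why a fixed accuracy threshold is insufficient can be illustrated with a simple timing model. The model below is an assumption made for illustration, not a formula given in this description: a user spends t_manual per sheet without recommendation, and with recommendation spends t_confirm to check each recommendation plus t_modify whenever it is wrong. All function and parameter names are hypothetical.

```python
def expected_time_with_recommendation(accuracy, t_confirm, t_modify):
    """Expected time per sheet when a recommendation is shown.

    Assumed model: the user always spends t_confirm checking the
    recommendation and, with probability (1 - accuracy), spends an
    additional t_modify correcting it.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return t_confirm + (1.0 - accuracy) * t_modify


def should_recommend(accuracy, t_manual, t_confirm, t_modify):
    """Recommend only if the expected time beats manual specification."""
    expected = expected_time_with_recommendation(accuracy, t_confirm, t_modify)
    return expected < t_manual
```

Under this assumed model, the same prediction accuracy can justify recommendation for a user who is slow at manual specification and not for a user who is fast at it, so the decision depends on user-specific timings as well as on accuracy.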