The following relates generally to methods, apparatus and articles of manufacture therefor, for estimating parameters of a probability model that models user behavior of shared devices offering different classes of service for carrying out jobs. Once its parameters are estimated, the probability model is used in applications for detecting outliers, evaluating shared infrastructure needs, and initializing configuration settings.
Shared devices, such as multifunctional devices (e.g., devices with functions for printing, scanning, and/or faxing), commonly operate today as a network resource that is shared amongst a plurality of users in, for example, an office or a home environment. Such shared devices offer advantages over dedicated devices (e.g., a device to which access is limited to a single user) by potentially offering a wider range of service classes that may vary in operating cost (e.g., total cost of ownership, or TCO), quality, and performance, as well as redundant services in the event of failure.
System administrators managing shared devices commonly collect information about how an infrastructure of shared devices is used. Such information may be presented to system administrators through statistics that identify information such as the total number of functions performed (e.g., total number of pages printed), which may be filtered by individual devices or groups of devices (e.g., devices having the same range of functionality, operating cost, quality, performance, etc.) or geographical location (building, work unit, etc.). Further, such information may be used by system administrators to identify or anticipate problems, anticipate changing user needs, provide assistance to users, and provide initial configuration settings.
While many shared devices record job usage data (e.g., print job logs) that identify the user associated with each requested job, the use of such recorded usage data by system administrators managing the shared devices generally tends to be either device-centric (i.e., focused on aspects of the device) or user-centric (i.e., focused on aspects of the user). Such device-centric or user-centric views may fail to consider other aspects forming part of the recorded usage data of shared devices, such as possible correlations between the two. For example, such device-centric and user-centric views may not take into account the attributes of users sending jobs to devices and the class of jobs performed on the devices.
In accordance with the disclosure herein, recorded device usage data is analyzed using a probabilistic latent model. The model characterizes each job using two observed variables (i.e., users and devices) and two latent variables (i.e., job clusters and job service classes). To carry out such an analysis, device and user information should be correlated and users should not be strongly constrained in their use (e.g., any user is allowed to print anything on any device in a device infrastructure). In one embodiment, once the parameters of the model are estimated, communities of device usage may be discovered, and, from these, suppositions concerning actual behavior of the users may be formed, both in the case of normal infrastructure operations and in the case of exceptions (e.g., a device that is down or not operating properly). In another embodiment, community and user information may be used to evaluate the organization of the infrastructure and to provide a set of initial conditions for a given user.
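By way of illustration only, the observed part of the data described above can be sketched as user-device co-occurrence counts extracted from a job log. The record layout, the names `Job` and `cooccurrence_counts`, and the sample log entries are hypothetical and are not taken from the disclosure; the latent job cluster and job service class do not appear in the log and would be inferred later.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """One logged job; only the two observed variables are recorded."""
    user: str    # observed user variable
    device: str  # observed device variable


def cooccurrence_counts(jobs):
    """Count how many logged jobs each (user, device) pair accounts for."""
    return Counter((job.user, job.device) for job in jobs)


# Hypothetical job log used only for illustration.
log = [
    Job("alice", "color-1"),
    Job("alice", "color-1"),
    Job("alice", "mono-2"),
    Job("bob", "mono-2"),
]
counts = cooccurrence_counts(log)
```

Such a count table is one plausible sufficient input for fitting a latent model over users and devices, since the order of jobs in the log is not used.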
In accordance with the various embodiments disclosed herein, there is provided a method, apparatus and article of manufacture therefor, for estimating parameters of a probability model that models user behavior of shared devices offering different classes of service for carrying out jobs. The method comprises: recording job usage data of observed users and devices carrying out the jobs; determining a range of service classes associated with the shared devices; defining a probability model with an observed user variable, an observed device variable, a latent job cluster variable, and a latent job service class variable; selecting an initial number of job clusters; learning parameters of the probability model using the recorded job usage data, the determined range of service classes, and the selected initial number of job clusters; and applying the learned parameters of the probability model to evaluate one or more of: configuration of the shared devices, use of the shared devices, and job redirection between the shared devices.
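The learning step recited above may be sketched, under stated assumptions, as an expectation-maximization (EM) loop. This passage does not specify the factorization of the model, so the sketch assumes one possible form, p(u, d) = Σ_c Σ_k p(c) p(u|c) p(k|c) p(d|k), where c is the latent job cluster, k is the latent service class, and p(d|k) is zero for any device that does not offer class k. The function names `fit` and `normalize`, the smoothing constant, and the sample data are hypothetical.

```python
import random


def normalize(weights, eps=1e-9):
    """Turn non-negative weights into a lightly smoothed distribution."""
    total = sum(weights.values()) + eps * len(weights)
    return {key: (value + eps) / total for key, value in weights.items()}


def fit(counts, device_classes, n_clusters, n_iter=50, seed=0):
    """EM for the ASSUMED model p(u,d) = sum_c sum_k p(c) p(u|c) p(k|c) p(d|k)."""
    rng = random.Random(seed)
    users = sorted({u for u, _ in counts})
    devices = sorted(device_classes)
    classes = sorted({k for ks in device_classes.values() for k in ks})
    clusters = range(n_clusters)

    # Random starting point for the parameters tied to the latent cluster.
    p_c = normalize({c: rng.random() for c in clusters})
    p_u_c = {c: normalize({u: rng.random() for u in users}) for c in clusters}
    p_k_c = {c: normalize({k: rng.random() for k in classes}) for c in clusters}
    # p(d|k) is only defined for devices that actually offer class k.
    p_d_k = {k: normalize({d: 1.0 for d in devices if k in device_classes[d]})
             for k in classes}

    for _ in range(n_iter):
        acc_c = {c: 0.0 for c in clusters}
        acc_u = {c: {u: 0.0 for u in users} for c in clusters}
        acc_k = {c: {k: 0.0 for k in classes} for c in clusters}
        acc_d = {k: {d: 0.0 for d in p_d_k[k]} for k in classes}

        # E-step: posterior over (cluster, class) for each observed pair.
        for (u, d), n in counts.items():
            joint = {(c, k): p_c[c] * p_u_c[c][u] * p_k_c[c][k] * p_d_k[k][d]
                     for c in clusters for k in sorted(device_classes[d])}
            total = sum(joint.values())
            for (c, k), weight in joint.items():
                r = n * weight / total  # expected number of jobs
                acc_c[c] += r
                acc_u[c][u] += r
                acc_k[c][k] += r
                acc_d[k][d] += r

        # M-step: re-estimate every distribution from the expected counts.
        p_c = normalize(acc_c)
        p_u_c = {c: normalize(acc_u[c]) for c in clusters}
        p_k_c = {c: normalize(acc_k[c]) for c in clusters}
        p_d_k = {k: normalize(acc_d[k]) for k in classes}

    return p_c, p_u_c, p_k_c, p_d_k


# Hypothetical inputs: job counts per (user, device) and classes per device.
counts = {("alice", "color-1"): 8, ("alice", "mono-2"): 1, ("bob", "mono-2"): 6}
device_classes = {"color-1": {"color", "mono"}, "mono-2": {"mono"}}
p_c, p_u_c, p_k_c, p_d_k = fit(counts, device_classes, n_clusters=2)
```

In this sketch the determined range of service classes enters through `device_classes`, and the selected initial number of job clusters enters through `n_clusters`. The learned distributions could then be inspected, for example, to see which users dominate each cluster, though how the parameters are applied is left to the embodiments described elsewhere.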