The following relates to the online retail arts, online service provider arts, recommender system arts, collaborative filtering arts, and related arts.
Efficient computation of tensor trace norms finds application in numerous problems, such as recommender systems and other inference problems operating on sparse tensors. The number of dimensions of the tensor (also called the “degree” or “order” of the tensor) is denoted by K herein. A matrix is a tensor of order 2, and similarly a vector is an order-1 tensor. A tensor may also be of order K>2, and there is no upper limit on the possible order of a tensor.
By way of illustration, recommender systems find application in solving numerous problems having a context made up of elements of different types. For example, consider an electronic social network context having the following element types: users (individuals or, in some social networks, entity users); items (e.g., uploaded images, video, et cetera); and tags (e.g., comments or keywords associated with users and/or items). Within this social network context, various recommendation problems can arise. For example: it may be desired to recommend tags (e.g., keywords) for labeling an item; or, it may be desired to retrieve items of interest to a particular user; or so forth. These recommendation problems can be formulated mathematically using a tensor of order K=3 with one dimension listing all users, one dimension listing all items, and one dimension listing all tags. This tensor is sparse, because most possible element-element associations (e.g., user-user links, user-item links, item-tag associations, et cetera) do not actually exist. For example, most items have no tag, a given user is not linked to most items, and so forth.
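As a minimal sketch of the order-3 formulation described above, a sparse tensor can be stored as a map from (user, item, tag) index triples to observed values, so that only associations that actually exist occupy memory. The dimension sizes, the index values, and the helper name density are hypothetical and chosen purely for illustration.

```python
# Hypothetical sizes of the three dimensions: users, items, tags.
shape = (1000, 5000, 200)

# Sparse order-3 tensor: only observed (user, item, tag) entries are stored.
# The indices and values below are made up for illustration.
observed = {
    (0, 17, 3): 1.0,   # user 0 labeled item 17 with tag 3
    (0, 42, 3): 1.0,   # user 0 labeled item 42 with tag 3
    (7, 17, 11): 1.0,  # user 7 labeled item 17 with tag 11
}


def density(entries, shape):
    """Return the fraction of tensor elements that are actually observed."""
    total = 1
    for size in shape:
        total *= size
    return len(entries) / total


order = len(shape)  # K, the order of the tensor
print(order, density(observed, shape))
```

With these illustrative sizes the tensor has one billion possible elements, of which only three are stored, which is the sense in which such a tensor is sparse.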
As another illustrative example, an automated call center has a context including entities such as the customer, the service person, and the time of the call. In this context, an illustrative recommendation problem is to select a service person to handle a call from a given customer at a given time. The problem can be formulated using a sparse tensor of order K=3 where one dimension is the service persons, one dimension is the customers, and one dimension is time (optionally discretized with a chosen uniform or non-uniform granularity). The tensor is sparse because few of the possible (service person-customer-time) tensor elements correspond to actual call data.
As a further illustrative example, certain chemical optimization problems have the context of a set of constituent components that can be combined in various combinations to produce a chemical of interest. To illustrate, in the development of new ink formulations, various constituent chemicals can be variously combined. (The problem can be further expanded to encompass different types of paper or other media for which the ink may be useful for marking). In such a problem, testing is performed on different ink formulations. However, with even a few possible constituent chemicals it becomes prohibitive to exhaustively test all possible ink formulations. Accordingly, it would be useful to provide a predictive algorithm to estimate the efficiency of new ink formulations for testing by identifying the most promising chemical combinations. Again, the problem can be represented as a sparse tensor, here of order K equal to the number of constituent components under consideration for inclusion in the new ink formulation. (If paper type is another development parameter, then K is suitably the number of considered constituent components plus an additional dimension for the paper type). The tensor is sparse because only a few possible formulations have actually been tested.
The foregoing recommender system examples can be generalized to an inference engine that operates in a multidimensional space of dimensionality K>2 for which only sparse sampling is available, and for which it is desired to infer values for points in the space that have not (yet) been sampled. Such inference problems go by various names, such as recommendation problems, collaborative filtering problems, data imputation, multitask structured learning problems, multi-dimensional regularization, and so forth. Other examples of applications that can usefully employ such inference include personality type profiling based on multi-criterion questionnaires, modeling non-Gaussian interactions by modeling correlation of high orders, computer vision problems solved using tensor decomposition formulations, and so forth.
Recommendation or inference problems operating in a space of dimensionality K>2 can be constructed as a likelihood estimation that minimizes a loss function between the sparse observation tensor (denoted herein as tensor Y) containing the available data (e.g., the actual user-user links, or logged call center data, or tested ink formulations) and a prediction tensor of the same order and size (denoted herein as prediction tensor X). This minimization can be written as min l(X; Y) where the loss function l(X; Y) is preferably strictly convex, so that the minimum is unique and can be computed efficiently. In practice, however, it is found that the likelihood estimation can be adversely affected by sparseness of the observation tensor Y and/or noise in the observed elements of the observation tensor Y.
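The loss l(X; Y) can be illustrated with a minimal sketch. The text does not fix a particular loss, so the squared error summed over the observed elements of Y is assumed here; the function name squared_loss and the sample tensors are likewise hypothetical.

```python
def squared_loss(X, Y):
    """Assumed form of l(X; Y): squared error over observed entries of Y.

    X maps index tuples to predicted values and Y maps index tuples to
    observed values. Elements absent from Y contribute nothing.
    """
    return sum((X.get(idx, 0.0) - y) ** 2 for idx, y in Y.items())


# Made-up observation tensor Y (two observed elements) and prediction X.
Y = {(0, 1, 2): 1.0, (3, 0, 1): 0.0}
X = {(0, 1, 2): 0.5, (3, 0, 1): 0.0, (2, 2, 2): 0.9}

loss = squared_loss(X, Y)
print(loss)  # 0.25: only the element (0, 1, 2) is mispredicted
```

In this sketch the prediction at (2, 2, 2) has no effect on the loss because that element is unobserved, so the loss alone does not determine it.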
It is known to instead perform a regularized likelihood estimation of the form min{l(X; Y)+λ∥X∥} where λ∥X∥ is a regularization or penalty term, ∥X∥ is a tensor norm, and λ is a tuning parameter selected (e.g., by cross-validation) to prevent overfitting. The tensor norm ∥X∥ should again preferably be convex. However, existing formulations of the tensor norm are problematic, as they typically are computationally arduous and/or do not yield convex optimization problems.
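The regularized objective l(X; Y)+λ∥X∥ can be sketched as follows. Two assumptions are made for illustration only: the loss is taken to be the squared error over observed elements, and the Frobenius norm is used as a simple convex stand-in for ∥X∥. It is not the tensor trace norm, and all function names and sample values are hypothetical.

```python
import math


def squared_loss(X, Y):
    """Assumed loss l(X; Y): squared error over the observed entries of Y."""
    return sum((X.get(idx, 0.0) - y) ** 2 for idx, y in Y.items())


def frobenius_norm(X):
    """Stand-in penalty norm: square root of the sum of squared elements."""
    return math.sqrt(sum(v * v for v in X.values()))


def regularized_objective(X, Y, lam, norm=frobenius_norm):
    """Evaluate l(X; Y) + lam * norm(X) for a given tuning parameter lam."""
    return squared_loss(X, Y) + lam * norm(X)


# Made-up tensors: one observed element in Y, two nonzero elements in X.
Y = {(0, 0, 0): 2.0}
X = {(0, 0, 0): 3.0, (1, 1, 1): 4.0}

value = regularized_objective(X, Y, lam=0.5)
print(value)  # 3.5 = loss 1.0 + 0.5 * norm 5.0
```

Unlike the unregularized loss, this objective depends on the unobserved element (1, 1, 1) through the penalty term. Passing the norm as a parameter keeps the sketch independent of any particular choice of tensor norm.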