A large and growing population of people enjoys entertainment or digital media through consumption of digital content items, such as music, movies, books, games, and other types of digital content. As more content is made available in digital form, the economic landscape for media creation, production, and distribution is evolving. Electronic distribution of information has gained in importance with the proliferation of personal computers, mobile devices, and mobile phones, and it has undergone a tremendous upsurge in popularity as the Internet has become widely available. With the widespread use of the Internet, it has become possible to quickly and inexpensively distribute large units of information using electronic technologies.
The rapid growth in the amount of digital media available gives users enormous potential to find content of interest. Unfortunately, because digital media is difficult to search, the sheer size of a digital media repository often makes the discovery of digital media a difficult task. One machine learning method, Adsorption, efficiently propagates information through a variety of graphs and has been applied to digital media to propagate preference information for personalized digital media recommendations.
Adsorption is a general method for transductive learning in which the machine learner is often given a small set of labeled examples and a very large set of unlabeled examples. The goal is to label the unlabeled examples and, possibly under the assumption of label noise, also to re-label the labeled examples. Like many other related methods, Adsorption assumes that the learning problem is given in graph form, where examples or instances are represented as nodes or vertices and edges encode the similarity between examples. Some of the nodes are associated with a pre-specified label, which is correct in the noise-free case or may be subject to label noise. Additional information can be given in the form of weights over the labels. Adsorption propagates label information from the labeled examples to the entire set of vertices via the edges. The labeling is represented using a non-negative score for each label, with a high score for some label indicating a high association of a vertex (or its corresponding instance) with that label. If the scores are additively normalized, they can be thought of as a conditional distribution over the labels given the node (or example) identity.
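The propagation idea described above can be illustrated with a minimal sketch. This is a simplified iterative label-propagation scheme written in the spirit of Adsorption, not the full Adsorption algorithm itself. The function names `propagate_labels` and `normalize`, the `inject` parameter (how strongly a labeled node retains its pre-specified label), and the fixed iteration count are all assumptions introduced here for illustration only.

```python
from collections import defaultdict


def normalize(scores):
    """Additively normalize non-negative label scores so they sum to 1."""
    total = sum(scores.values())
    if total == 0:
        return {}
    return {label: s / total for label, s in scores.items()}


def propagate_labels(edges, seeds, iterations=50, inject=0.5):
    """Simplified label propagation (illustrative, not full Adsorption).

    edges: {(u, v): weight} for an undirected similarity graph.
    seeds: {node: {label: score}} for the pre-labeled nodes.
    inject: hypothetical mixing weight given to a node's own seed labels.
    Returns {node: {label: non-negative score}}.
    """
    neighbors = defaultdict(dict)
    for (u, v), w in edges.items():
        neighbors[u][v] = w
        neighbors[v][u] = w
    nodes = set(neighbors) | set(seeds)

    # Start from the seed labels; unlabeled nodes start with no scores.
    scores = {n: dict(seeds.get(n, {})) for n in nodes}

    for _ in range(iterations):
        updated = {}
        for n in nodes:
            nbrs = neighbors.get(n, {})
            total_w = sum(nbrs.values())
            # Edge-weighted average of the neighbors' current label scores.
            avg = defaultdict(float)
            if total_w > 0:
                for m, w in nbrs.items():
                    for label, s in scores[m].items():
                        avg[label] += w * s / total_w
            if n in seeds:
                # Labeled nodes blend their seed labels with neighbor evidence.
                mixed = defaultdict(float)
                for label, s in seeds[n].items():
                    mixed[label] += inject * s
                for label, s in avg.items():
                    mixed[label] += (1.0 - inject) * s
                updated[n] = dict(mixed)
            else:
                updated[n] = dict(avg)
        scores = updated
    return scores


# Hypothetical example: a four-node chain a - b - c - d with two labeled ends.
example_edges = {("a", "b"): 1.0, ("b", "c"): 1.0, ("c", "d"): 1.0}
example_seeds = {"a": {"rock": 1.0}, "d": {"jazz": 1.0}}
example_scores = propagate_labels(example_edges, example_seeds)
b_distribution = normalize(example_scores["b"])
```

In this toy graph, the unlabeled node `b` sits closer to the node labeled `rock` than to the node labeled `jazz`, so its normalized scores favor `rock`. Normalizing the scores additively, as in `normalize`, yields the conditional distribution over labels mentioned above.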