Clustering and classification tend to be important operations in certain data mining applications. For instance, data within a dataset may need to be clustered and/or classified in a data system for the purpose of assisting a user in searching and automatically organizing content, such as recorded television programs, electronic program guide entries, and other types of multimedia content.
Generally, many clustering and classification algorithms work well when the dataset is numerical (i.e., when the data within the dataset are all related by some inherent similarity metric or natural order). Numerical datasets often describe a single attribute or category. Categorical datasets, on the other hand, describe multiple attributes or categories that are often discrete and therefore lack a natural distance or proximity measure between them.
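The contrast between numerical and categorical data can be illustrated with a minimal sketch. The function names, attribute names, and sample values below are hypothetical, and the simple matching dissimilarity is only one commonly used stand-in for a categorical distance; it is not presented as the measure used by any particular system described here.

```python
def numeric_distance(a: float, b: float) -> float:
    """Numerical values have a natural order, so |a - b| is a meaningful distance."""
    return abs(a - b)


def simple_matching_dissimilarity(x: dict, y: dict) -> float:
    """Fraction of attributes on which two categorical records disagree.

    Categorical values such as "drama" and "comedy" have no natural order,
    so this assumed measure only counts mismatches; it cannot say how far
    apart two differing values are.
    """
    keys = x.keys() | y.keys()
    if not keys:
        return 0.0
    mismatches = sum(1 for k in keys if x.get(k) != y.get(k))
    return mismatches / len(keys)


# Hypothetical program guide entries (illustrative values only).
program_a = {"genre": "drama", "format": "series", "language": "en"}
program_b = {"genre": "comedy", "format": "series", "language": "en"}

print(numeric_distance(3.0, 7.5))
print(simple_matching_dissimilarity(program_a, program_b))
```

In this sketch, the two entries differ in one attribute out of three, but the measure treats every mismatch identically, which reflects the lack of a natural proximity measure between categorical values.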
When a user is interested in an object in a first medium, it is desirable to find objects in a different medium that may also interest the user. For example, if the user is interested in a musical artist, it may be desirable to retrieve movies in which the musical artist's songs appear.