The growth of music resources on personal devices and Internet radio has altered the channels for music sales and increased the need for music recommendations. For example, store-based and mail-based CD sales are dropping, while sales through music portals for electronic distribution of music (bundled or unbundled), such as iTunes, MSN Music, and Amazon, are increasing.
Another factor influencing music consumption is the increasing availability of inexpensive memory devices. For example, a typical MP3 player with a 30 GB hard disk can hold more than 5,000 music pieces. In a collection of that size, a “long tail” distribution may be observed in a user's listening history. That is, except for a few pieces that are played frequently, most pieces in a user's collection are played infrequently (e.g., due to a variety of factors, including those that make some potentially useful operations practically inconvenient on portable devices). Even on desktop computers, selecting a group of favorite pieces from a larger music collection is usually a tedious task. Music recommendation is therefore highly desirable, because users need suggestions to find and organize pieces closer to their taste.
While techniques to generate recommendations can be useful for an individual user consuming her own personal collection, they are also useful for an individual user wanting to add new pieces to her collection. Consequently, commercial vendors are keenly aware of the need to help consumers find more interesting songs. Many commercial systems such as Amazon.com, Last.fm (http://www.last.fm), and Pandora (http://www.pandora.com) have developed particular approaches for music recommendation. For example, Amazon.com and Last.fm adopt collaborative filtering (CF)-based technologies to generate recommendations. Specifically, if two users have similar preferences for some songs, these techniques assume that the two users tend to have similar preferences for other songs (e.g., songs that they may not already own or be aware of). In practice, such user preferences are discovered by mining user buying histories. Other companies, such as Pandora, utilize content-based technologies for music recommendation. These technologies recommend songs with similar acoustic characteristics or meta-information (such as composer, theme, or style).
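The collaborative filtering assumption described above can be illustrated with a minimal sketch. It is not the method of any of the vendors named; the function names, the cosine similarity measure, the similarity-weighted scoring, and the toy buying histories are all assumptions chosen for illustration.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two sparse histories (song -> weight)."""
    shared = set(a) & set(b)
    dot = sum(a[s] * b[s] for s in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_cf(target, histories, top_n=3):
    """Rank songs the target user lacks by similarity-weighted votes.

    Hypothetical helper: users whose histories overlap with the target's
    contribute their other songs, weighted by how similar they are.
    """
    target_hist = histories[target]
    scores = {}
    for user, hist in histories.items():
        if user == target:
            continue
        sim = cosine_similarity(target_hist, hist)
        if sim <= 0:
            continue  # users with no overlap contribute nothing
        for song, weight in hist.items():
            if song not in target_hist:
                scores[song] = scores.get(song, 0.0) + sim * weight
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [song for song, _ in ranked[:top_n]]


# Toy buying histories (assumed data): 1 means the user bought the song.
histories = {
    "alice": {"song_a": 1, "song_b": 1},
    "bob": {"song_a": 1, "song_b": 1, "song_c": 1},
    "carol": {"song_d": 1, "song_e": 1},
}

print(recommend_cf("alice", histories))
print(recommend_cf("carol", histories))
```

In this toy data, the second call returns an empty list because the user shares no purchases with anyone else, so no other user's history can be used as evidence of her taste.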
To achieve reasonable suggestions, CF-based methods should be based on large-scale rating data and an adequate number of users. However, it is hard to extend CF-based methods to applications like recommendation on personal music collections due to the lack of a community. Moreover, CF-based methods still suffer from problems like data sparsity and poor variety of recommendation results.
Content-based techniques can meet the requirements of more application scenarios, as they focus only on properties of the music itself. Content-based techniques can be further divided into metadata-based and acoustic-based methods. Metadata, which includes properties such as artist, genre, and track title, consists of global catalog attributes supplied by music publishers. Based on such attributes, criteria or constraints can be set up to filter favorite pieces. However, building optimal suggestion sequences based on multiple constraints is an NP-hard problem. Although some acceleration algorithms such as simulated annealing have been proposed, it is still difficult to extend such methods to a scale of thousands of pieces and hundreds of constraints. Other metadata-based methods utilize statistical learning to construct recommendation models from existing playlists. Because training data are limited, such learning-based approaches are also difficult to scale up. Furthermore, metadata can be too coarse to describe and distinguish the characteristics of a piece of music. In practice, it is also hard to obtain complete and accurate metadata in most situations.
Another approach to music recommendation uses acoustic-based techniques. Such techniques tend to have fewer restrictions than CF-based and metadata-based techniques. Further, acoustic-based techniques for music recommendation are suitable for situations where consumers or service providers own the music data themselves. In general, acoustic-based techniques first extract physical features from audio signals, and then construct distance measurements or statistical models to estimate the similarity of two music objects in the acoustic space. A recommendation can then match music pieces with similar acoustic characteristics and group them as suggestion candidates.
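The two-step pattern described above (extract features, then measure distance in the acoustic space) can be sketched as follows. This is only an illustration under stated assumptions: the feature vectors are hypothetical, already-extracted values rather than the output of any real audio analysis, and Euclidean distance is one possible distance measurement among many.

```python
from math import dist


def most_similar(query_id, features, top_n=2):
    """Return the ids of the pieces closest to the query in feature space.

    `features` maps a piece id to a fixed-length tuple of numbers that
    stands in for extracted acoustic features (assumed, not real data).
    """
    query = features[query_id]
    candidates = [
        (dist(query, vec), piece_id)
        for piece_id, vec in features.items()
        if piece_id != query_id
    ]
    candidates.sort()  # smallest distance first
    return [piece_id for _, piece_id in candidates[:top_n]]


# Hypothetical 3-dimensional feature vectors for four pieces.
features = {
    "piece_1": (0.10, 0.80, 0.30),
    "piece_2": (0.12, 0.79, 0.33),
    "piece_3": (0.90, 0.10, 0.70),
    "piece_4": (0.15, 0.70, 0.25),
}

print(most_similar("piece_1", features))
```

A brute-force scan like this compares the query against every piece, which is adequate for a small collection but would need an index or an approximate search to scale to large catalogs.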
As described herein, various exemplary methods, devices, systems, etc., generate music recommendations in a scalable manner based at least in part on acoustic information and optionally other information in a multimodal manner.