Classifying information that has subjectively perceived attributes or characteristics is difficult. When the information is one or more musical compositions, classification is complicated by the widely varying subjective perceptions of the musical compositions by different listeners. One listener may perceive a particular musical composition as “hauntingly beautiful” whereas another may perceive the same composition as “annoyingly twangy.”
In the classical music context, musicologists have developed names for various attributes of musical compositions. Terms such as adagio, fortissimo, or allegro broadly describe the tempo or strength with which instruments in an orchestra should be played to properly render a musical composition from sheet music. In the popular music context, there is less agreement upon proper terminology. Composers indicate how to render their musical compositions with annotations such as brightly, softly, etc., but there is no consistent, concise, agreed-upon system for such annotations.
As a result of rapid movement of musical recordings from sheet music to pre-recorded analog media to digital storage and retrieval technologies, this problem has become acute. In particular, as large libraries of digital musical recordings have become available through global computer networks, a need has developed to classify individual musical compositions in a quantitative manner based on highly subjective features, in order to facilitate rapid search and retrieval of large collections of compositions.
Musical compositions and other information are now widely available for sampling and purchase over global computer networks through online merchants such as Amazon.com, Inc., barnesandnoble.com, cdnow.com, etc. A prospective consumer can use a computer system equipped with a standard Web browser to contact an online merchant, browse an online catalog of pre-recorded music, select a song or collection of songs (“album”), and purchase the song or album for shipment direct to the consumer. In this context, online merchants and others desire to assist the consumer in making a purchase selection and desire to suggest possible selections for purchase. However, current classification systems and search and retrieval systems are inadequate for these tasks.
A variety of inadequate classification and search approaches are now used. In one approach, a consumer selects a musical composition for listening or for purchase based on past positive experience with the same artist or with similar music. This approach has a significant disadvantage in that it involves guessing because the consumer has no familiarity with the musical composition that is selected.
In another approach, a merchant classifies musical compositions into broad categories or genres. The disadvantage of this approach is that typically the genres are too broad. For example, a wide variety of qualitatively different albums and songs may be classified in the genre of “Popular Music” or “Rock and Roll.”
In still another approach, an online merchant presents a search page to a client associated with the consumer. The merchant receives selection criteria from the client for use in searching the merchant's catalog or database of available music. Normally the selection criteria are limited to song name, album title, or artist name. The merchant searches the database based on the selection criteria and returns a list of matching results to the client. The client selects one item in the list and receives further, detailed information about that item. The merchant also creates and returns one or more critics' reviews, customer reviews, or past purchase information associated with the item.
For example, the merchant may present a review by a music critic of a magazine that critiques the album selected by the client. The merchant may also present informal reviews of the album that have been previously entered into the system by other consumers. Further, the merchant may present suggestions of related music based on prior purchases of others. For example, in the approach of Amazon.com, when a client requests detailed information about a particular album or song, the system displays information stating, “People who bought this album also bought . . . ” followed by a list of other albums or songs. The list of other albums or songs is derived from actual purchase experience of the system. This is called “collaborative filtering.”
However, this approach has a significant disadvantage, namely that the suggested albums or songs are based on extrinsic similarity as indicated by purchase decisions of others, rather than on objective similarity of intrinsic attributes of a requested album or song and the suggested albums or songs. A decision by another consumer to purchase two albums at the same time does not indicate that the two albums are objectively similar or even that the consumer liked both. For example, the consumer might have bought one album for personal use and the second for a third party whose subjective taste differs greatly from the consumer's. As a result, some pundits have termed the prior approach the “greater fools” approach because it relies on the judgment of others.
Another disadvantage of collaborative filtering is that output data is normally available only for complete albums and not for individual songs. Thus, a first album that the consumer likes may be broadly similar to a second album, but the second album may contain individual songs that are strikingly dissimilar from the first album, and the consumer has no way to detect or act on such dissimilarity.
Still another disadvantage of collaborative filtering is that it requires a large mass of historical data in order to provide useful search results. The search results indicating what others bought are only useful after a large number of transactions, so that meaningful patterns and meaningful similarity emerge. Moreover, early transactions tend to over-influence later buyers, and popular titles tend to self-perpetuate.
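The co-purchase suggestion scheme discussed above can be sketched roughly as follows. This is only an illustrative sketch, not the method of any particular merchant: the function names `co_purchase_counts` and `also_bought` and the sample orders are hypothetical. It shows that the suggestions are driven entirely by which titles appear together in past orders, and that a title with no purchase history yields no suggestions at all.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(orders):
    """Count how often each pair of albums appears in the same order."""
    counts = {}
    for order in orders:
        # Deduplicate and sort so each unordered pair is counted once per order.
        for a, b in combinations(sorted(set(order)), 2):
            counts.setdefault(a, Counter())[b] += 1
            counts.setdefault(b, Counter())[a] += 1
    return counts


def also_bought(counts, album, top_n=3):
    """Return the albums most often bought together with `album`."""
    return [title for title, _ in counts.get(album, Counter()).most_common(top_n)]


# Hypothetical purchase history; nothing here describes how the music sounds.
orders = [
    ["Album A", "Album B"],
    ["Album A", "Album B", "Album C"],
    ["Album A", "Album D"],  # could just as well be a gift for someone else
]

counts = co_purchase_counts(orders)
suggestions = also_bought(counts, "Album A")
no_history = also_bought(counts, "Brand New Album")
```

In this sketch `suggestions` ranks "Album B" first purely because it shares the most orders with "Album A", and `no_history` is empty because the new title has never been purchased, which mirrors the dependence on a large mass of historical data noted above.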
In a related approach, the merchant may present information describing a song or an album that is prepared and distributed by the recording artist, a record label, or other entities that are commercially associated with the recording. A disadvantage of this information is that it may be biased, it may deliberately mischaracterize the recording in the hope of increasing its sales, and it is normally based on inconsistent terms and meanings.
In still another approach, digital signal processing (DSP) analysis is used to try to match characteristics from song to song, but DSP analysis alone has proven to be insufficient for classification purposes. While DSP analysis may be effective for some groups or classes of songs, it is ineffective for others, and there has so far been no way of determining what makes DSP analysis effective for some music and not for other music. Specifically, such acoustical analysis as has been implemented thus far suffers from two defects: 1) the accuracy of its results has been questioned, which diminishes the quality perceived by the user, and 2) recommendations can only be made if the user manually types in a desired artist or song title from that specific website. Accordingly, DSP analysis, by itself, is unreliable and thus insufficient for widespread commercial or other use.
With the explosion of media entity data distribution (e.g. online music content) comes an increase in the demand by media authors and publishers to authenticate media entities as authorized copies, rather than illegal copies of an original work, so as to place the media entity outside of copyright violation inquiries. Concurrent with the need to combat epidemic copyright violations, there exists a need to readily and reliably identify media entity data so that accurate metadata can be associated with media entity data to describe the underlying media entity data. Metadata available for a given media entity can include artist, album, and song information, as well as genre, tempo, lyrics, etc. The underlying computing environment can present additional obstacles to the creation and distribution of such accurate metadata. For example, peer-to-peer networks exacerbate the problem by propagating invalid metadata along with the media entity data. The task of generating accurate and reliable metadata is made difficult by the numerous forms and compression rates in which media entity data may reside and be communicated (e.g. PCM, MP3, and WMA). Media entity data can be further altered by the multiple trans-coding processes that are applied to it. Currently, simple hash algorithms are employed in processes to identify and distinguish media entity data. These hashing algorithms are impractical and cumbersome given the number of digitally unique ways a piece of music can be encoded.
Accordingly, there is a need for improved methods of accurately recognizing media content so that content may be readily and reliably authorized to satisfy copyright regulations and also so that a trusted source of metadata can be utilized. Generally, metadata is embedded data that is employed to identify, authorize, validate, authenticate, and distinguish media entity data. The identification of media entity data can be realized by employing the classification techniques described above to categorize the media entity according to its inherent characteristics (e.g. for a song, classifying the song according to its tempo, consonance, genre, etc.). Once the media entity is classified, the present invention exploits the classification attributes to generate a unique fingerprint (e.g. a unique identifier that can be calculated on the fly) for a given media entity. Further, fingerprinting media is an extremely effective tool to authenticate and identify authorized media entity copies, since copying, trans-coding, or reformatting media entities will not adversely affect the fingerprint of said entity. In the context of metadata, by using the inventive concepts of fingerprinting found in the present invention, metadata can more easily, efficiently, and reliably be associated with one or more media entities. It would be desirable to provide a system and methods by which participating users are offered identifiable media entities based upon users' input. It would be still further desirable to aggregate a range of media objects of varying types and the metadata thereof, or categories using various categorization and prioritization methods in connection with media fingerprinting techniques in an effort to satisfy copyright regulations and to offer reliable metadata.
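The weakness of simple hashing noted above, and the appeal of an identifier derived from inherent characteristics, can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the synthetic 440 Hz tone, the two raw PCM encodings, and the hypothetical `coarse_pitch_bucket` helper, which stands in for a real classification attribute and is not the fingerprinting method of the present invention. The sketch only shows that a byte-level hash changes when the same sound is re-encoded, while a coarse attribute computed from the decoded signal need not.

```python
import hashlib
import math
import struct

SAMPLE_RATE = 8000  # samples per second, chosen arbitrarily for this demo


def sine_samples(freq_hz, seconds=1.0):
    """Generate a pure tone as floats in [-1.0, 1.0]."""
    n = int(SAMPLE_RATE * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


def encode_pcm16(samples):
    """Encode as signed 16-bit little-endian PCM."""
    return struct.pack(f"<{len(samples)}h", *(int(round(s * 32767)) for s in samples))


def encode_pcm8(samples):
    """Encode as unsigned 8-bit PCM (a coarser encoding of the same sound)."""
    return bytes(int(round((s + 1.0) * 127.5)) for s in samples)


def decode_pcm16(data):
    return [v / 32767 for v in struct.unpack(f"<{len(data) // 2}h", data)]


def decode_pcm8(data):
    return [(b - 127.5) / 127.5 for b in data]


def coarse_pitch_bucket(samples):
    """Hypothetical attribute: pitch estimated from zero crossings, to the nearest 10 Hz."""
    crossings = sum(
        1 for prev, cur in zip(samples, samples[1:]) if (prev >= 0) != (cur >= 0)
    )
    seconds = len(samples) / SAMPLE_RATE
    estimated_hz = crossings / (2 * seconds)  # two zero crossings per cycle
    return int(round(estimated_hz / 10.0)) * 10


tone = sine_samples(440.0)
pcm16, pcm8 = encode_pcm16(tone), encode_pcm8(tone)

# Byte-level hashes: sensitive to every detail of the encoding.
hash16 = hashlib.sha256(pcm16).hexdigest()
hash8 = hashlib.sha256(pcm8).hexdigest()

# Attribute-level identifiers: computed from the decoded signal.
bucket16 = coarse_pitch_bucket(decode_pcm16(pcm16))
bucket8 = coarse_pitch_bucket(decode_pcm8(pcm8))
```

Here `hash16` and `hash8` differ even though both byte strings encode the same tone, whereas `bucket16` and `bucket8` agree. A single coarse attribute is of course far from unique; a practical fingerprint would presumably combine many attributes, which this sketch does not attempt.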