Audio and video media are an essentially ubiquitous feature of modern activity. Multimedia content, such as most modern movies, includes more than one kind of medium, such as both its video content and an audio soundtrack. Modern enterprises of virtually every kind and individuals from many walks of life use audio and video media content in a wide variety of both unique and related ways. Entertainment, commerce and advertising, education, instruction and training, computing and networking, broadcast, enterprise and telecommunications are but a small sample of modern endeavors in which audio and video media content find common use.
Audio media include music, speech and sounds recorded on individual compact discs (CDs) or other storage formats, streamed as digital files between server and client computers over networks, or transmitted with analog and digital electromagnetic signals. Examples of video media include movies and other recorded performances, presentations and animations, and portions thereof, sometimes called clips. It has become about as familiar to find users watching movies from Digital Versatile Discs (DVDs) playing on laptop computers while commuting as at home on entertainment systems or in theaters. Concerts from popular bands are streamed over the Internet and enjoyed by users as audio and/or viewed in webcasts of the performance. Extremely portable, lightweight, small form factor, low cost players of digital audio files have gained widespread popularity. Cellular phones, now essentially ubiquitous, personal digital assistants (PDAs) and handheld computers all have versatile functionality. Not just telecommunication devices, modern cell phones access the Internet and stream audio and video content therefrom, and it is no longer unusual to find game enthusiasts participating in networked video game play and fans watching sporting events therewith.
As a result of their widespread and growing use, vast quantities of audio and video media content exist. Given the sheer quantity and variety of audio and video media content that exist, and the expanding growth of that content over time, an ability to identify content is of value. Media fingerprints comprise a technique for identifying media content.
Media fingerprints are unique identifiers of media content from which they are extracted or generated. The term “fingerprint” is aptly used to refer to the uniqueness of these media content identifiers, in the sense that human beings are uniquely identifiable, e.g., forensically, by their fingerprints. While similar to a signature, media fingerprints perhaps even more intimately and identifiably correspond to the content. Audio and video media may both be identified using media fingerprints that correspond to each medium.
Audio media are identifiable with acoustic fingerprints. An acoustic fingerprint is generated from a particular audio waveform as code that uniquely corresponds thereto. Upon generating an acoustic fingerprint, the corresponding waveform from which the fingerprint was generated may thereafter be identified by reference to its fingerprint. The acoustic fingerprints may be stored, e.g., in a database. Stored acoustic fingerprints may be accessed to identify, categorize or otherwise classify an audio sample to which they are compared. Acoustic fingerprints are thus useful in identifying music or other recorded, streamed or otherwise transmitted audio media being played by a user, managing sound libraries, monitoring broadcasts, network activities and advertising, and identifying video content (such as a movie) from audio content (such as a soundtrack) associated therewith.
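By way of illustration only, the generate, store and look-up sequence described above may be sketched as follows. The sketch assumes a toy fingerprint in which each bit records whether the energy of one frame of the waveform exceeds that of the preceding frame; this is a simplification chosen for brevity and is not a psychoacoustic or production technique. The names synth_waveform, acoustic_fingerprint, register, identify and fingerprint_db are hypothetical.

```python
# Illustrative sketch only: a toy energy-trend fingerprint.
# All function and variable names are hypothetical.

def synth_waveform(loudness_per_frame, frame_size=8):
    """Build a test waveform; each frame alternates +amp/-amp at one loudness."""
    samples = []
    for amp in loudness_per_frame:
        samples.extend(amp if i % 2 == 0 else -amp for i in range(frame_size))
    return samples

def acoustic_fingerprint(samples, frame_size=8):
    """Return a tuple of bits; a bit is 1 when a frame is louder than the one before."""
    energies = [
        sum(x * x for x in samples[start:start + frame_size])
        for start in range(0, len(samples) - frame_size + 1, frame_size)
    ]
    return tuple(
        1 if later > earlier else 0
        for earlier, later in zip(energies, energies[1:])
    )

# A dictionary stands in for the fingerprint database.
fingerprint_db = {}

def register(title, samples):
    """Store the fingerprint of a known waveform under its title."""
    fingerprint_db[acoustic_fingerprint(samples)] = title

def identify(samples):
    """Return the title whose stored fingerprint matches exactly, else None."""
    return fingerprint_db.get(acoustic_fingerprint(samples))

register("song A", synth_waveform([1, 3, 2, 5, 4]))
register("song B", synth_waveform([5, 4, 3, 6, 7]))

# A copy played at half volume keeps the same loudness trend between frames.
quieter_copy = [0.5 * x for x in synth_waveform([1, 3, 2, 5, 4])]
print(identify(quieter_copy))
```

Because the bits depend only on whether energy rises or falls between frames, a uniform change in volume leaves this toy fingerprint unchanged. The exact-match lookup is a simplification; a practical system would more likely tolerate some number of differing bits.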
The reliability of an acoustic fingerprint relates to the specificity with which it identifiably corresponds to a particular audio waveform. Some audio fingerprints provide identification so accurately that they may be relied upon to identify separate performances of the same music. Moreover, some acoustic fingerprints are based on audio content as it is perceived by the human psychoacoustic system. Such robust audio fingerprints thus allow audio content to be identified after compression, decompression, transcoding and other changes to the content made with perceptually based audio codecs, even codecs that involve lossy compression (and which may thus tend to degrade audio content quality). Analogous to identifying audio media content by comparison with acoustic fingerprints is the ability to identify video media with digital video fingerprints.
Video fingerprints are generated from the video content to which they correspond. A sequence of video information, e.g., a video stream or clip, is accessed and analyzed. Components characteristic of the video sequence are identified and derived therefrom. Characteristic components may include luminance, chrominance, motion descriptors and/or other features that may be perceived by the human psychovisual system. The derived components are compressed into a readily storable and retrievable format.
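By way of illustration only, the derivation of a compact fingerprint from a characteristic component such as luminance may be sketched as follows. The sketch assumes frames supplied as rows of (R, G, B) pixel tuples, the ITU-R BT.601 luma weights, and a coarse grid in which each block contributes one bit according to whether it is brighter than the frame average. The grid scheme and the names luminance, frame_fingerprint and video_fingerprint are hypothetical choices made for this example; chrominance and motion descriptors are omitted.

```python
# Illustrative sketch only: one small integer per frame, derived from luminance.
# All function names are hypothetical.

def luminance(pixel):
    """Approximate perceived brightness of an (R, G, B) pixel (BT.601 weights)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b

def frame_fingerprint(frame, grid=2):
    """Pack grid*grid bits; a bit is 1 when its block is brighter than average.

    Assumes the frame is at least `grid` pixels in each dimension.
    """
    height, width = len(frame), len(frame[0])
    block_means = []
    for by in range(grid):
        for bx in range(grid):
            rows = range(by * height // grid, (by + 1) * height // grid)
            cols = range(bx * width // grid, (bx + 1) * width // grid)
            values = [luminance(frame[y][x]) for y in rows for x in cols]
            block_means.append(sum(values) / len(values))
    overall = sum(block_means) / len(block_means)
    bits = 0
    for mean in block_means:
        bits = (bits << 1) | (1 if mean > overall else 0)
    return bits

def video_fingerprint(frames, grid=2):
    """Fingerprint a sequence of frames as a list of per-frame integers."""
    return [frame_fingerprint(frame, grid) for frame in frames]

def make_frame(top_left_level):
    """4x4 test frame: top-left quadrant at the given gray level, rest black."""
    return [
        [(top_left_level,) * 3 if (y < 2 and x < 2) else (0, 0, 0) for x in range(4)]
        for y in range(4)
    ]

bright = make_frame(255)
dimmed = make_frame(128)  # same picture, darker
print(video_fingerprint([bright, dimmed]))
```

In this sketch a 4x4 frame of 16 color pixels reduces to a four-bit value, and the bright and dimmed versions of the same picture yield the same value, since only the relative brightness of the blocks is retained.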
Video fingerprints are generated using relatively lossy compression techniques, which render the fingerprint data small in comparison to their corresponding video content. Reconstructing original video content from its corresponding video fingerprints is thus typically neither practical nor feasible. As used herein, a video fingerprint thus refers to a relatively low bit rate representation of an original video content file. Storing and accessing the video fingerprints, however, is more efficient and economical than storing the original video content, from which the fingerprints are derived, in its entirety.
Stored video fingerprints may be accessed for comparison to a sample of a video sequence, which allows accurate identification of the video content in the sequence. Video fingerprints are thus useful for accurately identifying video content for a user as the content is viewed, as well as in authoritatively managing copyrights, and in validating authorized, and detecting unauthorized, versions and instances of content being stored, streamed or otherwise used. Moreover, as with many acoustic fingerprints, video fingerprints are perceptually encoded. Thus the content of the video sequence may be accurately identified by comparison to video fingerprints after compression, decompression, transcoding and other changes to the content made with perceptually based video codecs, even codecs that involve lossy compression (and which may thus tend to degrade video content quality).
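By way of illustration only, the comparison of a sample against stored fingerprints may be sketched as follows. The sketch assumes that each fingerprint is a list of 16-bit integers, one per frame, and that two fingerprints match when the fraction of differing bits falls below a threshold. The Hamming-distance measure, the threshold value and the names hamming_distance and identify are assumptions made for this example.

```python
# Illustrative sketch only: nearest-match lookup tolerant of a few bit errors.
# All names, the bit width and the threshold are hypothetical.

def hamming_distance(a, b):
    """Count the bit positions in which two integers differ."""
    return bin(a ^ b).count("1")

def identify(sample, database, bits=16, max_fraction=0.25):
    """Return the title of the closest stored fingerprint, or None if too far."""
    best_title, best_distance = None, None
    for title, stored in database.items():
        if len(stored) != len(sample):
            continue  # this sketch compares equal-length sequences only
        distance = sum(hamming_distance(s, t) for s, t in zip(sample, stored))
        if best_distance is None or distance < best_distance:
            best_title, best_distance = title, distance
    if best_title is None:
        return None
    if best_distance > max_fraction * bits * len(sample):
        return None  # closest candidate still differs in too many bits
    return best_title

database = {
    "clip A": [0b1111000011110000, 0b1010101010101010],
    "clip B": [0b0000111100001111, 0b0101010101010101],
}

# A transcoded copy of clip A whose fingerprint differs in a single bit.
transcoded = [0b1111000011110001, 0b1010101010101010]
print(identify(transcoded, database))
```

Tolerating a bounded number of differing bits is what allows a sample altered by lossy coding to be matched to its stored fingerprint, while a sample that differs from every stored fingerprint in many bits is left unidentified.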
Audio and video media content may be conceptually, commercially or otherwise related in some way to separate and distinct instances of content. The content that is related to the audio and video content may include, but is not limited to, other audio, video or multimedia content. For instance, a certain song may relate to a particular movie in some conceptual way. Other examples may be text files or computer graphics that relate to a given speech, lecture or musical piece in some commercial context. However, it may not be easy to ascertain the existence of some content that may be related to particular media content, much less to access the related content in association with the media content.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, issues identified with respect to one or more approaches should not be assumed to have been recognized in any prior art on the basis of this section, unless otherwise indicated.