One embodiment of the invention relates to obtaining metadata that is associated with content. The term “metadata” is broadly used in this patent document and may include, e.g., data that describes content, quality, condition, origin, and other characteristics of data (e.g., a song or video title, an artist's name, related songs or content, copyright information, online purchase information, links to other information or websites, ownership information, etc.). The term “content” also may encompass a wide variety of items including, e.g., audio, video, imagery, and other electronic content items. Sometimes we use the terms “media”, “media content” and even “signal” to describe “content.”
Returning to the above-mentioned embodiment, a “content identifier” may be computed from a content signal. A content identifier is a value or number used to identify the content. There are numerous ways to compute a content identifier. For example, digital watermarking may be used. Most commonly, digital watermarking is applied to media such as images, audio signals, and video signals through slight variations to sample values of the media. For example, the variations may be introduced to samples in a so-called orthogonal domain (also termed “non-perceptual,” e.g., MPEG, DCT, wavelet, etc.) or in quantization values, pixel values, audio samples, or data representing such. The assignee's U.S. Pat. Nos. 5,862,260, 6,122,403 and 6,614,914 are illustrative of certain digital watermarking technologies and are each hereby incorporated by reference. We also expect that so-called “fingerprinting” can be used to determine a content identifier. A fingerprint (e.g., a hash, derived signature or reduced-bit representation of content) statistically identifies a content item.
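By way of illustration only, the fingerprinting option mentioned above might be sketched as follows. The sketch assumes a content signal supplied as a list of numeric samples; the function names, the block size and the use of SHA-256 are hypothetical choices made for the example and are not taken from this patent document or from the incorporated patents.

```python
import hashlib


def reduced_bit_fingerprint(samples, block=4):
    """Return one bit per block of samples (hypothetical scheme).

    A bit is 1 when the block mean exceeds the mean of the whole signal,
    so small perturbations of individual samples tend to leave it unchanged.
    """
    if not samples:
        raise ValueError("content signal is empty")
    overall_mean = sum(samples) / len(samples)
    bits = []
    for start in range(0, len(samples), block):
        chunk = samples[start:start + block]
        bits.append(1 if sum(chunk) / len(chunk) > overall_mean else 0)
    return bits


def content_identifier(samples, block=4):
    """Derive a short hexadecimal content identifier from the fingerprint bits."""
    bitstring = "".join(str(bit) for bit in reduced_bit_fingerprint(samples, block))
    return hashlib.sha256(bitstring.encode("ascii")).hexdigest()[:16]


original = [0, 0, 0, 0, 10, 10, 10, 10]
recaptured = [0, 1, 0, 0, 10, 9, 10, 10]  # slightly altered copy of the same content
print(content_identifier(original) == content_identifier(recaptured))
```

In this sketch the two signals yield the same identifier because their reduced-bit representations match. A practical fingerprinting system would typically compare fingerprints by similarity rather than by exact hash equality.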
Once a content identifier is obtained, as discussed above, a “layered content identifier” can be formed. A layered content identifier preferably includes the content identifier (discussed above), an “identity provider identifier” and a “metadata claim.” The “identity provider identifier” is a value or number that identifies an “Identity Provider”. An Identity Provider is a party, entity or system that provides identity preferably in the form of, e.g., metadata related to content, typically in adherence with predetermined direction and business rules. A “metadata claim” may include, e.g., metadata or data indicating a preferred format or scheme for metadata when obtained or supplied.
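For illustration, a layered content identifier might be represented as a simple record holding its three parts. The class name, the field names, the “|” delimiter and the example values below are assumptions made for this sketch; this patent document does not prescribe a particular encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayeredContentIdentifier:
    """Hypothetical record combining the three parts of a layered content identifier."""
    content_id: str            # value computed from the content signal
    identity_provider_id: str  # identifies the Identity Provider
    metadata_claim: str        # e.g., a preferred metadata format or scheme

    def encode(self) -> str:
        """Serialize to a single string (the '|' delimiter is an assumption)."""
        return "|".join((self.identity_provider_id, self.content_id, self.metadata_claim))

    @classmethod
    def decode(cls, text: str) -> "LayeredContentIdentifier":
        """Parse a string produced by encode()."""
        provider_id, content_id, claim = text.split("|", 2)
        return cls(content_id=content_id,
                   identity_provider_id=provider_id,
                   metadata_claim=claim)


lcid = LayeredContentIdentifier("a1b2", "idp-7", "format=xmp")
print(lcid.encode())
```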
Again, returning to the above embodiment, a resolution request is issued to a routing service to obtain metadata associated with the layered content identifier. The routing service interprets the layered content identifier by, e.g., forwarding the metadata claim to an identity provider identified by the identity provider identifier. Then, in response to the resolution request, the metadata associated with the layered content identifier is received.
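The resolution flow described above might be sketched as follows. All names here (RoutingService, register, resolve, example_provider and the catalog contents) are hypothetical. The sketch also assumes that the routing service passes the content identifier along with the metadata claim, and that an identity provider can be modeled as a simple callable.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class LayeredContentIdentifier:
    """Hypothetical record type for a layered content identifier."""
    content_id: str
    identity_provider_id: str
    metadata_claim: str


# An identity provider is modeled as a callable: (content_id, metadata_claim) -> metadata.
IdentityProvider = Callable[[str, str], dict]


class RoutingService:
    """Routes a resolution request to the provider named in the identifier."""

    def __init__(self) -> None:
        self._providers: Dict[str, IdentityProvider] = {}

    def register(self, provider_id: str, provider: IdentityProvider) -> None:
        self._providers[provider_id] = provider

    def resolve(self, lcid: LayeredContentIdentifier) -> dict:
        provider = self._providers.get(lcid.identity_provider_id)
        if provider is None:
            raise LookupError(f"unknown identity provider: {lcid.identity_provider_id}")
        # Forward the claim (and, by assumption, the content identifier) to the provider.
        return provider(lcid.content_id, lcid.metadata_claim)


def example_provider(content_id: str, metadata_claim: str) -> dict:
    """Toy identity provider backed by an in-memory catalog (illustrative data only)."""
    catalog = {"a1b2": {"title": "Example Song", "artist": "Example Artist"}}
    return {"scheme": metadata_claim, "metadata": catalog.get(content_id, {})}


router = RoutingService()
router.register("idp-7", example_provider)
response = router.resolve(LayeredContentIdentifier("a1b2", "idp-7", "format=xmp"))
print(response["metadata"]["title"])
```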
Before discussing additional and alternative embodiments, a few items of background are informative.
Generally, information exists if it has been “acted upon” (e.g., interpreted, internalized, inferred; see also, dissected, etc.) to gain understanding. One example of how this is accomplished is based on an existence of metadata associated with the information, whether it be implicit and/or explicit, that allows an observer to climb a semantic stack and place the metadata information within an ontological model. For discussion, let's assume the following premise:
(n) information: a message received and understood
Information, as represented in image, audio and video content, is of immeasurable value, but is regularly disseminated with incomplete (or incorrect) metadata, even though that metadata is essential to gain understanding of the content, be it for an end user or an associated infrastructure, e.g., the Semantic Web. (The Semantic Web includes an extension of the World Wide Web in which semantics of information and services are defined, helping to understand and fulfill requests from people and machines to use web content.)
What is needed is a mechanism to identify content and provide appropriate metadata to enable interpretation and action at increasingly higher layers in a semantic stack, e.g., from operations on raw data to execution of business rules. A related discussion on business rules is found, e.g., in assignee's U.S. patent application Ser. No. 11/614,947, filed Dec. 21, 2006 (published as US 2007-0208711 A1).
For small volumes of content that are relatively static, Operating System (OS) constructs such as filenames, icons, etc., are typically used to carry metadata that is “self-identifying”; for example, a picture may be named “FamilyVacation2007.mpg”. As the volume of content increases, OS constructs are relegated to identifying labels that can be acted upon to extract metadata from an implemented system/file format, such as with an asset management system or from a file format that supports metadata (e.g., XMP, etc.).
To enforce binding an actionable “label” (e.g., a content identifier) to content, and to an implemented system for the retrieval of metadata, cryptographic constructs have been used in the form of traditional Digital Rights Management systems (DRM). DRM systems allow content owners to determine where/when content is accessed (e.g., decrypted), providing an opportunity to help ensure that appropriate infrastructure is in place to make the label actionable.
To contend with the existence of multiple DRM solutions, efforts are underway to provide DRM interoperability in hopes that content labels remain actionable and metadata can be retrieved/distributed across DRM boundaries (e.g., DRM interoperability project “Coral,” etc.), but progress has been slow.
Assuming the efforts are successful, these techniques are still reliant on the identifying label (e.g., filename, header, or other out-of-band information, etc.) remaining intact, something that rarely occurs once the content is publicly available.
Also, because all content is ultimately consumed in an analog form that strips away any of the delivery and labeling constructs, there is an ever-present threat that content may be re-captured and a new digital instance of the content created, with the instance likely missing metadata or having incorrect metadata and incorrect (or missing) labels.
One result is that a significant volume of content cannot be accurately identified, greatly complicating attempts to manage/leverage the content and ultimately leading to confusion, frustration and lower rates of consumption by consumers.
One challenge then is to create an infrastructure that provides information in the form of identifying metadata from the content itself, independent of representation and environment.
We return now to additional embodiments of the invention.
In one embodiment, a layered approach is provided, where identifying metadata is provided by a tiered set of components, each building on the services (or information) offered or provided by a lower layer. One result includes an ecosystem, a “Content ID System,” that specifies labels (e.g., a Content ID) and infrastructure to arrive at an implemented system that is open, scalable and content agnostic.
In another embodiment, a method for media content identity resolution is provided. The method includes: computing a content identifier from a media content signal; forming a layered content identifier, the layered content identifier including the content identifier, an identity provider identifier and a metadata claim; issuing a resolution request to a routing service to get metadata associated with the layered content identifier, the routing service interpreting the layered content identifier by forwarding the metadata claim to an identity provider identified by the identity provider identifier; and receiving in response to the resolution request, the metadata associated with the layered content identifier.
In yet another embodiment, a computer readable medium on which are stored instructions comprising a metadata client is provided. The metadata client includes: an external interface including a content interface for receiving a content signal and a request to provide metadata associated with the content signal, and a metadata interface for providing metadata associated with the content signal; and an internal interface including an identity provider interface for integrating an identity provider driver into the metadata client, the identity provider driver operable to compute a content identifier from the content signal and provide the content identifier to the metadata client. The metadata client invokes the identity provider driver through the internal interface to request the content identifier, and the metadata client provides the metadata associated with the content signal via the content identifier through the metadata interface.
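One possible arrangement of the metadata client's external and internal interfaces is sketched below. The class and method names are hypothetical. The SHA-256 driver is only a stand-in for a watermark reader or fingerprint calculator, since a cryptographic hash of raw bytes does not survive re-encoding of the content. The lookup callable stands in for whatever resolution mechanism supplies the metadata.

```python
import hashlib
from abc import ABC, abstractmethod
from typing import Callable, Optional


class IdentityProviderDriver(ABC):
    """Internal interface: a pluggable driver that computes a content identifier."""

    @abstractmethod
    def compute_content_id(self, content_signal: bytes) -> str:
        ...


class Sha256Driver(IdentityProviderDriver):
    """Stand-in driver (assumption): hashes the raw bytes of the content signal."""

    def compute_content_id(self, content_signal: bytes) -> str:
        return hashlib.sha256(content_signal).hexdigest()[:8]


class MetadataClient:
    """Receives content signals and returns the metadata associated with them."""

    def __init__(self, lookup: Callable[[str], dict]) -> None:
        self._lookup = lookup  # maps a content identifier to metadata
        self._driver: Optional[IdentityProviderDriver] = None

    def install_driver(self, driver: IdentityProviderDriver) -> None:
        """Internal interface: integrate an identity provider driver."""
        self._driver = driver

    def request_metadata(self, content_signal: bytes) -> dict:
        """External interface: accept a content signal, return its metadata."""
        if self._driver is None:
            raise RuntimeError("no identity provider driver installed")
        content_id = self._driver.compute_content_id(content_signal)
        return self._lookup(content_id)


driver = Sha256Driver()
catalog = {driver.compute_content_id(b"demo"): {"title": "Demo"}}
client = MetadataClient(lambda content_id: catalog.get(content_id, {}))
client.install_driver(driver)
print(client.request_metadata(b"demo"))
```

Because the driver is supplied through its own interface, a different identity provider's driver could be substituted without changing the code that calls request_metadata.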
The foregoing features, embodiments and advantages will be even more readily apparent from the following detailed description, which proceeds with reference to the accompanying drawings. Of course, additional combinations and embodiments are provided as well.