Due to recent advances in technology, computer users are now able to enjoy many features that provide an improved user experience, such as playing various media and multimedia content on their personal or laptop computers. For example, most computers today are able to play compact discs (CDs) so users can listen to their favorite musical artists while working on their computers. Many computers are also equipped with digital versatile disc (DVD) drives enabling users to watch movies.
As users become more familiar with advanced features on their computers, such as those mentioned above, their expectations for additional innovative features will undoubtedly continue to grow. Users often desire to receive media metadata, which includes content-related data associated with digital media files such as those from CDs and DVDs. Independent data providers (IDPs), such as Loudeye Corporation and All Media Guide (AMG) of Alliance Entertainment Corp., capture a vast amount of information related to music CDs and other digital media. IDPs usually enter the collected data manually and store and manage the data using their own particular data entry application. Each IDP uses a different format for identifying content. Those skilled in the art are also familiar with media metadata services that collect information from users when metadata for a specific, requested media file is unavailable from an IDP. For example, consider a media player software application that enables a user to play a CD on his or her computer. Typically, the application allows the user to display track information associated with the CD by clicking on an appropriate user interface (UI) element. Such track information may include track number, song title, playing time, and the like.
The wide and varied tastes of computer users in music, movies, and the like create the need for an enormous corpus of metadata. As such, data publications of media metadata tend to be very large (e.g., several gigabytes in size) and experience a high volume of query traffic (e.g., under constant access). Also, the same logical content may have many different physical representations, which makes it difficult to identify and retrieve the correct media metadata for a specific media file. Moreover, the same piece of content from different data providers and/or in different cultures may be identified differently. These problems complicate the storage, management, and retrieval of media metadata, particularly in the context of a large database with data collected from multiple sources.
International Standard ISO 15707, published by the International Organization for Standardization (ISO) on Nov. 15, 2001, provides one scheme for identifying a logical piece of work. In general, ISO 15707 defines the format, administration, and rules for allocating an International Standard Musical Work Code (ISWC) to a musical work. The ISWC uniquely distinguishes one musical work from another within computer databases and related documentation for those involved in the administration of rights to musical works. The standard's goal is to reduce errors when information about musical works is exchanged between rights societies, publishers, record companies, and other interested parties on an international level. As defined in ISO 15707, the ISWC includes a prefix element followed by a nine-digit numeric code and a check digit. Unfortunately, this standard focuses on rights management rather than data management and aggregation and is limited in scope to musical works. Moreover, the existing standard does not provide for associating and mapping related identifiers, which is important when providing useful media metadata.
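As an illustration of the ISWC layout described above (a prefix element, a nine-digit numeric code, and a check digit), the following Python sketch parses and checks an ISWC-shaped string. The function names are hypothetical. Two points are assumptions not stated in the text above: that the prefix element is the letter "T", and that the check digit follows the commonly described mod-10 rule in which each digit is weighted by its position (1 through 9) and 1 is added for the prefix.

```python
import re

# Layout from the text: one prefix letter, nine digits, one check digit.
# The dots and hyphens are treated as optional display separators (assumption).
_ISWC_RE = re.compile(r"^([A-Z])-?(\d{3})\.?(\d{3})\.?(\d{3})-?(\d)$")


def parse_iswc(code):
    """Split an ISWC-shaped string into (prefix, nine_digits, check_digit)."""
    match = _ISWC_RE.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not an ISWC-shaped string: {code!r}")
    prefix, a, b, c, check = match.groups()
    return prefix, a + b + c, int(check)


def expected_check_digit(nine_digits):
    """Assumed rule: 1 for the prefix plus position-weighted digits, mod 10."""
    total = 1 + sum(pos * int(d) for pos, d in enumerate(nine_digits, start=1))
    return (10 - total % 10) % 10


def is_valid_iswc(code):
    """True if the string has the ISWC shape, a 'T' prefix, and a matching check digit."""
    try:
        prefix, digits, check = parse_iswc(code)
    except ValueError:
        return False
    return prefix == "T" and expected_check_digit(digits) == check


if __name__ == "__main__":
    print(is_valid_iswc("T-034.524.680-1"))  # True under the assumed rule
    print(is_valid_iswc("T-034.524.680-2"))  # False: check digit does not match
```

Note that such a check only confirms that a code is well formed. It says nothing about how the code relates to other identifiers for the same work, which is the limitation noted above.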
Those skilled in the art are also familiar with various tagging schemes for identifying digital content. For example, an ID3 tag residing at the end of an audio file can include title, artist, album, year, genre, track, and a comment field. In other words, known tagging systems embed data about the content directly in the content. The problem is that this metadata can become stale and even incorrect. While the ID3 standard provides for an identifier, it is merely a placeholder and there is no specification on how it is to be used. Moreover, existing tagging schemes fail to address associations and mappings between identifiers.
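To make the embedded-tag approach concrete, the following Python sketch reads a tag of the kind described above from the end of an audio file's bytes. It assumes the ID3v1 layout (a 128-byte trailer beginning with "TAG", with fixed-width fields) and the ID3v1.1 convention of storing the track number in the last byte of the comment field. The function name and returned dictionary keys are hypothetical.

```python
import struct

ID3V1_SIZE = 128  # assumed fixed size of the trailing tag block


def _text(raw):
    """Decode a fixed-width field, dropping NUL padding and surrounding spaces."""
    return raw.split(b"\x00", 1)[0].decode("latin-1").strip()


def parse_id3v1(data):
    """Return a dict of tag fields from the last 128 bytes of `data`, or None."""
    if len(data) < ID3V1_SIZE:
        return None
    # Fields: marker(3) title(30) artist(30) album(30) year(4) comment(30) genre(1)
    marker, title, artist, album, year, comment, genre = struct.unpack(
        "<3s30s30s30s4s30sB", data[-ID3V1_SIZE:]
    )
    if marker != b"TAG":
        return None  # no embedded tag present
    track = None
    # ID3v1.1 convention (assumed): a zero byte followed by a nonzero track byte.
    if comment[28] == 0 and comment[29] != 0:
        track = comment[29]
        comment = comment[:28]
    return {
        "title": _text(title),
        "artist": _text(artist),
        "album": _text(album),
        "year": _text(year),
        "comment": _text(comment),
        "track": track,
        "genre": genre,  # numeric genre code
    }
```

Because every value lives inside the file itself, a correction made by a data provider never reaches copies of the file already distributed, which is how such metadata becomes stale.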
Accordingly, this invention arose out of concerns for providing systems and methods for processing data to improve the breadth and quality of stored media metadata and, thus, improve the processing of media content to provide an enhanced, rich, and robust user experience. Improvements in identifying media content and related information, as well as in the techniques used to store, retrieve, aggregate, and associate identifiers, are desired. Such improvements should permit building a media data warehouse capable of aggregating data from many different sources. In this regard, an identification system, or ID registry, is desired to uniquely identify the same piece of content from different data providers, in different cultures, and in different physical forms to allow a consistent set of data to be stored and retrieved.
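The kind of mapping such an ID registry would need can be pictured with the following minimal Python sketch. It is an illustration only, not the system described in this document: the class name, method names, provider labels, and the use of sequential integers as canonical identifiers are all hypothetical.

```python
import itertools


class IdRegistry:
    """Toy registry mapping provider-specific identifiers to one canonical ID."""

    def __init__(self):
        self._next_id = itertools.count(1)  # canonical IDs are integers (assumption)
        self._by_external = {}  # (provider, external_id) -> canonical ID
        self._aliases = {}      # canonical ID -> set of (provider, external_id)

    def register(self, provider, external_id, same_as=None):
        """Register an external ID; `same_as` links it to an existing canonical ID."""
        key = (provider, external_id)
        if key in self._by_external:
            return self._by_external[key]
        if same_as is None:
            canonical = next(self._next_id)
            self._aliases[canonical] = set()
        elif same_as in self._aliases:
            canonical = same_as
        else:
            raise KeyError(f"unknown canonical ID: {same_as!r}")
        self._by_external[key] = canonical
        self._aliases[canonical].add(key)
        return canonical

    def lookup(self, provider, external_id):
        """Return the canonical ID for an external ID, or None if unregistered."""
        return self._by_external.get((provider, external_id))

    def aliases(self, canonical):
        """Return every (provider, external_id) known for a canonical ID."""
        return set(self._aliases.get(canonical, ()))


if __name__ == "__main__":
    registry = IdRegistry()
    work = registry.register("provider_a", "A-123")
    registry.register("provider_b", "xyz-9", same_as=work)
    print(registry.lookup("provider_b", "xyz-9") == work)  # True
```

In this sketch, deciding that two external identifiers refer to the same content is left to the caller through the `same_as` argument. A practical system would need its own matching logic for that decision.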