1. Field of the Invention
The present invention relates to methods for creating a data dictionary for defining terms of a machine-readable rights expression language, such as a metadata rights expression language, and applications for such a data dictionary. In particular, the invention relates to applications and systems using a rights data dictionary to interpret a rights expression language, and for translating between rights expression languages.
2. Description of Related Art
Dictionaries have long been used to describe the use and meaning of words in natural languages. In the natural language context, dictionaries are primarily descriptive in nature, in that the definitions in the dictionary reflect the dictionary author's observations of how words are used in the human community to which the dictionary relates. Another aspect of natural language dictionaries is that many natural language words are commonly assigned multiple meanings. This aspect can, and often does, create ambiguity in human expression. At the same time, the scope of human expression is virtually unlimited, and natural language words may be found to express virtually every thought. Moreover, new words are constantly being created to express new concepts in living languages.
In contrast, in the machine (computer) language context, languages are created by design. “Machine language” refers not only to programming languages such as assembly language, ‘C’, Fortran, HTML, and the like, but also to descriptive languages such as rights expression languages. Dictionaries—sometimes also called schemas—of machine languages are primarily prescriptive data structures. That is, machine language dictionaries prescribe the meaning and relationships of machine language terms, usually in a precise and rigid manner. Consequently, machine language expressions are intended to be clear and unambiguous, although their scope is correspondingly limited.
Increasingly, opportunities lie at the interface between natural languages and traditional machine languages. As machines become ever more sophisticated and more involved in different aspects of human life, more complex machine languages have been developed, at least partially for descriptive purposes. Such languages are used to describe the status or attributes of different objects or processes, usually in a commercial or production context. In particular, applications for descriptive, machine-readable metadata in wide area networks continue to evolve.
For example, one recently developed application area for machine languages involves digital rights management. In digital rights management applications, intellectual property rights implicated by various types of digital works may be described using a specialized language for rights expression that is capable of being machine read, herein referred to as a “rights expression language.” Digital rights expressions may be associated with products or content to which they pertain, so that appropriate actions may be automatically taken in respect of the products or content when the rights expressions are read by a suitable machine. Although the field is rapidly evolving, general concepts of digital rights management are well understood in the art. Other applications for machine-readable metadata in contexts like the World Wide Web and similar wide area networks include privacy rights, semantic-based search engines, library catalogs, and micro-payments (“digital cash”).
Of course, intellectual property rights as used in digital rights management methods involve concepts that have evolved, and continue to evolve, in the natural language sphere. As such, expression of digital rights may involve expression of a very diverse and ever-changing range of ideas, at different levels of abstraction, with sometimes subtle distinctions between terms. Correctly interpreting and acting on even relatively simple expressions of intellectual property rights may be difficult using an inflexible machine language. Compounding this difficulty is the highly granular, transient, and transferable nature of many intellectual property rights. For example, a particular work may be a compilation of many component works. Component works, in turn, may be compilations of still further component works, and so forth. Some components may be copies, while others are transformations of some kind, which imply different types of rights. Each component work may have a unique collection of rights associated with it. Ownership and application of these rights may change as a function of time, location, and intervening transactions. Thus, a machine language for expression of rights should be capable of correctly expressing potentially very complex and subtle semantic content, without becoming overly cumbersome.
Various machine languages are known in the art for the expression of digital rights. Examples include the Open Digital Rights Language (ODRL™) and the eXtensible rights Markup Language™ (XrML™), an Extensible Markup Language (XML) compatible grammar for general digital content. Languages such as these, and others known in the art, use resource-based models in their vocabulary schemas; that is, most things are described as attributes or properties of resources. For example, XrML™ is consistent with the Resource Description Framework (RDF) model and syntax published by W3C®. Such languages tend to be developed for a particular aspect of rights management in a particular context, and correspondingly, have limited vocabularies. They may also employ different terms to describe similar or logically overlapping circumstances, making translation between such languages difficult.
Other term sets are also known for the purpose of describing rights, which may be used with, or interact with, rights expression languages in more limited applications. For example, the Online Information Exchange message format (ONIX) has been developed for use in connection with text-based products, including electronic books. In general, these types of term sets tend to be relatively non-hierarchical (i.e., “flat”) and simple.
A resource-based data dictionary model may be described as a resource-centric schema for developing metadata to describe attributes of the underlying resource. For example, resource-based metadata describing rights to an electronic resource, such as a copyrighted story, may be associated with the resource. The metadata may describe attributes that have arisen from previous actions associated with the resource; for example, “creation date” and “author” describe, respectively, a time and an agent of an authorship action. This approach works well so long as the point of view for which rights data is desired does not change. In practice, however, to implement a sophisticated digital rights management scheme, data from various different points of view is needed.
For example, it may be desirable to gather metadata from the point of view of various agents, including authors or creators, publishers, replicators, consumers, clearing houses, etc., and various other resources for which a resource is a parent or source, in whole or in part. For example, if a particular resource is copied to a replica, metadata may be associated with the replica, indicating that the replica is a copy of a particular source, when the replica was made, who made the replica, and where the replica was made. At the same time, additional metadata may be associated with the source, indicating that a copy was made, when a copy was made, who did the copying, where the copying occurred, and the identity of the resulting replica. The relationship “is a copy of” in the phrase “B is a copy of A” is distinct from the relationship “has a copy” in the phrase “A has a copy B.” As a further example, “C copied A,” a statement from the point of view of an agent ‘C’, is distinct from “A was copied by C,” a statement from the point of view of a copied resource A.
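By way of illustration only, the copying example above may be sketched in Python as follows. The sketch is not taken from any existing rights expression language: the class `CopyEvent`, the function `resource_view`, the property names (such as `isCopyOf` and `hasCopy`), and the sample date and place are all hypothetical and are chosen merely to show how a single copying action may give rise to a separate set of terms for each point of view.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CopyEvent:
    """One copying action: an agent copies a source, producing a replica."""
    agent: str
    source: str
    replica: str
    when: str
    where: str


def resource_view(event: CopyEvent) -> dict[str, dict[str, str]]:
    """Return the metadata a resource-based model might attach to each party.

    All property names are hypothetical. Each point of view (replica,
    source, agent) needs its own terms for the same underlying action.
    """
    return {
        # Point of view of the replica: "B is a copy of A"
        event.replica: {
            "isCopyOf": event.source,
            "copiedBy": event.agent,
            "copiedOn": event.when,
            "copiedAt": event.where,
        },
        # Point of view of the source: "A has a copy B"
        event.source: {
            "hasCopy": event.replica,
            "wasCopiedBy": event.agent,
            "wasCopiedOn": event.when,
            "wasCopiedAt": event.where,
        },
        # Point of view of the agent: "C copied A"
        event.agent: {
            "copied": event.source,
            "madeCopy": event.replica,
        },
    }


# Sample values are invented for the illustration.
event = CopyEvent(agent="C", source="A", replica="B",
                  when="2002-01-15", where="London")
views = resource_view(event)

# Print one statement per subject/term pair.
for subject, properties in views.items():
    for term, value in properties.items():
        print(f"{subject} {term} {value}")

# Count the distinct terms needed to describe this single action.
distinct_terms = {term for properties in views.values() for term in properties}
```

In this sketch, one copying action requires ten distinct terms across three points of view, although every term is derived from the same five facts about the action.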
Hence, the resource-based model, because it describes rights from the point of view of different resources, will use a different or modified term to describe essentially the same action, depending on which resource provides the operative point of view. In a digital universe where resources are propagated, combined, divided, embedded, mutated and otherwise transformed in a variety of different ways, this may lead to undesirably complex terms and expressions in many situations. Similar complexity may accrue in the context of other descriptive metadata applications, as well.
It is desirable, therefore, to provide a more robust yet conceptually elegant data dictionary (i.e., a dictionary-like schema or data structure) that is capable of supporting a language for conveying complex semantic content as needed for digital rights management, and other sophisticated applications for metadata. At the same time, however, resource-based descriptive schemas are already entrenched in several rights descriptive languages, and good reasons exist to believe that resource-based descriptive languages will continue to be created, evolve, and grow for use in rights management and many other applications. Accordingly, the robust data dictionary should be fully compatible with resource-based schemas. Terms developed using the new data dictionary should be capable of being unambiguously mapped to terms developed according to resource-based descriptive models, so that the data dictionary may be used to translate between different rights expression languages. In addition, terms developed using the schema should have unique, unambiguous meanings, and it should be possible to generate an expandable, open database of terms consistent with the schema. It is further desirable to provide an expandable, open database of terms for use in digital rights management.