With the ever-increasing use of electronic machines to store and access data, computers and computer software face the increasingly difficult task of sifting through vast quantities of irrelevant information when answering search queries. Simply put, conventional computer technology is not very good at understanding the context of search terms, and there exists little in the way of a “library card” index to permit computers and computer software to filter out irrelevant uses of search terms. To provide one hypothetical example, a conventional computer search based on the keywords “Mexico” and “travel” might return any electronically-accessible resource containing both of these words, without a ready ability to distinguish different types of documents or other resources based on context, e.g., to distinguish “travel books” about the country of “Mexico” from other books such as works of fiction.
In part to address this difficulty, information technologists have attempted to create standards for scalable, easily defined languages that help machines process electronic data based on vocabularies that provide context; one such language is the resource description framework, or the “RDF”, which provides a standard way of expressing attributes of electronic information. It is hoped that through the use of descriptive languages such as that provided by the RDF, information technology will find better, faster, and more accurate ways of sifting through the vast amount of information accessible to computers or computer networks (including the world-wide web).
With the generation of descriptive languages, however, there is also potential for abuse; in this regard, descriptive language systems such as the RDF provide a framework for expressing statements about an object (e.g., an electronic document, such as a web-page), and one usage of these statements is to provide context for “metadata” stored transparently as part of a document. With many electronically-stored documents, there is a need to verify the authenticity of statements made about those documents (and not just of the documents themselves). To provide one example of such a document, one can imagine a web page that describes a piece of real property, and a statement within that web page that provides contact information for the “owner” of that piece of real property; clearly, there is the potential for abuse if one can forge the identity of the owner (without forging the visible part of the document, i.e., the description of the real property itself). The need to authenticate descriptive statements about an electronic resource such as a document generally arises when the resource and its statements are stored remotely (or transmitted through a remote source) and one wants to verify that neither the resource nor its statements have been tampered with.
Conventionally, metadata, RDF expressions and other descriptive statements (“descriptive statements” or “descriptive data”) can be authenticated using a digital signature scheme. There are many information processing techniques for such signature processes and, generally, they operate by arranging a collection of statements in a data string of a specific format, breaking that string up into blocks of data (e.g., 512-bit blocks) and then processing those blocks to compute a relatively short (e.g., 160-bit) value, called a hash, that is difficult to duplicate exactly with even a slightly modified message; this hash is then encrypted using a secret “private key,” and the encrypted hash can feasibly be decrypted only using an associated, published counterpart key called a “public key.”
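By way of illustration only, the hashing stage of such a conventional scheme might be sketched as follows. The sketch assumes SHA-1 purely because its 512-bit block size and 160-bit output match the example sizes given above; no particular algorithm is named here, and the statement strings, the example.org identifiers and the function name digest_statements are hypothetical. The private-key encryption of the resulting hash is omitted, since it would require a public-key library outside the Python standard library.

```python
import hashlib

def digest_statements(statements):
    """Arrange statements into one string and hash it.

    SHA-1 is an assumed stand-in: it consumes 512-bit blocks internally
    and yields a 160-bit digest, matching the sizes in the text.
    """
    # Arrange the collection of statements in a single data string.
    serialized = "\n".join(statements).encode("utf-8")
    # hashlib splits the input into blocks and processes them internally.
    return hashlib.sha1(serialized).digest()

# Hypothetical descriptive statements about a real-property web page.
statements = [
    '<http://example.org/lot42> <http://example.org/owner> "Alice Example" .',
    '<http://example.org/lot42> <http://example.org/city> "Springfield" .',
]

original = digest_statements(statements)

# A one-character forgery of the owner statement yields a different hash.
forged = [statements[0].replace("Alice", "Alicf"), statements[1]]
tampered = digest_statements(forged)

print(len(original) * 8, "bit digest; changed by forgery:", original != tampered)
```

In a complete scheme, the value held in original would next be encrypted with the signer's private key, and a verifier would recompute the hash and compare it with the value recovered using the public key.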
While generally successful for their intended purposes, most conventional signature schemes have several processing requirements that present obstacles to ready use in authentication of descriptive statements. First, hashing is generally performed in a collective manner and provides a different result if the descriptive statements are changed at all, even in their relative order; as a result, authentication schemes typically first rely upon a sorting of descriptive statements retrieved from a data store to a common, predetermined order. Without this sorting, the storage or processing order used by downstream machines may make it very difficult to verify a hash computed from re-ordered statements, and there exists a substantial likelihood of a failure to authenticate statements that are in fact legitimate. Sorting, however, often requires a substantial amount of time (proportional to n log n), particularly if the number (n) of statements is large. Second, if it is desired to add a new statement to an existing electronic object (e.g., communicating that “this document was later modified on date X”), the original document must typically first be authenticated, the new statement added, the statements re-sorted, and a completely new hash computed; otherwise stated, the conventional processes provide no easy, computationally simple way to add and authenticate new statements. Third, some descriptive statement methodologies employ “blank node” techniques, where every recipient or storing machine can create and apply its own “label” to identify certain electronic nodes; the net effect of these labels can be to change the descriptive statements in a data store in a way that does not undermine their authenticity, but that does tend to lead to a failure to authenticate (because a “hash” of the label-modified statements generally will not match the original hash represented by the digital signature).
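The first and third obstacles can be illustrated with a minimal sketch, again assuming SHA-1 as a stand-in hash and using hypothetical statement strings, blank node labels (_:b1, _:x9) and function names. It shows that a collective hash changes when statements are merely re-ordered, that sorting to a predetermined order restores agreement, and that sorting does not help once a downstream machine has relabeled a blank node.

```python
import hashlib

def collective_hash(statements):
    """Hash the statements in the order given (order-sensitive)."""
    data = "\n".join(statements).encode("utf-8")
    return hashlib.sha1(data).hexdigest()

def sorted_hash(statements):
    """Sort to a predetermined order first: O(n log n) before hashing."""
    return collective_hash(sorted(statements))

# Hypothetical statements; "_:b1" is a locally assigned blank node label.
signed = [
    '_:b1 <ex:name> "Alice Example" .',
    '<ex:doc> <ex:owner> _:b1 .',
]

# A downstream store returns the same statements in a different order.
reordered = list(reversed(signed))

# Another machine applies its own label to the same blank node.
relabeled = [s.replace("_:b1", "_:x9") for s in signed]

print("re-ordering changes the plain hash:",
      collective_hash(signed) != collective_hash(reordered))
print("sorting restores agreement:",
      sorted_hash(signed) == sorted_hash(reordered))
print("relabeling still breaks verification:",
      sorted_hash(signed) != sorted_hash(relabeled))
```

The second obstacle follows from the same structure: appending a statement to the signed list changes the input to sorted_hash, so the whole set must be re-sorted and re-hashed rather than updated incrementally.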
These difficulties have hindered some applications of descriptive statement methodologies, e.g., of the RDF.
A need exists for a computationally efficient system for processing descriptive electronic statements about objects. More particularly, a need exists for a system that can rapidly digitally sign (or authenticate) a set of statements, if possible, without always being required to sort those statements to a predetermined order. Still further, a need exists for a verification system that is insensitive to varying blank node expression techniques (whether employed by a data storage system or by an intermediate node). Finally, a need exists for a system that can rapidly, securely and efficiently compute a new hash for, and digitally re-sign, a set of statements that have been legitimately modified. The present invention satisfies these needs and provides further, related advantages.