The prior art includes the concept of content-addressable information, its storage and retrieval, and the use of hash functions, message digests and descriptor files, as described in international publication No. WO 99/38093. International publication No. WO 99/38092 describes a particular technique for the storage and access of content-addressable information, and international publication No. WO 01/18633 describes a technique for encrypting content-addressable information. These publications are all incorporated by reference.
As discussed in the prior art, content-addressable techniques can be very useful for storing and accessing documents in a fashion that guarantees the integrity of the stored content. Where a document or other information can evolve over time, or where documents need to reference one another, though, a host of new issues is presented. Because a content-based address uniquely identifies particular content, any change to that content produces a new content address for the document. As discussed in the prior art, one technique is to use a message digest (such as an “MD5”) to uniquely represent a particular document. In a situation where there is a complex content space, though, many documents or sets of documents may be needed to represent a particular type of information (such as a set of user manuals for a complex computer system, or the technical documentation for an aircraft). In these situations, a single MD5 might uniquely represent many documents (for example, by way of a descriptor file), and an individual document might contain many different MD5s, each referencing a single document or a set of documents. Further, there may be many different versions of a set of documents that change over time (where some documents in the set might change and others might not), and two different documents might each need to reference the other. With such a complex content space, managing and accessing the information in a way that ensures its integrity becomes more difficult.
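By way of illustration only, the following minimal Python sketch shows the addressing behavior discussed above: a revised document receives a new content address, and a descriptor file listing the addresses of a set of documents therefore receives a new address as well. The function names, the one-address-per-line descriptor layout, and the sample document contents are assumptions made for this sketch and are not taken from the referenced publications.

```python
import hashlib
from typing import Sequence, Tuple


def content_address(data: bytes) -> str:
    """Return the MD5 message digest of the content as a hex string."""
    return hashlib.md5(data).hexdigest()


def descriptor_address(document_addresses: Sequence[str]) -> Tuple[str, bytes]:
    """Build a hypothetical descriptor file and return (address, descriptor).

    The descriptor layout (one content address per line) is an assumption
    chosen for this sketch, not a format defined by the prior art.
    """
    descriptor = "\n".join(document_addresses).encode("ascii")
    return content_address(descriptor), descriptor


# Hypothetical documents: one is revised, the other is left unchanged.
fuel_v1 = b"Fuel subsystem manual, revision 1"
fuel_v2 = b"Fuel subsystem manual, revision 2"
airframe = b"Airframe subsystem manual, revision 1"

# Each version of the document set is represented by a single digest,
# namely the content address of its descriptor file.
set_v1, descriptor_v1 = descriptor_address(
    [content_address(fuel_v1), content_address(airframe)]
)
set_v2, descriptor_v2 = descriptor_address(
    [content_address(fuel_v2), content_address(airframe)]
)

print("fuel manual changed address:", content_address(fuel_v1) != content_address(fuel_v2))
print("document set changed address:", set_v1 != set_v2)
```

In this sketch the unchanged airframe manual keeps the same address in both descriptors, while the revised fuel manual, and hence the set as a whole, does not. Any document that embeds the address of the first set continues to point at the older version unless it is itself revised, which in turn changes its own address.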
For example, consider the complete technical documentation for a Boeing 747 aircraft. There will be sets of documents each describing a particular subsystem of the aircraft, such as the fuel subsystem, the communication subsystem, the airframe subsystem, etc. These documents will necessarily need to reference one another, and they will invariably change over time. To further complicate matters, there is no single set of documentation that completely describes all 747s in use. While there may be a master set of documentation that generically describes a 747 aircraft, each individual aircraft that rolls off the assembly line with a unique serial number will have its own specific set of documentation, because it has different options and might be destined for a different airline. Thus, different versions of the original documentation exist not only because the documentation set for a particular aircraft changes over time, but also because different aircraft having different options will need different versions of the original documentation. All of this technical documentation for a 747 aircraft will then evolve over time as parts change, as procedures change, and as the hundreds of FAA directives are received and complied with.
To illustrate the nature of the problem, it is believed that once an aircraft has been manufactured and is ready for flight, it can take weeks or even months to assemble all the technical documentation and to ensure that the documentation has been updated and all replacement pages have been inserted in the correct locations before the aircraft will be certified for flight. Even when all the technical documentation has been updated and the latest version is available for use, it can be extremely useful in the future to be able to go back and review the version of the documentation that existed at a particular point in time. In the real world, many other examples exist where a complex set of documentation having internal references and versions needs to be stored efficiently, managed, and accessed intelligently in a way that ensures the integrity of the information being retrieved. As such, mechanisms and techniques are needed to manage such complex content reliably without relying on an end user or complex software applications to do so. It would be particularly desirable to make use of the prior art content-addressable storage techniques to address such a problem.