The general problem addressed by this invention is the low productivity of human knowledge workers who use labor-intensive manual processes to work with collections of computer files. One promising solution strategy for this software productivity problem is to build automated systems to replace manual human effort.
Unfortunately, replacing arbitrary manual processes performed on arbitrary computer files with automated systems is difficult. Many challenging subproblems must be solved before competent automated systems can be constructed. As a consequence, the general software productivity problem has not yet been solved, despite large industry investments of time and money over several decades.
The present invention provides one piece of the overall functionality required to implement automated systems for processing collections of computer files. In particular, the present invention has a practical application in the technological arts because it provides a convenient, scalable, and fully automated software means for associating three kinds of information important to automated collection processing systems: collection instance specifier information, collection type definition information, and collection content information.
The Collection Information Management problem is one of the most important and fundamental problems that must be solved in order to enable the construction of automated collection processing systems. It is the problem of how to model and manage information about collection instances, collection content files, and collection data types that describe the shared characteristics of collection instances.
Some interesting aspects of the Collection Information Management problem include the following: large numbers of collections can exist; collections can have arbitrary per-instance specifier data; collections can contain many arbitrary computer files as content; collections can require that arbitrary processes be run on the collection content; collections can share sets of structural and processing characteristics; many software programs can require access to information about collections; collection representations must accommodate variations in computing platforms, administrative policies, and software processing tools; and collections must be resistant to scale-up failure.
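By way of illustration only, the association of the three kinds of information named above (collection instance specifier information, collection type definition information, and collection content information) might be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the class names, field names, and the `associate` function are all hypothetical and do not appear in this description.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CollectionType:
    """Hypothetical type definition: characteristics shared by many instances."""
    name: str
    content_patterns: List[str]  # assumed: file name patterns expected as content
    processes: List[str]         # assumed: names of processes to run on the content


@dataclass
class CollectionSpecifier:
    """Hypothetical per-instance specifier: names one instance and its type."""
    collection_name: str
    type_name: str
    attributes: Dict[str, str] = field(default_factory=dict)  # arbitrary per-instance data


@dataclass
class Collection:
    """One collection instance with its three kinds of information associated."""
    specifier: CollectionSpecifier
    type_definition: CollectionType
    content_files: List[str]


def associate(specifier: CollectionSpecifier,
              type_registry: Dict[str, CollectionType],
              content_files: List[str]) -> Collection:
    """Link a specifier to its shared type definition and its content files."""
    try:
        type_definition = type_registry[specifier.type_name]
    except KeyError:
        raise LookupError(f"unknown collection type: {specifier.type_name!r}") from None
    # Copy the list so later changes by the caller do not alter the collection.
    return Collection(specifier, type_definition, list(content_files))


# Usage: two instances share one type definition but differ in specifier and content.
registry = {"program": CollectionType("program", ["*.c"], ["compile", "link"])}
hello = associate(CollectionSpecifier("hello", "program"), registry, ["hello.c"])
tools = associate(CollectionSpecifier("tools", "program", {"owner": "ops"}),
                  registry, ["a.c", "b.c"])
```

In this sketch the type definitions are held once in a shared registry and referenced by name from each specifier, so characteristics common to many collection instances are stored in one place rather than repeated per instance.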