The general problem addressed by this invention is the low productivity of human knowledge workers who use labor-intensive manual processes to work with collections of computer files. One promising solution strategy for this software productivity problem is to build automated systems to replace manual human effort.
Unfortunately, replacing arbitrary manual processes performed on arbitrary computer files with automated systems is difficult. Many challenging subproblems must be solved before competent automated systems can be constructed. As a consequence, the general software productivity problem has not yet been solved, despite large industry investments of time and money over several decades.
The present invention provides one piece of the overall functionality required to implement automated systems for processing collections of computer files. In particular, the present invention has a practical application in the technological arts because it provides both humans and automated systems with a convenient, precise, scalable, and fully automated means for applying computer commands to collections of computer files.
Problems to be Solved
The Collection Command Applicator Problem is one important problem that must be solved to enable the construction of automated collection processing systems. It is the problem of how to efficiently apply computer commands to large numbers of selected collections, in accordance with processing interdependencies that may exist among the collections.
Interesting characteristics of the collection command applicator problem include at least these: an arbitrary number of arbitrary collections in arbitrary filesystem locations may be involved; collections can have arbitrary per-instance data, size, content, data type, and internal structure; only a few interesting collections might require selection from a large pool of collections; collection recognition criteria may be based on complex combinations of collection type, collection per-instance data, collection content or external filesystem attributes; arbitrary processing commands can be applied; selected collections must be processed in proper dependency order; and parallel command execution may be required for performance reasons.
Solving the collection command applicator problem is useful because a good solution would deliver a clear N-fold productivity increase for the collection command application problem domain. Specifically, a good solution would enable human workers to issue one computer command to process N collections. In contrast, at least N low-level commands, one per collection, are theoretically required, and in current practice more than 2N commands are often required. Typically, current practices also generate additional costs for various ad hoc scripts that are manually constructed to manage current command application processes.
The Collection Visit Order Problem is another important problem to solve. It is the problem of how to determine and enforce a valid execution visit ordering when applying commands to collections that have processing interdependencies among themselves.
Some interesting aspects of the collection visit order problem include: arbitrary numbers of arbitrary collections may be involved in an execution visit ordering calculation; numeric visit order rankings are awkward to work with when large numbers of collections are involved; visit order rankings can change frequently; visit order default rankings must sometimes be overridden for particular collection instances; and visit orders can change depending upon the specific commands that are being applied.
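As an illustration only, the visit order problem can be viewed as a topological ordering over collection interdependencies. The following minimal sketch assumes that dependencies are supplied as a mapping from each collection name to the set of collections that must be processed before it; the collection names and the function visit_order are hypothetical and are not part of the disclosed invention. Note that the sketch derives an ordering from declared dependencies rather than from numeric rankings.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def visit_order(dependencies):
    """Return collection names so that each one follows its prerequisites.

    `dependencies` maps a collection name to the set of collection names
    that must be visited before it (an assumed input format).
    """
    sorter = TopologicalSorter(dependencies)
    # static_order() raises graphlib.CycleError on circular dependencies.
    return list(sorter.static_order())


# Hypothetical collections: an application that needs two libraries.
deps = {
    "app": {"libgui", "libcore"},
    "libgui": {"libcore"},
    "libcore": set(),
}
order = visit_order(deps)
print(order)
```

Because the ordering is computed from the dependency mapping, a change in interdependencies requires editing only the affected entries, not renumbering every collection.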
The Parallel Collection Command Execution Problem is another important problem to solve. It is the problem of how to optimally harness available parallel processing power during command application, while still maintaining proper execution visit order among collections.
Some interesting aspects of the parallel collection command execution problem include these: there is an inherent limit to the amount of parallelism that can be achieved within each set of collections to be processed; there is a physical limit to the amount of parallel processing power available in each computational environment; and there is a policy limit to the amount of parallelism that can be used by command applicators in each administrative environment. Ideally, the inherent parallelism limit should be less than the physical parallelism limit, and the physical parallelism limit should be less than the administrative parallelism limit.
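The interaction of the three limits can be sketched as follows, purely as an illustration. The sketch assumes that the usable degree of parallelism is the smallest of the inherent, physical, and policy limits, and that collections are grouped into batches whose members have no unprocessed prerequisites. The function names, the dependency mapping format, and the collection names are hypothetical. Batching is a simplification: a real applicator might start a collection as soon as its own prerequisites finish rather than waiting for a whole batch.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def effective_parallelism(inherent, physical, policy):
    """Usable parallelism is bounded by the smallest of the three limits."""
    return max(1, min(inherent, physical, policy))


def parallel_batches(dependencies, max_workers):
    """Group collections into batches that respect visit order.

    Every collection in a batch has all of its prerequisites in earlier
    batches, and no batch holds more than `max_workers` collections.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all currently unblocked nodes
        for start in range(0, len(ready), max_workers):
            batches.append(ready[start:start + max_workers])
        sorter.done(*ready)  # unblock the dependents of this group
    return batches


# Hypothetical collections: two libraries can be processed side by side.
deps = {
    "app": {"libgui", "libnet"},
    "libgui": {"libcore"},
    "libnet": {"libcore"},
    "libcore": set(),
}
workers = effective_parallelism(inherent=2, physical=8, policy=4)
batches = parallel_batches(deps, workers)
print(workers, batches)
```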
The Nearby Execution Directory Problem is another important problem to solve. It is the problem of how to execute commands in particular nearby execution directories that are located around collections, both inside and outside of collections.
Some interesting aspects of the nearby execution directory problem include: some commands must be executed inside collections; some outside collections; some in specific parent or child directories; some in all immediate child directories; some in all peer directories; and some must even be executed in all instances of a particular directory within a subtree, without the benefit of using collections as starting anchors or reference points for directory calculations.
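The directory relationships listed above can be illustrated with a short sketch. It assumes that a collection is identified by its root directory, and it uses hypothetical relation names ("inside", "parent", "children", "peers") together with a hypothetical helper for locating every instance of a named directory within a subtree. None of these names come from the disclosure.

```python
import tempfile
from pathlib import Path


def nearby_directories(collection_root, relation):
    """Return execution directories located around one collection root."""
    root = Path(collection_root)
    if relation == "inside":
        return [root]
    if relation == "parent":
        return [root.parent]
    if relation == "children":
        return sorted(p for p in root.iterdir() if p.is_dir())
    if relation == "peers":
        return sorted(
            p for p in root.parent.iterdir() if p.is_dir() and p != root
        )
    raise ValueError(f"unknown relation: {relation}")


def directories_named(subtree_root, name):
    """Return every directory called `name` below `subtree_root`.

    No collection is needed as an anchor for this calculation.
    """
    return sorted(p for p in Path(subtree_root).rglob(name) if p.is_dir())


# Build a small throwaway tree with two hypothetical collections.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    for sub in ("c1/src", "c1/doc", "c2/src"):
        (base / sub).mkdir(parents=True)
    children = [p.name for p in nearby_directories(base / "c1", "children")]
    peers = [p.name for p in nearby_directories(base / "c1", "peers")]
    all_src = [
        p.relative_to(base).as_posix() for p in directories_named(base, "src")
    ]

print(children, peers, all_src)
```

A command applicator could then change into each returned directory before running the command; that step is omitted here to keep the sketch free of side effects.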
General Shortcomings of the Prior Art
A professional prior art search for the present invention was performed, but produced no meaningful, relevant works of prior art. Therefore the following discussion is general in nature, and highlights the significant conceptual differences between file-oriented mechanisms in the prior art and the novel collection-oriented mechanisms represented by the present invention.
Prior art approaches lack support for collections. This is the largest limitation of all because it prevents the use of high-level collection abstractions that can significantly improve productivity.
Prior art approaches lack collection recognition means that use collection content, collection data type, and collection per-instance data in collection recognition activities.
Prior art approaches lack execution visit ordering means to control the order in which commands are applied to particular collections within a set of collections, thereby ensuring the orderly processing of interdependencies among processed collections.
Prior art approaches lack parallel execution means for optimally processing collections in parallel, especially when execution visit ordering must be maintained within a parallel execution environment.
Prior art approaches lack indirect command execution means such as script files, thereby preventing the creation and use of persistent, reusable visit orderings and parallel execution orderings for processing collections.
As can be seen from the above description, prior art mechanisms in general have several important disadvantages. Notably, general prior art mechanisms do not support collections, and do not support visit ordering. These are the two most important limitations of all.
In contrast, the present invention has none of these limitations, as the following disclosure will show.