1. Field of the Invention
This invention relates in general to computer operating systems, and in particular, to a method for storing and retrieving data in a computer system.
2. Description of Related Art
One problem that has persisted throughout the history of digital computing is the inability to directly identify a data entity, move it, and still maintain all references to it.
For example, consider the state-of-the-art methods of information storage. One state-of-the-art method is a containment technique where an entity (directory) contains a list of subordinate entities (directories or files). Although the containment method is highly effective in ordering information, it does not support the general ability to access information after the information has been moved. Moving data within a hierarchy inevitably leads to the invalidation of existing links to the information moved. There are three prior art solutions to this problem.
The first solution is the concept of a current or working directory. A current directory is a state where a program identifies a particular location within secondary storage and all operations assume information is located in this directory or is subordinate to it.
The second solution is an absolute path. An absolute path is where a program identifies information by a stream of hierarchically dependent entities which are concatenated together to form an absolute data specification. Thus, if "C" is within "B" and "B" is within "A", then an absolute path would be "\A\B\C".
The third solution is the concept of search path. A search path is a list of predefined locations where programs and sometimes program data are stored. If a body of requested data is not within the search path or the current directory, then it is assumed not to exist. Note many systems use combinations of these methodologies.
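The lookup order described above for the current directory and the search path can be sketched as follows. This is only an illustration of the prior art behavior, not code from any particular operating system; the directory names, file names and the `locate` function are hypothetical.

```python
# Hypothetical secondary storage: each directory name maps to the set of
# file names it contains.
storage = {
    "WORK": {"NOTES.TXT"},
    "BIN": {"EDIT.EXE"},
    "TOOLS": {"SORT.EXE"},
    "ARCHIVE": {"OLD.DAT"},
}

def locate(name, current_dir, search_path):
    """Return the directory holding `name`, checking the current directory
    first and then each search-path directory in order. Return None when the
    file is in neither, i.e. it is assumed not to exist."""
    for directory in [current_dir] + list(search_path):
        if name in storage.get(directory, set()):
            return directory
    return None

found_in_current = locate("NOTES.TXT", "WORK", ["BIN", "TOOLS"])
found_in_path = locate("SORT.EXE", "WORK", ["BIN", "TOOLS"])
not_found = locate("OLD.DAT", "WORK", ["BIN", "TOOLS"])
```

In this sketch OLD.DAT exists in storage, but because ARCHIVE is neither the current directory nor on the search path, the lookup reports it as missing.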
However, there remains the problem, common to all the state-of-the-art methodologies defined above, of maintaining information links to data which has moved. Consider a program which accesses data with an absolute path \A\B\C. If the data located in C is moved to a different location, then the program will not be able to locate the data it needs. Further, consider a program which uses the concept of current directory, where the data file is moved out of the current directory. Finally, consider the program which uses data located in a search path, where the data is moved to a directory not in the search path. In all of these instances the data is effectively lost to the program. The same reasoning applies to data within a database. Consider the examples above but replace directories with files and files with sets of records.
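The absolute path case can be illustrated with a minimal sketch. It assumes a containment hierarchy modeled as nested dictionaries, where each directory physically holds its children; the names `root`, `resolve` and `stored_link` are hypothetical and chosen only for this example.

```python
# Hypothetical containment hierarchy: C is inside B, B is inside A,
# and X is an unrelated top-level directory.
root = {"A": {"B": {"C": "payload"}}, "X": {}}

def resolve(tree, path):
    """Walk an absolute path given as a list of names.
    Return the entity found, or None if any step is missing."""
    node = tree
    for name in path:
        if not isinstance(node, dict) or name not in node:
            return None
        node = node[name]
    return node

# The link is recorded once, in a static image such as a program.
stored_link = ["A", "B", "C"]
before_move = resolve(root, stored_link)

# Move C out of B and into X. The stored link is not updated.
root["X"]["C"] = root["A"]["B"].pop("C")

after_move = resolve(root, stored_link)      # the old link no longer resolves
at_new_location = resolve(root, ["X", "C"])  # the data itself still exists
```

The data survives the move, but the recorded path no longer leads to it, which is the sense in which the data is effectively lost to the program.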
Thus, the problem is how to retain direct information links when data is moved. In almost all cases, the information link is recorded in a static image, such as a program or in a database.
The solution to the problem of directly linking information while retaining the ability to move it depends on altering some of the basic perceptions regarding information structure. The state-of-the-art approach is to use volumes, directories and files to identify each data entity, where volumes and directories are methods of containment. Specifically, a volume physically contains directory identifiers and directories physically contain file identifiers. Thus, all files and directories identified within a volume must physically exist within that volume.
The basic premise of the containment method of information structure storage is that a hierarchy is based on more significant entities physically containing less significant entities. Although this premise works, it is not efficient in that it prohibits several of the characteristics identified below. The following changes in perception are required to understand the present invention.
First, a data object may logically belong to a larger data object, but that does not mean it has to be physically contained within it. For example, a directory can contain many file identifiers, but that does not mean the file identifiers must be physically contained in the directory. In the present invention, each file identifier can identify the directory it belongs to and a more effective information structure storage method can be used.
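The distinction between logical membership and physical containment can be sketched as follows. This is an illustration under stated assumptions, not the storage layout of the present invention: the `Entry` record, the integer identifiers and the flat `table` are all hypothetical. Each entry records the directory it belongs to, so moving an entry changes one field and leaves its identifier unchanged.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Entry:
    ident: int              # hypothetical stable identifier for the entity
    name: str
    parent: Optional[int]   # identifier of the owning directory, None at the top

# A flat table: no entry is physically stored inside its parent.
table = {
    1: Entry(1, "A", None),
    2: Entry(2, "B", 1),
    3: Entry(3, "C", 2),
    4: Entry(4, "X", None),
}

def children(parent_id):
    """List the names of the entries that logically belong to a directory."""
    return sorted(e.name for e in table.values() if e.parent == parent_id)

def logical_path(ident):
    """Rebuild the hierarchical name by following parent references upward."""
    parts = []
    current = ident
    while current is not None:
        entry = table[current]
        parts.append(entry.name)
        current = entry.parent
    return "\\" + "\\".join(reversed(parts))

path_before = logical_path(3)

# Move C from B to X by changing only its parent reference.
table[3].parent = 4

path_after = logical_path(3)
```

A program that recorded the identifier 3 still reaches the same entry after the move, even though the entry's hierarchical name has changed.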
Second, all information structure is hierarchical to some degree. When information ceases to have a hierarchical structure (for example, when it is relational), the existing containment method (volume, directory, file) of storing hierarchical structure does not work unless additional intermediate processes are used. The present invention can support direct links to information at a fundamental level and can therefore directly support non-hierarchical information structures such as relational or object-oriented structures, where the containment method cannot.
Third, the amount of data structure required by any process may vary in size, depth and width. Therefore, any mechanism which cannot handle extremes in size, depth and width is ineffective. For example, consider a directory with 10,000 file identifiers and the problems associated with its use and maintenance when in memory. The present invention experiences similar space restrictions, but does not experience as many or as severe problems as containment methodology with regard to use or maintenance.
Fourth, the amount of space required to uniquely identify a data entity deep within a hierarchy can be excessive using standard containment methodology. The containment method of uniquely identifying a data entity becomes progressively less efficient as depth within a hierarchy increases. For example, the following are two strings to identify a body of data deep in a hierarchical structure:
1--"C:\ACC_PROG\YEAR1992\ACCOUNTS\ONTARIO\TORONTO\CASH.DB"
2--"C:\MY_DISK12\VOLUME_A\UTILS_AC\ACC_PROG\YEAR1992\ACCOUNTS\ONTARIO\TORONTO\CASH.DB"
The first data identifier is over 50 bytes long and could easily be much longer as depicted in the second data identifier, which is over 80 bytes. In the present invention, a direct reference is a maximum of 20 bytes long, regardless of how deep the reference is within a hierarchy.
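The size comparison can be checked with a short sketch. It assumes the two example identifiers use backslash separators and underscores, one byte per character. The text does not specify the format of the 20-byte direct reference, so the three-field layout packed below is purely hypothetical and serves only to show a reference whose size does not depend on depth.

```python
import struct

shallow = r"C:\ACC_PROG\YEAR1992\ACCOUNTS\ONTARIO\TORONTO\CASH.DB"
deep = (r"C:\MY_DISK12\VOLUME_A\UTILS_AC\ACC_PROG\YEAR1992"
        r"\ACCOUNTS\ONTARIO\TORONTO\CASH.DB")

# Path identifiers grow with every additional level of the hierarchy.
shallow_len = len(shallow.encode("ascii"))
deep_len = len(deep.encode("ascii"))

# Hypothetical fixed-size reference: a 4-byte field and two 8-byte fields,
# little-endian with no padding, for 20 bytes in total. The field meanings
# are invented for this illustration.
REF_LAYOUT = "<IQQ"
reference = struct.pack(REF_LAYOUT, 1, 42, 7)
ref_len = len(reference)
```

Under these assumptions the two paths occupy 53 and 81 bytes, while the packed reference occupies 20 bytes however deep the referenced entity lies.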
Fifth, it is safe to assume that information moves. The containment method of storing information structure requires the data identifier to move when the data moves. This produces two problems: the overhead of moving the identifier and, more importantly, the loss of links to the data moved. The present invention ensures that data links are retained, regardless of where the data is moved to.
Finally, information accessed via direct links is retrieved faster than information accessed via indirect links. The containment method of data-structure-storage uses names to identify a data entity. This means that the location of a data identifier is established by a search, which is seldom a binary search and never an aggregate indexed reference. As a result, locating a body of data using the containment methodology is slow and cumbersome. The data-structure-storage method of the present invention is efficient because data location is either a binary search or an aggregate indexed reference.
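The three kinds of lookup contrasted above can be sketched as follows. The example is an interpretation, not the mechanism of the present invention: "aggregate indexed reference" is read here as direct indexing into an array, and all names, identifiers and records are hypothetical.

```python
from bisect import bisect_left

# Name-based search in an unordered directory listing: every candidate
# may have to be compared before the entry is found.
names = ["LEDGER.DB", "TAX.DB", "CASH.DB", "PAYROLL.DB"]

def linear_find(entries, name):
    """Return (position, comparisons made), or (-1, comparisons) if absent."""
    comparisons = 0
    for position, candidate in enumerate(entries):
        comparisons += 1
        if candidate == name:
            return position, comparisons
    return -1, comparisons

# Binary search over identifiers kept in sorted order.
sorted_ids = [103, 217, 340, 512, 768, 901]

def binary_find(ids, target):
    """Return the position of target in the sorted list, or -1 if absent."""
    position = bisect_left(ids, target)
    if position < len(ids) and ids[position] == target:
        return position
    return -1

# Indexed reference: the reference itself is the position, so no search runs.
records = ["record-0", "record-1", "record-2"]

def indexed_find(slots, index):
    return slots[index]

linear_result = linear_find(names, "CASH.DB")
binary_hit = binary_find(sorted_ids, 512)
binary_miss = binary_find(sorted_ids, 500)
indexed_result = indexed_find(records, 2)
```

The linear scan needs a number of comparisons proportional to the directory size, the binary search needs a number proportional to its logarithm, and the indexed reference needs none.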