The mechanism used for storing and retrieving data depends upon the data model. The data model is a representation of the data in terms of fields, and convenient groupings of fields into larger structures such as tables. Also, a data model may have logical links between groups of data, such as are provided by classes in a hierarchical data structure. Ordinarily, the mechanism used to store data is tailored to the data model chosen to represent the data.
For example, in a spread sheet the data model has: a first table of fields, where each field represents a cell of the spread sheet; a second table of headings for the columns; a third table of headings for the rows; and at least one, and possibly several, tables holding the formulas used to calculate the contents of cells designated to hold computed values in the spread sheet.
As a second example, scientific data received in a computer from an instrument measuring physical quantities is represented by a data model having at least the following fields: physical quantity being measured; type of instrument; manufacturer; serial number; date of measurement; time of measurement; sample identification; and many fields for holding the data. The fields holding the data obtained by measuring physical quantities may be one-dimensional if only a single number is recorded, two-dimensional if two numbers are recorded, such as an independent variable and a dependent variable, or multidimensional if several independent and dependent physical quantities are recorded.
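As an illustration only, the instrument data model described above can be sketched as a small record type. The class name `MeasurementRecord`, the field names, and the choice to hold each data point as a tuple of numbers are assumptions made for this sketch, not part of any particular instrument's format.

```python
from dataclasses import dataclass, field
from datetime import date, time
from typing import List, Tuple


@dataclass
class MeasurementRecord:
    """Hypothetical record mirroring the fields listed in the text."""
    physical_quantity: str
    instrument_type: str
    manufacturer: str
    serial_number: str
    measurement_date: date
    measurement_time: time
    sample_id: str
    # Each point holds one number (one-dimensional), two numbers
    # (independent and dependent variable), or more (multidimensional).
    points: List[Tuple[float, ...]] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Assumption for this sketch: every point in one record has the same width.
        widths = {len(p) for p in self.points}
        if len(widths) > 1:
            raise ValueError("all data points must record the same number of quantities")

    def dimensionality(self) -> int:
        """Number of quantities recorded per point; 0 if no data is held."""
        return len(self.points[0]) if self.points else 0
```

A record holding pairs of an independent and a dependent variable would report a dimensionality of 2, while a record holding single readings would report 1.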
As a third example, financial data is collected by a corporation from many sources into modules, where each module may use a different data model.
Additionally, personnel data is collected by corporations into different modules, and each of these modules may be based upon a different data model.
In addition to simply storing data, it is often convenient to interchange data from one module to another module. For example, in the scientific data area, it is convenient to obtain measurements from a scientific instrument and to write the measurements into a spread sheet for mathematical analysis of the measurement data. A further desirable interchange is to move the calculated results from the spread sheet into another software tool, such as a word processor for preparation of written reports, or to move the results into an alarm software tool for alerting an operator that physical measurements have exceeded desired bounds. The alarm software tool may accept data directly from the instrument, and may also receive computed results from the spread sheet.
Further, many forms of commonly gathered data are collected from a variety of sources and stored by software modules into computer media. Examples of data gathered and stored by software modules are financial and personnel data of corporations, census data, demographic statistics, marketing data showing customer preferences, stock market price and volume data, and so forth. The computer media used for storage of the data include disk units on a local computer, servers on a computer network, disk farms on a computer cluster, magnetic disks reached over a computer communications network, optical storage units reached over a computer communications network, and other diverse computer storage media technologies. In all of these examples of data collected through the use of computer modules, the problem of exchanging data collected by different modules is particularly important.
The need to exchange data collected through different software modules is, by way of example, particularly acute for corporate personnel and financial data. A corporation having many branches, subsidiaries, etc., may have a different computer module at each location collecting the same type of data. Assembling the corporate-wide data requires exchanging the output of the diverse computer modules. Accordingly, a continuing problem in modern corporations is to provide a simple and efficient means for exchanging data between different software modules.
Additionally, data is often hierarchical by nature, and it is therefore desirable to provide a hierarchical mechanism for storage and interchange, where the mechanism reflects the hierarchical nature of the data.
A longstanding difficulty in storing and interchanging data between software modules is the incompatibility of the data models used to represent the data in the various modules. In the past, it has been necessary to design an interchange tool for each desired pair of modules, so that the number of interchange tools increases approximately as the square of the number of modules.
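The growth in the number of pairwise interchange tools can be made concrete with a short calculation. The sketch below assumes that every pair of modules needs its own tool, and the function name `pairwise_tools` and the `directed` flag are hypothetical. With `directed=True` a separate tool is counted for each direction of transfer, giving N(N-1) tools for N modules; with `directed=False` one tool serves both directions, giving N(N-1)/2.

```python
def pairwise_tools(n_modules: int, directed: bool = True) -> int:
    """Count the interchange tools needed when every pair of modules gets its own tool."""
    if n_modules < 0:
        raise ValueError("n_modules must be non-negative")
    unordered_pairs = n_modules * (n_modules - 1) // 2
    # One tool per direction doubles the count of unordered pairs.
    return 2 * unordered_pairs if directed else unordered_pairs


if __name__ == "__main__":
    for n in (2, 5, 10, 20):
        print(n, pairwise_tools(n, directed=False), pairwise_tools(n))
```

Under either assumption the count is dominated by the N-squared term, so doubling the number of modules roughly quadruples the number of tools.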
The difficulty in the scientific instrument area arises, in part, from the large number of types of instruments, the variety of manufacturers for each type of instrument, and the large number of data analysis modules available in the marketplace. For example, one instrument and one spread sheet require one interchange tool, and two instruments and one spread sheet probably require two interchange tools. But where, as in the above example, the output of each instrument is directed to two receiving modules, such as a spread sheet and an alarm module, two instruments require four interchange tools.
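The counts in this example follow from pairing every instrument with every receiving module. The sketch below enumerates those pairs, assuming one interchange tool per (instrument, receiver) pair; the function name `required_tools` and the module names are placeholders chosen for illustration.

```python
from itertools import product
from typing import List, Sequence, Tuple


def required_tools(sources: Sequence[str], receivers: Sequence[str]) -> List[Tuple[str, str]]:
    """List one (source, receiver) pair for each interchange tool that would be needed."""
    return list(product(sources, receivers))


if __name__ == "__main__":
    # Placeholder names; any instruments and receiving modules could be substituted.
    instruments = ["gas chromatograph", "mass spectrometer"]
    receivers = ["spread sheet", "alarm module"]
    for source, receiver in required_tools(instruments, receivers):
        print(f"{source} -> {receiver}")
```

The number of tools is the product of the two list lengths, which reproduces the one, two, and four tools of the example.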
The data model of each type of scientific instrument will depend upon the physical parameter being measured. Instruments for which it is desirable to provide a data storage and interchange format include a gas chromatograph, liquid chromatograph, mass spectrometer, nuclear magnetic resonance spectrometer, etc. Also, the data model used for a receiving module will depend upon the analysis class, such as spread sheet, alarm module, statistical analysis program, or other specialized analysis module.
Each data model will require a plurality of interchange tools to couple the data to the other data models, and the number of required interchange tools rapidly becomes too great to manage, especially with the large number of scientific instruments and the large number of analysis tools available to the scientist.
The difficulty in the corporate data area, such as financial data and personnel data, arises, in part, from the diverse sources of such data in a modern corporation. Further, for smooth functioning of management information systems, it is necessary to collect the data arising from diverse and incompatible modules into a common database so that the data can be turned into information.