Database management systems (DBMS) have many important components, such as a data model, a data definition language, a data manipulation language, a query language, data access methods, a query optimizer, a concurrency and locking mechanism, etc. All of these components contribute to the desired properties of a database management system. Most of the fundamental concepts of database management systems are described in any standard textbook on DBMS; see Ghosh S., Data Base Organization for Data Management, Published by Academic Press, New York (1977); Date C. J., An Introduction to Database Systems, Published by Addison-Wesley, Reading, Mass. (1977); Wiederhold G., Database Design, Published by McGraw-Hill, New York (1977).
Descriptions of many DBMS products are available in the manuals provided by the various commercial vendors. DBMS products based on a relational model have been developed, and an excellent summary of some of these products has been given by Kim W., "Relational Database Systems," Computing Surveys, Published by ACM, Vol. 11, No. 3, pp. 185-211 (1979).
Among the components of a DBMS, the access methods and structures for implementing the methods are responsible for organizing the bits and bytes of the data on the storage media and servicing I/O requests from a host processor.
Many excellent access methods, such as ISAM (System 360 Operating System, Index Sequential Access Methods (Programming Logic), IBM Form Y28-6618 (1975a, out of print)), based on index sequential search, and VSAM (OS/VS Virtual Storage Access Method (VSAM) Programmer's Guide, IBM Form GC26-3838 (1975b)), based on B-trees (Bayer R. and McCreight E., "Organization and Maintenance of Large Ordered Indexes", Acta Inf., Vol. 1, No. 3, pp. 173-189 (1972)), were invented to expedite the searching of information contained in files stored in a computer. These access methods have significantly reduced the search time involved in retrieving information from a database.
All of these prior art access methods have been designed to expedite logical processing of information, e.g., find a record with key equal to xxx, find all records which satisfy the predicate P(A), or update the records with attribute A having the value xxxx. There are many other search techniques (Knuth D., The Art of Computer Programming, Vols. 1, 2, 3, Published by Addison-Wesley Publ. Co., Reading, Mass. (1968)) besides those which have been implemented in commercial access methods, but all of them have been designed to make the logical processing of information efficient. If the mean value of an attribute is to be calculated from records organized by any of the existing access methods, all the records in the file have to be processed, which is very time consuming.
In general, statistical processing of information, such as the computation of the mean, is slow in systems designed for logical processing.
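The contrast between the two kinds of request can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (key, salary), the file size of 1000 records, and the function names are hypothetical, and a plain binary search over sorted keys stands in for an index sequential or B-tree access method. The sketch counts how many records each request has to touch.

```python
# Hypothetical file of (key, salary) records, kept sorted by key the way
# an ordered index would present them. All names here are illustrative.
records = [(key, 1000 + 10 * key) for key in range(1, 1001)]
keys = [key for key, _ in records]


def find_by_key(target):
    """Logical request: locate one record by binary search on the keys."""
    probes = 0
    lo, hi = 0, len(keys)
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1  # one key comparison per halving step
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = records[lo] if lo < len(keys) and keys[lo] == target else None
    return found, probes


def mean_salary():
    """Statistical request: the mean needs every record in the file."""
    total = 0
    reads = 0
    for _, salary in records:
        total += salary
        reads += 1
    return total / reads, reads


record, probes = find_by_key(500)
mean, reads = mean_salary()
print(f"key lookup touched {probes} keys; mean touched {reads} records")
```

For a file of n records, the lookup examines on the order of log n keys, while the mean examines all n. The ordering that speeds up the lookup does nothing to shorten the scan.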
There are various types of statistical processing of information (Kendall M. G. & Stuart A., The Advanced Theory of Statistics, Vol. 1, Published by Charles Griffin & Company, London (1958); and Kendall M. G. & Stuart A., The Advanced Theory of Statistics, Vol. 2, Published by Hafner Publishing Co., New York (1961)). Most of them involve computing some numerical function based on the values of many individuals, usually all the individuals (records) of the file. This makes statistical computation time consuming. Examples of statistical computation are: estimation of parameters, curve fitting, statistical summarization (calculation of frequency distributions, moments, tabular representation, etc.), statistical testing of hypotheses, sampling, statistical design of experiments, statistical measures of association, statistical prediction, etc. It should be noted that the final results of statistical processing of information are numbers having an accuracy or precision associated with them.
A fundamental element associated with statistical processing of information is the time needed for processing. One of the major goals of computer science is to minimize processing time. Usually this is achieved by trading time against storage space, i.e., by accepting a larger storage requirement. In statistical computation, precision is another fundamental element that can be traded to reduce time.
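One familiar way to see precision traded for time is simple random sampling: an estimate of the mean is computed from a small sample, and its standard error states the precision that was given up. The Python sketch below is an illustration only and is not presented as the technique this document goes on to describe. The population of 100,000 uniform values, the sample size of 400, and the function names are all assumptions made for the example.

```python
import random
import statistics


def exact_mean(values):
    """Read every value: exact answer, cost proportional to the file size."""
    return sum(values) / len(values), len(values)


def sampled_mean(values, sample_size, rng):
    """Read only a random sample: approximate answer with a stated precision."""
    sample = rng.sample(values, sample_size)
    estimate = statistics.mean(sample)
    # Standard error of the sample mean (ignores the small
    # finite-population correction).
    std_error = statistics.stdev(sample) / sample_size ** 0.5
    return estimate, std_error, sample_size


rng = random.Random(0)  # fixed seed so that runs are repeatable
population = [rng.uniform(0.0, 100.0) for _ in range(100_000)]

true_mean, full_reads = exact_mean(population)
estimate, std_error, sample_reads = sampled_mean(population, 400, rng)

print(f"exact:   {true_mean:.2f} from {full_reads} reads")
print(f"sampled: {estimate:.2f} +/- {std_error:.2f} from {sample_reads} reads")
```

The sampled estimate reads 400 values instead of 100,000, at the cost of an uncertainty of a few units in the result. Quadrupling the sample size would roughly halve that uncertainty, which is the time-for-precision exchange in its simplest form.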