1. Field of the Invention
The present invention generally relates to a database building technique, and more specifically, to a technique for retrieving data in a versatile manner from a database constructed in such a way as to include an unnormalized data structure.
2. Description of the Related Art
In a relational database (hereinafter referred to as “RDB”), which is the dominant type of database today, data modeling is performed on the assumption that the data to be processed is normalized (namely, that data redundancy is eliminated).
Normalized data can be easily retrieved by using a data manipulation language (hereinafter referred to as “DML”) such as SQL. Further, many general-purpose retrieval tools have been put to practical use. However, in actual RDBs, it is difficult to achieve complete data normalization. Moreover, actual RDBs contain much unnormalized or denormalized data. The conditions for the normal forms according to relational theory in RDB are as follows.
(A1) Each element of a relation is atomic and is independent of the other elements (first normal form condition).
(A2) The value of any attribute other than the keys of a relation is uniquely determined only when the values of all the keys are given; that is, the attribute depends on the whole key and not on a part of it (second normal form condition).
(A3) When the value of one attribute X of a relation determines the value of another attribute Y, the attribute X should be a key for the attribute Y; that is, no non-key attribute is determined by another non-key attribute (third normal form condition).
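As a rough illustration of conditions (A2) and (A3), the following Python sketch tests whether one set of attributes determines another within a small sample relation. The relation `ORDER_LINES`, its attribute names, and the helper `holds` are hypothetical and do not come from the related art described here. Note also that a dependency observed in sample rows is only evidence of a dependency in the schema, not proof of it.

```python
def holds(rows, lhs, rhs):
    """Return True if, in these rows, the attributes lhs determine rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key; a later,
        # different value means lhs does not determine rhs.
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical relation whose key is (order_id, product_id).
ORDER_LINES = [
    {"order_id": 1, "product_id": "P1", "product_name": "bolt", "qty": 10},
    {"order_id": 1, "product_id": "P2", "product_name": "nut", "qty": 5},
    {"order_id": 2, "product_id": "P1", "product_name": "bolt", "qty": 7},
]

# product_name is determined by product_id alone, i.e. by only part of the
# key, so this relation does not satisfy condition (A2).
name_by_part_of_key = holds(ORDER_LINES, ["product_id"], ["product_name"])

# qty is not determined by product_id alone ...
qty_by_part_of_key = holds(ORDER_LINES, ["product_id"], ["qty"])

# ... but it is determined by the whole key, as (A2) requires.
qty_by_whole_key = holds(ORDER_LINES, ["order_id", "product_id"], ["qty"])
```

Under these assumptions, satisfying (A2) would mean moving `product_name` into a separate relation keyed by `product_id`, which is exactly the kind of table proliferation that normalization brings.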
However, if these conditions are rigorously applied to a very large database, the number of necessary tables increases. Moreover, the number of joins needed to combine tables during retrieval increases. This results in a severe reduction in retrieval speed. Thus, a database design usually permits the database to contain some unnormalized data. Examples of cases in which an RDB is permitted to contain unnormalized data are as follows.
(B1) A case where data is retrieved across storage areas that contain data on different departments and fields, each having its own data structure.
(B2) A case where data is hierarchically categorized into major, intermediate and minor classes so as to treat many kinds of data.
(B3) A case where frequently used data is partially accumulated in advance so as to increase retrieval efficiency, and the accumulated data is provided repeatedly.
(B4) A case where special processing is performed on a small number of pieces of exceptional data according to a branch number, a flag or an identifier.
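The trade-off that motivates such designs, namely that a normalized schema needs more tables and more joins for the same retrieval, can be sketched with Python's standard `sqlite3` module. All table and column names below are hypothetical, and the sketch only shows that the number of joins differs; it makes no claim about actual retrieval speed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized design (hypothetical): each fact is stored once.
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT,
                           dept_id INTEGER REFERENCES department(dept_id));
    CREATE TABLE sale (sale_id INTEGER PRIMARY KEY,
                       emp_id INTEGER REFERENCES employee(emp_id),
                       amount INTEGER);

    -- Unnormalized design: names are repeated in every sale row.
    CREATE TABLE sale_flat (sale_id INTEGER PRIMARY KEY, emp_name TEXT,
                            dept_name TEXT, amount INTEGER);

    INSERT INTO department VALUES (1, 'Sales'), (2, 'Support');
    INSERT INTO employee VALUES (1, 'Ann', 1), (2, 'Bob', 1), (3, 'Cy', 2);
    INSERT INTO sale VALUES (1, 1, 100), (2, 2, 50), (3, 3, 30);
    INSERT INTO sale_flat VALUES (1, 'Ann', 'Sales', 100),
                                 (2, 'Bob', 'Sales', 50),
                                 (3, 'Cy', 'Support', 30);
""")

# Total sales per department on the normalized tables: two joins.
normalized = conn.execute("""
    SELECT d.dept_name, SUM(s.amount)
    FROM sale AS s
    JOIN employee AS e ON e.emp_id = s.emp_id
    JOIN department AS d ON d.dept_id = e.dept_id
    GROUP BY d.dept_name
    ORDER BY d.dept_name
""").fetchall()

# The same retrieval on the unnormalized table: no join at all.
flat = conn.execute("""
    SELECT dept_name, SUM(amount)
    FROM sale_flat
    GROUP BY dept_name
    ORDER BY dept_name
""").fetchall()

conn.close()
```

Both queries return the same rows here. The flat table buys the simpler query at the cost of redundancy: a renamed department would have to be updated in every sale row.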
In particular, under present conditions, an unnormalized data structure that is hierarchically categorized is heavily used as a technique for easily imparting a fractal (non-integer) dimension to the data space without impairing the overall data structure, in contrast to a normalized data model, which handles only integer dimensions of a data space, such as two or three dimensions.
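A hierarchically categorized, unnormalized structure of this kind can be sketched as follows, again using Python's `sqlite3` module. The table `stock`, its columns, and the choice to store pre-accumulated totals in the same table as the detail rows are assumptions made purely for illustration; they combine cases (B2) and (B3) in one small table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One table holds major, intermediate and minor classes together.
    -- Rows above the minor level carry totals accumulated in advance.
    CREATE TABLE stock (code TEXT PRIMARY KEY, class_level TEXT,
                        parent TEXT, amount INTEGER);
    INSERT INTO stock VALUES
        ('A',   'major',        NULL, 60),
        ('A1',  'intermediate', 'A',  40),
        ('A2',  'intermediate', 'A',  20),
        ('A1a', 'minor',        'A1', 25),
        ('A1b', 'minor',        'A1', 15),
        ('A2a', 'minor',        'A2', 20);
""")

# A generic tool that simply sums the column counts every amount once per
# level of the hierarchy, because the redundancy is invisible to it.
naive_total = conn.execute("SELECT SUM(amount) FROM stock").fetchone()[0]

# A correct total requires knowing that only minor rows are raw data.
leaf_total = conn.execute(
    "SELECT SUM(amount) FROM stock WHERE class_level = 'minor'"
).fetchone()[0]

# The pre-accumulated value answers the frequent query with a single lookup.
stored_major = conn.execute(
    "SELECT amount FROM stock WHERE code = 'A'"
).fetchone()[0]

conn.close()
```

The pre-accumulated row makes the frequent lookup cheap, but any retrieval that is unaware of the level convention over-counts, which is why such data is awkward for general-purpose tools.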
Meanwhile, the logic on which a query statement for retrieving unnormalized data from the foregoing RDB is based is difficult to describe in the first-order predicate logic assumed by an ordinary DML. In practice, SQL, which is the most standard DML, provides some higher-order logic functions such as the sub-query and the HAVING clause. However, SQL has drawbacks in that these functions are weak and subject to many constraints, and in that the logic of queries written with them is hard to follow. Further, the actual frequency of use of these functions is not high.
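For reference, the two SQL functions named above can be seen together in the following sketch, which uses Python's `sqlite3` module and a hypothetical `sale` table. It selects the departments whose total sales exceed the average of the per-department totals, a condition on aggregates that a plain WHERE clause cannot express.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sale (dept TEXT, amount INTEGER);
    INSERT INTO sale VALUES
        ('Sales', 100), ('Sales', 50), ('Support', 30), ('R&D', 20);
""")

# HAVING filters groups after aggregation; the nested sub-query computes
# the average of the per-department totals that each group is compared to.
above_average = conn.execute("""
    SELECT dept, SUM(amount) AS total
    FROM sale
    GROUP BY dept
    HAVING SUM(amount) > (
        SELECT AVG(dept_total)
        FROM (SELECT SUM(amount) AS dept_total
              FROM sale
              GROUP BY dept) AS per_dept
    )
    ORDER BY dept
""").fetchall()

conn.close()
```

Even in this small case the aggregation has to be written twice, once for the groups and once inside the sub-query, which gives some sense of why such statements become hard to follow as the retrieval conditions grow.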
Therefore, when retrieving unnormalized data, it is difficult to utilize an existing general-purpose database retrieval tool that depends upon the language functions provided with a database. Under the present circumstances, application programs must be developed separately for individual databases. Alternatively, primitive techniques must be employed; for example, a user extracts the raw data to be processed and then processes it manually. Thus, there have been problems in that the retrieval of data requires a great deal of labor and cost and in that a long processing time is required to obtain a retrieval result.
Further, an object-oriented database (OODB), in which data and algorithms are encapsulated so as to be integral with one another, is sometimes used to enable local operations on data. However, even when an OODB is employed, processing efficiency decreases as the amount of data increases. Moreover, converting a data structure requires a great deal of time and effort. Thus, it is difficult to make an OODB practical as a very large database.
Furthermore, when a user directly designates complex and hard-to-understand unnormalized data stored in a database as objects to be retrieved and as retrieval conditions, it is desirable that such data be represented as data of a simple data structure such as a table image. Thus, in the field of OLAP (online analytical processing), an approach is employed by which the source data structure itself is normalized in a multidimensional space. Such an approach, however, has problems in that revising the existing data structure and converting the data require enormous work, and in that the entire structure must frequently be changed to accommodate exceptional data and the retrieval thereof. Thus, such an approach is not effective in all situations.