Information stored in databases is often incomplete. For certain database objects whose values are not presently known to the database, possible values may be stored as disjunctive information which represents two or more alternative values. For example, if the value of an age attribute were not known, the possible values of the attribute, e.g. <29, 30, 31>, could be stored. Disjunctive information often arises in design and planning databases, which are discussed in T. Imielinski, S. Naqvi, and K. Vadaparty, "Incomplete Objects-A Data Model For Design And Planning Applications," in SIGMOD-91, pages 288-297; and T. Imielinski, S. Naqvi, and K. Vadaparty, "Querying Design And Planning Databases," in LNCS 566: DOOD-91, pages 524-545, Springer-Verlag. Disjunctive information in a database may also arise due to conflicts that occur when different databases, containing conflicting information, are merged. Choosing one possibility for each instance of disjunctive information gives a possible world described by an incomplete database.
Two types of queries on disjunctive databases have been employed. A structural query asks questions about the data stored in a database. A conceptual query asks questions about the possible complete objects represented by a database. These two types of queries may be described by way of example. Consider a design template used by an engineer. The template may indicate that the design of some item D consists of both component A and component B, where component A can be either an x or a y. This example disjunctive database represents two complete objects. The first complete object represents the design of D consisting of x and B. The second complete object represents the design of D consisting of y and B. A structural query would ask about the configuration of the database, for example, "what are the choices for component A?" A conceptual query would ask about possible completed designs, for example, "how many completed designs are there?" Most typically, conceptual queries are existential queries, which ask about the existence of a complete object with some quality, or optimization queries, which ask about the complete object which has some parameter optimized. An example of an existential query is, "is there a completed design that costs under $100 and has a reliability of at least 95%?" An example of an optimization query is, "what is the most reliable design?"
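The distinction can be illustrated with a minimal Python sketch of the design template above. The dictionary encoding and the function names `choices_for` and `count_complete_designs` are assumptions made for illustration only; they are not taken from the cited references.

```python
from math import prod

# Hypothetical encoding: each component maps to the list of its alternatives.
# A list with a single entry means the component is fully known.
template_d = {
    "A": ["x", "y"],  # component A is either an x or a y
    "B": ["B"],       # component B is fixed
}


def choices_for(template, component):
    """Structural query: report what is stored for one component."""
    return list(template[component])


def count_complete_designs(template):
    """Conceptual query: number of complete designs the template represents.

    For this flat encoding the count is the product of the number of
    alternatives per component, so no complete design has to be built.
    """
    return prod(len(alternatives) for alternatives in template.values())


print(choices_for(template_d, "A"))        # the stored alternatives for A
print(count_complete_designs(template_d))  # how many complete designs exist
```

The structural query only reads what is stored, while the conceptual query reasons about the complete objects that the stored alternatives stand for.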
One mechanism for querying a database which contains disjunctive information is called normalization. A normalized database is a collection of all possible complete objects represented by a disjunctive database. The collection of elements in the normalized database is called the normal form of the database. In the example above, the normal form of the database contains two elements. The first element of the normal form is the complete object representing the design of D which consists of x and B. The second element of the normal form is the complete object representing the design of D which consists of y and B. Once the normal form of a disjunctive database is produced, a conceptual query on the database can be processed by applying the query to each element of the normal form.
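As a rough sketch of this approach, the Python fragment below builds the normal form of the example template and then applies an existential query and an optimization query to each element. The dictionary encoding, the `parts` table with its cost and reliability figures, and the assumption that a design's reliability is the product of its parts' reliabilities are all hypothetical and chosen only to make the example concrete.

```python
from itertools import product


def normalize(template):
    """Return the normal form: a list of every complete object."""
    components = list(template)
    return [
        dict(zip(components, combo))
        for combo in product(*(template[c] for c in components))
    ]


# Component A is either an x or a y; component B is fixed.
template_d = {"A": ["x", "y"], "B": ["B"]}

# Hypothetical part data: (cost in dollars, reliability as a fraction).
parts = {
    "x": (40, 0.99),
    "y": (25, 0.96),
    "B": (50, 0.98),
}


def cost(design):
    return sum(parts[p][0] for p in design.values())


def reliability(design):
    # Assumes the design works only if every part works, independently.
    r = 1.0
    for p in design.values():
        r *= parts[p][1]
    return r


# The entire normal form is materialized before any query is answered.
normal_form = normalize(template_d)

# Existential query: is there a design under $100 with reliability >= 95%?
exists = any(cost(d) < 100 and reliability(d) >= 0.95 for d in normal_form)

# Optimization query: which complete design is the most reliable?
best = max(normal_form, key=reliability)

print(normal_form)
print(exists)
print(best)
```

Both queries scan every element of `normal_form`, which is why the whole collection must exist before either can be answered under this scheme.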
An algorithm to compute the normal form of a database is described in L. Libkin and L. Wong, "Semantic Representations and Query Languages for Or-Sets", in PODS-93, pages 37-48. The algorithm presented in that paper requires that the entire normal form be created before any conceptual queries can be asked. Thus, all possible complete objects represented by the disjunctive information in the database, i.e. all elements of the normal form, must be created and stored in memory prior to any queries being run on the database. The problem with this solution is that the normal form of a database may be very large. Roughly, if a database has size n, the size of the normal form of the database is bounded above by n × 1.45^n. Thus, the space required to produce the normal form of a disjunctive database increases exponentially with the size of the database. Due to the space requirement, the answering of conceptual queries using the above method is very expensive and impractical.
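The growth can be made concrete with a small Python calculation. It assumes a hypothetical flat database of k components with three alternatives each, and it measures size as the number of stored atomic values; both choices are illustrative assumptions, since the bound above is stated only roughly. Under these assumptions the database has size n = 3k, while the normal form holds 3^k complete objects of k values each.

```python
def normal_form_stats(k, alternatives=3):
    """Sizes for a flat database of k components (illustrative measure)."""
    n = k * alternatives          # atomic values stored in the database
    elements = alternatives ** k  # complete objects in the normal form
    total = elements * k          # atomic values needed to store all of them
    bound = n * 1.45 ** n         # the rough upper bound n x 1.45^n
    return n, elements, total, bound


for k in (1, 5, 10, 20):
    n, elements, total, bound = normal_form_stats(k)
    print(
        f"n={n:3d}  elements={elements:>13,d}  "
        f"stored values={total:>15,d}  within bound: {total <= bound}"
    )
```

In this setting the number of elements is 3^(n/3), roughly 1.44^n, so the database grows linearly in k while the space for its normal form grows exponentially.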