1. Field of the Invention
This invention relates in general to database management systems performed by computers, and in particular, to a set containment join operation performed in an object/relational database management system.
2. Description of Related Art
(Note: This application references a number of different publications as indicated throughout the specification by reference numbers enclosed in brackets, e.g., [x]. A list of these different publications ordered according to these reference numbers can be found in the “Detailed Description of the Preferred Embodiment” in Section 6 entitled “References.” Each of these publications is incorporated by reference herein.)
The data modeling community has long realized that set-valued attributes provide a concise and natural way of modeling complex data [RKS88]. Recently, there has been a resurgence of interest in set-valued attributes from two different perspectives. First, commercial O/R DBMS’s (Object/Relational DataBase Management Systems) are beginning to support set-valued attributes, which is likely to lead to their use in “real” applications. Second, the rise of XML (eXtensible Markup Language) as an important data standard increases the need for set-valued attributes, since it appears that set-valued attributes are key for the natural representation of XML data in O/R DBMS’s [JAI99].
Unfortunately, although sets have been fairly well studied from a data-modeling viewpoint, very little has been published about the efficient implementation of operations on set-valued attributes. Thus, there is a need in the art for improved operations over set-valued attributes, and in particular, there is a need in the art for a set containment join operation.
The present invention discloses a method, apparatus, and article of manufacture for performing a novel partition-based set containment join algorithm, known as the Set Partitioning Algorithm (SPA). The SPA is performed by a relational database management system to aggressively partition set-valued attributes into a very large number of partitions, in order to minimize the impact of excessive replication and improve performance.
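For illustration only, the general idea of a partition-based set containment join can be sketched as follows. This is not the SPA itself, whose details are given in the detailed description; it is a minimal generic scheme written under stated assumptions. The function name `set_containment_join`, the `(id, set)` row layout, the use of the smallest element as the designated element, and the default of 64 partitions are all hypothetical choices made for this sketch. Each row of the contained side is assigned to a single partition chosen by one of its elements, while each row of the containing side is replicated into every partition that any of its elements maps to, which is the replication cost that partitioning must keep under control.

```python
from collections import defaultdict


def set_containment_join(r_rows, s_rows, num_partitions=64):
    """Return sorted (r_id, s_id) pairs where the R set is a subset of the S set.

    r_rows and s_rows are iterables of (id, set) pairs. This layout and all
    names here are assumptions made for the sketch, not taken from the source.
    """
    r_parts = defaultdict(list)
    s_parts = defaultdict(list)

    for r_id, r_set in r_rows:
        if not r_set:
            # Assumption: empty R sets are skipped in this sketch, although
            # the empty set is formally contained in every set.
            continue
        # One designated element (here the smallest) picks the single
        # partition that holds this R row.
        key = min(r_set)
        r_parts[hash(key) % num_partitions].append((r_id, r_set))

    for s_id, s_set in s_rows:
        # An S row is replicated once into each distinct partition that
        # at least one of its elements maps to.
        for p in {hash(e) % num_partitions for e in s_set}:
            s_parts[p].append((s_id, s_set))

    result = []
    for p, r_bucket in r_parts.items():
        for r_id, r_set in r_bucket:
            # Only S rows in the same partition can contain this R row.
            for s_id, s_set in s_parts.get(p, ()):
                if r_set <= s_set:
                    result.append((r_id, s_id))
    return sorted(result)


if __name__ == "__main__":
    r = [(1, {1, 2}), (2, {3}), (3, {2, 9})]
    s = [(10, {1, 2, 3}), (11, {2, 9, 4}), (12, {5})]
    print(set_containment_join(r, s))
```

The sketch is complete for non-empty sets because, if an R set is contained in an S set, the designated element of the R set is also an element of the S set, so the S row is always replicated into the partition that holds the R row. Each R row lives in exactly one partition, so no result pair is produced twice.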