Database management systems (DBMSs) are widely used information resources for enterprises. One highly successful DBMS model is the relational database that is organized and accessed according to relationships between units of data in a database. The relations are typically manifested or embodied in the form of tables. Each table expresses some association between units of database content.
While database system architectures continue to evolve, many new forms of data having multiple attributes, complex internal structures, and specialized behaviors are being stored in databases. Database management tools, such as the Content Manager product available from International Business Machines Corporation, Armonk, N.Y., allow users to handle such complex data objects. The data model of current database management tools allow a user to define an index class representing an object. An index class defines the subject of a table, or group of tables. It contains a plurality of properties or attributes defined by the user. So, for example, the user can identify an index class corresponding to “Patient.” The index class Patient could include attributes: Last Name, First Name, Social Security Number, Date Admitted, and Doctor.
While the index class can accommodate multiple attributes, each attribute can be assigned only one value. This presents a problem if an attribute contains more than one value, i.e., the attribute is a multivalue attribute. For instance, in the above example, the attribute “Doctor” may contain multiple values, i.e., a patient may have more than one doctor. One approach to handling multivalue attributes is to sting the attribute values together and separate them with a separator, such as a comma or semicolon.
While this approach provides a way for associating multiple values with a single attribute, it presents new problems for data retrieval and data updating. For instance, suppose the user desired a list of all patients having a doctor named “Jones.” The DBMS would process the query and resultant set should include all patients having Doctor Jones as their sole doctor, as well as, all patients having Doctor Jones is one of several doctors. Under normal circumstances, the DBMS would perform an index search along the attribute Doctor. Such a search, however, would miss patients having Jones as one of several doctors because the string of characters (Jones) is included with other characters (names of other doctors) for the attribute. If wildcard indicators are used to try to isolate the string of characters, “Jones,” then the resultant set might also include undesired doctor names, such as “Joneson.”
Thus, an index search would not produce the desired resultant set. In the alternative, a table scan would need to be performed. Such a search is very costly in time and database resources.
Accordingly, a need exists for a system and method for supporting multivalue attributes in a database system, while maintaining high performance. The present invention addresses such a need.