Databases are used to store information. Databases are frequently organized, or modeled, based on an abstraction referred to as an “entity.” Each entity models a thing that is relatively distinct from another thing. One example is that of a retailer database, which may be organized based at least in part on a customer entity used to model customers of the retailer, a product entity used to model products carried by the retailer, a vendor entity used to model vendors of the retailer, and a store entity used to model various physical retail locations of the retailer. Each entity typically has a relationship with one or more other entities. Obvious relationships among the above entities might include a relationship between customers and products, e.g., certain customers purchased certain products, and a relationship between vendors and products, e.g., particular products are supplied by certain vendors.
An entity, such as a customer entity, also typically defines attributes that store information describing or otherwise relating to the thing being modeled. For example, a customer entity may have a name attribute, e.g., CUST_NAME, an address attribute, e.g., CUST_ADDRESS, and the like. Attribute data that describes an actual thing being modeled is referred to as an instance of the entity. There may be a very large number of instances of each entity. Using the above example, the retailer database may include thousands of instances of the customer entity, each instance comprising attribute data regarding an actual customer of the retailer. Typically, each entity will define at least one attribute that uniquely distinguishes a particular instance of the entity from every other instance of the entity. For example, each customer may be given a unique identifier attribute, e.g., CUST_UNIQUE_IDENTIFIER, which is used to uniquely identify each customer.
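By way of illustration only, the distinction between an entity, its attributes, and its instances might be sketched as follows. The class name `Customer`, the lowercase field names, and the sample values are hypothetical and chosen only to mirror the CUST_NAME, CUST_ADDRESS, and CUST_UNIQUE_IDENTIFIER attributes mentioned above; an actual database would typically define these in a schema rather than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Hypothetical customer entity; each field models one attribute."""
    cust_unique_identifier: int  # distinguishes one instance from all others
    cust_name: str
    cust_address: str


# Each object is one instance of the customer entity (made-up sample data).
customers = [
    Customer(1, "Alice Example", "1 Main St"),
    Customer(2, "Bob Example", "2 Oak Ave"),
]

# Index the instances by their unique identifier attribute.
customers_by_id = {c.cust_unique_identifier: c for c in customers}

# If the identifier is truly unique, no instance is lost when indexing.
assert len(customers_by_id) == len(customers)
```

Looking up `customers_by_id[2]` returns the single instance describing that customer, which is the role the unique identifier attribute plays in the description above.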
There are frequently relationships among instances of different entities in a database, some of which may be relatively obvious, and some less so. For example, a particular customer instance has a relationship with a particular product instance if the customer associated with the customer instance purchased the product associated with the particular product instance. This information may be valuable to the retailer because it provides the retailer with potentially useful information about the customer, such as the fact that the customer may be a country music fan if the product is a country music CD. The particular product instance is also related to a particular store instance by virtue of the fact that the product was sold by a particular store of the retailer. Note that this latter relationship actually further establishes a relationship between the particular customer instance and the particular store instance in that the particular customer purchased the country music CD from the particular store. Knowing that the particular customer shops at a particular store may also be useful to the retailer. Consequently, it is not unusual for a database owner to want to know what relationships exist among the entity instances in a database.
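As a minimal sketch of how the customer-to-store relationship follows from the other two relationships, the example below composes a customer-to-product link with a product-to-store link. The identifiers, the sample data, and the function name `derive_customer_store` are assumptions introduced for illustration and do not describe any particular database product.

```python
# Hypothetical link data between entity instances (made-up identifiers).
# Each pair states that a customer instance purchased a product instance.
customer_purchased_product = {
    (1, "P-1001"),
    (1, "P-1002"),
    (2, "P-1003"),
}

# Each product instance is mapped to the store instance that sold it.
product_sold_by_store = {
    "P-1001": "S-01",
    "P-1002": "S-02",
    "P-1003": "S-01",
}


def derive_customer_store(customer_product, product_store):
    """Compose the two relationships into customer-to-store pairs."""
    return {
        (customer_id, product_store[product_id])
        for customer_id, product_id in customer_product
        if product_id in product_store  # skip products with no known store
    }


customer_store = derive_customer_store(
    customer_purchased_product, product_sold_by_store
)
```

Here `customer_store` contains a pair for every store at which a given customer made a purchase, even though no customer-to-store relationship was recorded directly.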
Determining relationships among entity instances may be relatively simple where a database has only a few different types of entities. However, in practice, it is not unusual for a database to have hundreds of different types of entities. For relatively large databases, it can be practically impossible to determine all the relationships among entity instances because the number of possible relationships is so large. Moreover, even where a relationship is known, such as the relationship among customer entity instances, product entity instances, and store entity instances discussed above, obtaining a report identifying the entity instances which are so related typically requires a person skilled in the particular database technology to develop specialized software which extracts this information from the respective entity structures. This can be a time-consuming and expensive process, and requires access to individuals with specialized database skills. Accordingly, what is needed is a database relationship analyzer which can determine the relationships that exist among entity instances irrespective of the number of entities, and generate output identifying such relationships, without the need for the database owner to know in advance each potential relationship among each group of entity instances, and without the need to develop specialized database software each time such information is desired.