Data in a computing system is kept in a structured format, such as a database. A database may generically be defined as a volume of related data, (of varying types and lengths), which is kept in a defined, structured format. Generally, a database structure may be composed of a number of “entities”, each entity being arranged to hold a set of data values. Each entity commonly has a specific purpose, and is arranged to hold a specific type of data value. A computer operator who implements a database may wish to define a number of different types of entities in the database. In one example, a database may include an entity entitled “CUSTOMER”. The entity CUSTOMER may contain further entities. That is, an entity may reside within another entity. For example, the entity CUSTOMER may include entities entitled “CHILDREN”, “ADDRESSES” and “PHONENUMBERS”. The process of defining entities within entities is commonly termed “nesting”, since the entities CHILDREN, ADDRESSES and PHONENUMBERS exist (or “nest”) within the larger entity entitled CUSTOMER.
The description given above pertains to the theoretical structure of a database. While a database may be modelled as a set of nested entities in theory, in practice (e.g. on a hard drive in a computing system) the database is implemented as a set of tables with links between related elements of each table. These links serve the purpose of recreating the theoretical nesting structure. The process of taking a conceptual database structure and recreating the structure into a database on a computing system is termed “mapping”.
The computer operator designs a database as a set of nested entities, which subsequently needs to be mapped into a database. The conceptual or abstract design of a database is generally referred to as a database schema. That is, the term “database schema” refers to the organisational structure of the database. The term mapping refers to the structure of the database as it is implemented in “real life”.
In practice, databases are generally arranged in the form of a set of tables, each table being arranged to hold a set of data. In one particular embodiment, termed a “relational” database, elements in one table are linked to elements in another table. As the term implies, a relational database has links that tie together data. For example, if each customer has one address and phone number, then the data stored in the entity CUSTOMER may be stored in one database table. However, where a customer has more than one address and/or phone number, it is more efficient to hold the data values for ADDRESS in an ADDRESS table and the data values for PHONENUMBERS in a PHONENUMBER table. The ADDRESS and PHONENUMBER tables then contain appropriate links to the CUSTOMER table, such that an operator may traverse the tables in a logical manner, to find the appropriate data value in the table.
The mapping of a nested entity into a relational database results in a computer performance overhead, since, for deeply nested entities, it would be necessary to access many tables before the data value element would be retrieved.
In the normal course of running a database access application, the data in the tables will be queried to retrieve values held in the database. Database queries are also known as “read” operations. (i.e. these operations do not alter the database, they merely retrieve information from the database).
At various times, new entries will be added to a table, existing entries will be updated, and entries that are no longer needed will be deleted. These operations are collectively called “write” operations, as they modify the state of the database (i.e. they modify either the structure of the database, or the information contained within the database).
Read operations are generally much more common than write operations because it is usual to write data to the database once, and subsequently read the data many times.
When a user wishes to retrieve information from the database, it is often necessary for the database access application to traverse a number of tables before arriving at the required data value or set of data values. The aforementioned operation is known in the art as a “join” operation, as the database access application joins the data requested by the query from several disparate tables into one final result.
A join operation is generally computationally expensive. The sequence of lookups needed to complete a join operation, as outlined above, take a longer read time to complete than a single table lookup. Furthermore, deeply-nested relationships typically have very long read times since many join operations must be performed.