A database is a collection of information organized in such a way that a computer program can quickly select desired pieces of data. Traditional databases are organized by fields, records, and files. A field is a single piece of information; a record is one complete set of fields; and a file, also known as a table, is a collection of records. A database may comprise a number of tables that are linked by indices and keys, or may be a collection of objects in an object-oriented database.
For example, an employee database may comprise an address book table and a salary table. Within the address book table, each employee record may comprise information such as the employee name, employee number, birth date, address, and hiring date, and within the salary table, each employee record may comprise information such as the employee number, hiring date, hiring level, job title, and salary. The tables and objects for a given database may exist on one or more database instances.
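The field, record, and table terminology above can be sketched in Python as follows. This is a minimal illustration only: the class names, the sample values, and the `lookup_employee` helper are hypothetical and are not drawn from any particular database system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class AddressRecord:
    """One record in the address book table; each attribute is a field."""
    employee_number: int
    name: str
    birth_date: date
    address: str
    hiring_date: date


@dataclass
class SalaryRecord:
    """One record in the salary table."""
    employee_number: int
    hiring_date: date
    hiring_level: int
    job_title: str
    salary: float


# A table is a collection of records. Here each table is a dict keyed by
# employee number, the key that links the two tables. Values are made up.
address_book = {
    17: AddressRecord(17, "A. Example", date(1980, 1, 2),
                      "1 Sample St", date(2005, 6, 1)),
}
salary_table = {
    17: SalaryRecord(17, date(2005, 6, 1), 3, "Engineer", 50000.0),
}


def lookup_employee(employee_number):
    """Fetch the matching record from each table via the shared key."""
    return address_book[employee_number], salary_table[employee_number]
```

Calling `lookup_employee(17)` returns the address record and the salary record for the same employee, which is the sense in which the two tables are linked by a key.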
The amount of information that a typical database holds can be very large, particularly for Internet-based transactions, which collect and disseminate information at high volume. To impart structure to the information collected in a database, data (i.e., information in the database) can be organized and partitioned to make the database more manageable. Typically, data is organized and partitioned by item numbers, that is, numerical identifiers that each identify an entry in a database.
For example, in an employee database keyed (i.e., uniquely identified) by employee numbers, data (i.e., employee records) can be organized and partitioned such that employee records 1-100 reside on database instance A; employee records 101-200 reside on database instance B; and employee records 201-300 reside on database instance C. As another example, in a products database keyed by a product number, data can be organized and partitioned such that item numbers 1000-1999 reside on server A; item numbers 2000-2999 reside on server B; and item numbers 3000-3999 reside on server C.
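The numerical partitioning of employee records described above can be sketched as a simple range lookup. The boundaries and instance names come from the example; the function name `instance_for` and the list layout are assumptions made for illustration, not a description of how any particular system routes records.

```python
from bisect import bisect_left

# Each entry is (highest employee number in the range, database instance).
# Ranges follow the example: 1-100 on A, 101-200 on B, 201-300 on C.
EMPLOYEE_PARTITIONS = [(100, "A"), (200, "B"), (300, "C")]
LOWEST_KEY = 1


def instance_for(employee_number, partitions=EMPLOYEE_PARTITIONS,
                 lowest_key=LOWEST_KEY):
    """Return the database instance that holds the given employee record."""
    upper_bounds = [upper for upper, _ in partitions]
    if employee_number < lowest_key or employee_number > upper_bounds[-1]:
        raise KeyError(f"no partition holds employee {employee_number}")
    # bisect_left finds the first range whose upper bound is >= the key.
    return partitions[bisect_left(upper_bounds, employee_number)][1]
```

Under this sketch the placement of a record depends only on its number: `instance_for(150)` resolves to instance B regardless of what the record contains.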
A disadvantage of this system of organization is that it is difficult to manage. A database in which data is partitioned according to a numerical scheme does not lend itself to certain database management tasks, such as strategically splitting data across machines. Splitting fixed-size employee records 1-10,000 across 3 machines, for example, can be simple. However, the task may become more complex when splitting variable-size product records 1-10,000 across 3 machines, since a range of record numbers says nothing about the size of the records it contains, and there is no efficient way of partitioning the variable-size records to facilitate database management decisions.
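To illustrate the contrast between fixed-size and variable-size records, the following sketch divides keys 1-10,000 into three contiguous ranges of near-equal count and totals the bytes that would land on each machine. The record sizes are invented for illustration (a flat 200 bytes, and a size that grows with the key); all function names are hypothetical.

```python
def split_evenly_by_key(first_key, last_key, machines):
    """Return contiguous (low, high) key ranges holding near-equal counts."""
    total = last_key - first_key + 1
    base, extra = divmod(total, machines)
    ranges = []
    low = first_key
    for i in range(machines):
        # The first `extra` machines take one additional key each.
        count = base + (1 if i < extra else 0)
        ranges.append((low, low + count - 1))
        low += count
    return ranges


def bytes_per_range(ranges, size_of):
    """Total storage for each key range, given a per-record size function."""
    return [sum(size_of(key) for key in range(low, high + 1))
            for low, high in ranges]


def fixed_size(key):
    """Assumed size: every employee record occupies 200 bytes."""
    return 200


def variable_size(key):
    """Assumed size: product records grow with the key (illustrative only)."""
    return 100 + key


ranges = split_evenly_by_key(1, 10_000, 3)
fixed_loads = bytes_per_range(ranges, fixed_size)
variable_loads = bytes_per_range(ranges, variable_size)
```

With the fixed-size assumption the three machines carry almost identical loads, whereas with the variable-size assumption the last range holds several times the bytes of the first even though each range contains about the same number of records. The numerical key alone does not reveal this imbalance.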
For example, if a database administrator decided that higher-priced products should be stored on the most expensive platform, or that certain machines should be backed up more frequently because they store high-activity products, there would be no feasible way to determine how the records should be partitioned to accommodate these splits.